* * * READ ME * * *
* * * Symantec Storage Foundation HA 6.1 * * *
* * * Patch 6.1.0.300 * * *
Patch Date: 2014-05-30


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES
   * SPECIAL INSTRUCTIONS


PATCH NAME
----------
Symantec Storage Foundation HA 6.1 Patch 6.1.0.300


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL5 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSllt
VRTSlvmconv
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Symantec Cluster Server 6.1
   * Symantec Dynamic Multi-Pathing 6.1
   * Symantec Storage Foundation 6.1
   * Symantec Storage Foundation Cluster File System HA 6.1
   * Symantec Storage Foundation for Oracle RAC 6.1
   * Symantec Storage Foundation HA 6.1
   * Symantec Volume Manager 6.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-6.1.0.100

* 3444900 (3399131) For PowerPath (PP) enclosures, both the DA_TPD and DA_COEXIST_TPD flags are set.
* 3445234 (3358904) A system with ALUA enclosures sometimes panics during path fault scenarios.
* 3445235 (3374117) I/O hangs on a VxVM SmartIO-enabled data volume with one detached plex.
* 3445236 (3381006) The system panics when the VxIO driver collects statistics for SmartIO.
* 3446001 (3380481) vxdiskadm throws the error "/usr/lib/vxvm/voladm.d/lib/vxadm_syslib.sh: line 2091: return: -1: invalid option" if a removed disk is selected while performing the "5 Replace a failed or removed disk" operation.
* 3446112 (3288744) In an FSS (Flexible Storage Sharing) disk group, whenever a new mirror is added to a volume, the DCO (Data Change Object) associated with the volume is not mirrored.
* 3446126 (3338208) Writes from a fenced-out node on an Active-Passive (AP/F) shared storage device fail with an unexpected error.
* 3447244 (3427171) I/Os issued on VxVM (Veritas Volume Manager) SmartIO-associated volumes immediately after system boot lead to a system panic.
* 3447245 (3385905) SSD: data corruption seen with multiple random-I/O threads running in parallel.
* 3447306 (3424798) Veritas Volume Manager (VxVM) mirror attach operations (e.g., plex attach, vxassist mirror, and third-mirror break-off snapshot resynchronization) may take a long time under heavy application I/O load.
* 3447530 (3410839) In an FSS environment, creation of layered volumes should be avoided, even beyond 1 GB.
* 3447894 (3353211) A. After an EMC Symmetrix BCV (Business Continuance Volume) device switches to read-write mode, continuous vxdmp (Veritas Dynamic Multi-Pathing) error messages flood syslog. B. A DMP metanode, or a path under a DMP metanode, gets disabled unexpectedly.
* 3449714 (3417185) Rebooting the host after the exclusion of a dmpnode while I/O is in progress on it leads to a vxconfigd core dump.
* 3452709 (3317430) The vxdiskunsetup utility throws an error after an upgrade from 5.1SP1RP4.
* 3452727 (3279932) The vxdisksetup and vxdiskunsetup utilities fail on a disk that is part of a deported disk group (DG), even if the "-f" option is specified.
* 3452811 (3445120) Change the value of the tunable VOL_MIN_LOWMEM_SZ to trigger early readback.
* 3453105 (3079819) vxconfigbackup and vxconfigrestore fail for an FSS (Flexible Storage Sharing) disk group having remote disks.
* 3453163 (3331769) The vxconfigrestore command does not restore the latest configuration.
* 3455455 (3409612) The value of reclaim_on_delete_start_time cannot be set to values outside the range 22:00-03:59.
* 3456729 (3428025) A system running Symantec Replication Option (VVR) and configured as the VVR primary crashes when heavy parallel I/O load is issued.
* 3458036 (3418830) Node boot-up hangs while starting vxconfigd.
* 3458799 (3197987) vxconfigd dumps core when 'vxddladm assign names file=<filename>' is executed and the file has one or more invalid values for the enclosure vendor ID or product ID.
* 3470346 (3377383) vxconfigd crashes when a disk under Dynamic Multi-Pathing (DMP) reports a device failure.
* 3484547 (3383625) When a node contributing storage to an FSS (Flexible Storage Sharing) disk group rejoins the cluster, the disks brought back by that node are not reattached.
* 3498795 (3394926) vx* commands hang when the mirror-stripe format is used after a reboot of the master node.

Patch ID: VRTSlvmconv-6.1.0.100

* 3496079 (3496077) The vxvmconvert(1M) command fails with an error message while converting Logical Volume Manager (LVM) Volume Groups (VG) into a VxVM disk group.

Patch ID: VRTSllt-6.1.0.200

* 3458677 (3460985) The system panics and logs an error message in the syslog.

Patch ID: VRTSllt-6.1.0.100

* 3376505 (3410309) The LLT driver fails to load and logs a message in the syslog.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: VRTSvxvm-6.1.0.100

* 3444900 (Tracking ID: 3399131)

SYMPTOM:
The following command fails with an error for a path managed by a Third Party Driver (TPD) which co-exists with DMP:

# vxdmpadm -f disable path=<path_name>
VxVM vxdmpadm ERROR V-5-1-11771 Operation not supported

DESCRIPTION:
Third Party Drivers manage devices with or without the co-existence of the Dynamic Multi-Pathing driver. Disabling paths managed by a third party driver which does not co-exist with DMP is not supported. However, due to a bug in the code, disabling paths managed by a third party driver which does co-exist with DMP also failed, because the same flags were set for all third party driver devices.

RESOLUTION:
The code has been modified to block this command only for third party drivers which cannot co-exist with DMP.

* 3445234 (Tracking ID: 3358904)

SYMPTOM:
The system panics with the following stack:

dmp_alua_get_owner_state()
dmp_alua_get_path_state()
dmp_get_path_state()
dmp_get_enabled_ctlrs()
dmp_info_ioctl()
gendmpioctl()
dmpioctl()
vol_dmp_ktok_ioctl()
dmp_get_enabled_cntrls()
vx_dmp_config_ioctl()
quiescesio_start()
voliod_iohandle()
voliod_loop()
kernel_thread()

DESCRIPTION:
A system running with Asymmetric Logical Unit Access (ALUA) LUNs sometimes panics during path fault scenarios. This happens because of a possible NULL pointer access in some cases, caused by a bug in the code.

RESOLUTION:
Code changes have been made to fix the bug.

* 3445235 (Tracking ID: 3374117)

SYMPTOM:
Write I/O issued on a layered VxVM volume may hang if the volume has a detached plex and has VxVM SmartIO enabled.

DESCRIPTION:
VxVM uses interlocking to prevent parallel writes issued on a data volume. In certain scenarios involving a detached plex, due to a bug, write I/O issued on a SmartIO-enabled VxVM volume tries to take the interlock multiple times, which leads to an I/O hang.

RESOLUTION:
Changes have been made to avoid double interlocking in the scenarios mentioned above.
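As a hedged illustration of the condition described above (not part of the official fix or a workaround; the disk group and volume names are placeholders), standard VxVM commands can be used to inspect the plex states of a SmartIO-enabled volume and to trigger reattachment of a detached plex:

# vxprint -g <dgname> -ht <volname>
# vxrecover -g <dgname> <volname>

A plex shown in the DETACHED or IOFAIL state indicates the configuration that is susceptible to this hang.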
* 3445236 (Tracking ID: 3381006)

SYMPTOM:
A system panic happens with the stack below when I/Os are issued on a volume associated with Veritas Volume Manager (VxVM) block-level SmartIO caching.

Stack trace of the panic:
volmv_accum_cache_stats+0x380
vol_cache_write_done+0x178
volkcontext_process+0x6a4
voldiskiodone+0x10a8
...

DESCRIPTION:
An improper lock is taken when collecting SmartIO statistics. This leads to a system panic when another thread modifies the pointer to NULL.

RESOLUTION:
The proper lock is now held before collecting statistics for SmartIO in the kernel.

* 3446001 (Tracking ID: 3380481)

SYMPTOM:
vxdiskadm throws the following error after selecting operation "5 Replace a failed or removed disk":

"/usr/lib/vxvm/voladm.d/lib/vxadm_syslib.sh: line 2091: return: -1: invalid option"

DESCRIPTION:
From bash version 4.0 onwards, negative return values are not accepted. A few VxVM scripts returned negative values to bash, causing the error.

RESOLUTION:
The respective code changes have been made so that negative values are not returned to bash.

* 3446112 (Tracking ID: 3288744)

SYMPTOM:
In an FSS (Flexible Storage Sharing) disk group, whenever a new mirror is added to a volume, the DCO (Data Change Object) associated with the volume is not mirrored.

DESCRIPTION:
In an FSS environment, if a volume's DCO object is not mirrored across all hosts contributing storage to the volume, there is a possibility that volume recovery and mirror attaches for detached mirrors require a full data re-synchronization. There is also a possibility of losing snapshots if the nodes contributing storage to the DCO volume go down. This essentially defeats the purpose of having a default DCO volume in FSS environments.

RESOLUTION:
The code has been changed such that whenever a new data mirror is added to a volume in an FSS disk group, a new DCO mirror is also added on the same disks as the data volume. This increases the redundancy of the DCO volume associated with the data volume and helps avoid the full data re-synchronization cases described above.

* 3446126 (Tracking ID: 3338208)

SYMPTOM:
Writes from a fenced-out LDOM guest node on an Active-Passive (AP/F) shared storage device fail. I/O failure messages are seen in the system log:

Mon Oct 14 06:03:39.411: I/O retry(6) on Path c0d8s2 belonging to Dmpnode emc_clariion0_48
Mon Oct 14 06:03:39.951: SCSI error occurred on Path c0d8s2: opcode=0x2a reported device not ready (status=0x2, key=0x2, asc=0x4, ascq=0x3) LUN not ready, manual intervention required
Mon Oct 14 06:03:39.951: I/O analysis done as DMP_PATH_FAILURE on Path c0d8s2 belonging to Dmpnode emc_clariion0_48
Mon Oct 14 06:03:40.311: Marked as failing Path c0d1s2 belonging to Dmpnode emc_clariion0_48
Mon Oct 14 06:03:40.671: Disabled Path c0d8s2 belonging to Dmpnode emc_clariion0_48 due to path failure

DESCRIPTION:
SCSI write commands from a fenced-out host should fail with a reservation conflict from the shared device, and this error code needs to be propagated to the upper layers for appropriate action. In the DMP ioctl interface, DMP first sends the command through the available active paths. If the command failed, it was then unnecessarily retried on the passive paths. This caused the command to fail with a "not ready" error code, which was propagated to the upper layers instead of the reservation conflict.

RESOLUTION:
Code changes have been added so that SCSI I/O commands are not retried on passive paths in the case of an Active-Passive (AP/F) shared storage device.
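As a hedged illustration of inspecting the path configuration of such a device (the dmpnode name is taken from the log excerpt above; output columns may vary by release), the standard vxdmpadm command lists every path under a dmpnode, and for Active-Passive arrays the path type typically distinguishes primary (active) from secondary (passive) paths:

# vxdmpadm getsubpaths dmpnodename=emc_clariion0_48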
* 3447244 (Tracking ID: 3427171)

SYMPTOM:
A system panic happens with the stack below when I/Os are issued on a volume associated with Veritas Volume Manager (VxVM) block-level SmartIO caching immediately after a system reboot.

Stack trace of the panic:
_vol_unspinlock+0000CC
volmv_accum_cache_stats+0001C0
vol_cache_read_done+000148
volkcontext_process+0003C8
voldiskiodone+001610
...

DESCRIPTION:
VxVM detects all the CPUs on which I/O is happening, and VxVM SmartIO uses this facility to enable statistics collection for each detected CPU. A bug in SmartIO caused it not to detect a CPU change and to use the wrong pointers for unlocking, leading to a system panic.

RESOLUTION:
Changes have been made to properly identify a newly detected CPU before using the locks associated with it.

* 3447245 (Tracking ID: 3385905)

SYMPTOM:
Corrupt data might be returned for a read on a data volume with VxVM (Veritas Volume Manager) SmartIO caching enabled, if the VxVM cachearea is taken offline with the command "sfcache offline <cachearea_name>" and later brought online without a system reboot.

DESCRIPTION:
When the cachearea is taken offline, a few in-core data structures of the cachearea are not reset. When it is brought online again, the cachearea is re-created afresh because the warm cache is not used in this case. Since the in-core structures still hold stale information about the cachearea, this results in corruption.

RESOLUTION:
The in-core data structures of the cachearea are reinitialized if the warm cache is not used while bringing the cachearea online.

* 3447306 (Tracking ID: 3424798)

SYMPTOM:
Veritas Volume Manager (VxVM) mirror attach operations (e.g., plex attach, vxassist mirror, and third-mirror break-off snapshot resynchronization) may take a long time under heavy application I/O load. The vxtask list command shows tasks in the 'auto-throttled (waiting)' state for a long time.

DESCRIPTION:
With the AdminIO de-prioritization feature, VxVM administrative I/Os (e.g., plex attach, vxassist mirror, and third-mirror break-off snapshot resynchronization) are de-prioritized under heavy application I/O load, but this can lead to very slow progress of these operations.

RESOLUTION:
The code is modified to disable the AdminIO de-prioritization feature.

* 3447530 (Tracking ID: 3410839)

SYMPTOM:
Volume allocation operations such as make, grow, and convert can internally lead to layered layouts. In an FSS (Flexible Storage Sharing) environment, this eventually causes full mirror synchronization for layered volume recovery when any node goes down.

DESCRIPTION:
During volume allocation operations, vxassist sometimes automatically creates a layered layout for better resiliency, depending on criteria such as the volume size and certain upper-limit settings. Layered layouts prevent the complete volume from being disabled in case some underlying disk(s) are affected. However, DCO (Data Change Object) is not supported for layered layouts, so optimized mirror synchronization cannot be done for volume recovery when any node goes down, which is a likely scenario in an FSS environment.

RESOLUTION:
Checks have been added to the vxassist allocation operations to prevent an internal change to a layered layout for volumes in an FSS disk group.
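Independent of this fix, a layered layout can also be avoided explicitly at volume creation time by requesting a plain (non-layered) mirrored layout. A minimal hedged sketch using standard vxassist syntax (the disk group name, volume name, and size shown are placeholders):

# vxassist -g <fss_dgname> make <volname> 10g layout=mirror nmirror=2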
* 3447894 (Tracking ID: 3353211)

SYMPTOM:
A. After an EMC Symmetrix BCV (Business Continuance Volume) device switches to read-write mode, continuous vxdmp (Veritas Dynamic Multi-Pathing) error messages flood syslog, as shown below:

NOTE VxVM vxdmp V-5-3-1061 dmp_restore_node: The path 18/0x2 has not yet aged - 299
NOTE VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x24/0xD0
NOTE VxVM vxdmp V-5-3-1062 dmp_restore_node: Unstable path 18/0x230 will not be available for I/O until 300 seconds
NOTE VxVM vxdmp V-5-3-1061 dmp_restore_node: The path 18/0x2 has not yet aged - 299
NOTE VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x6) on dmpnode 36/0xD0

B. A DMP metanode, or a path under a DMP metanode, gets disabled unexpectedly.

DESCRIPTION:
A. DMP caches the last discovery NDELAY open for the BCV dmpnode paths. Switching a BCV device to read-write mode is an array-side operation. Typically in such cases, the system administrators are required to run the following command:

1. vxdisk rm <disk_name>

or, in the case of parallel backup jobs:

1. vxdisk offline <disk_name>
2. vxdisk online <disk_name>

This causes DMP to close the cached open, and during the next discovery the device is opened in read-write mode. If the above steps are skipped, the DMP device goes into a state where one of the paths is in read-write mode while the others remain in NDELAY mode. If the upper layers request a NORMAL open, DMP has code to close the NDELAY cached open and reopen the device in NORMAL mode; when the dmpnode is online, this happens only for one of the paths of the dmpnode.

B. DMP performs error analysis for paths on which I/O has failed. In some cases, SCSI probes are sent and fail with return values/sense codes that are not handled by DMP. This causes the paths to get disabled.

RESOLUTION:
A. The code of the DMP EMC ASL (Array Support Library) is modified to handle case A for EMC Symmetrix arrays.
B. The DMP code is modified to handle the SCSI conditions correctly for case B.

* 3449714 (Tracking ID: 3417185)

SYMPTOM:
Rebooting the host after the exclusion of a dmpnode while I/O is in progress on it leads to a vxconfigd core dump.

DESCRIPTION:
The function that deletes a path after exclusion does not update the corresponding data structures properly. Consequently, rebooting the host after the exclusion of a dmpnode while I/O is in progress on it leads to a vxconfigd core dump with the following stack:

ddl_find_devno_in_table ()
ddl_get_disk_policy ()
devintf_add_autoconfig_main ()
devintf_add_autoconfig ()
mode_set ()
req_vold_enable ()
request_loop ()
main ()

RESOLUTION:
Code changes have been made to update the data structures properly.

* 3452709 (Tracking ID: 3317430)

SYMPTOM:
The vxdiskunsetup utility throws an error during execution. The following error message is observed:

"Device unexport failed: Operation is not supported"

DESCRIPTION:
In vxdiskunsetup, vxunexport is called without checking whether the disk is exported or what the CVM protocol version is. When the bits are upgraded from 5.1SP1RP4, the CVM protocol version is not updated, hence the error.

RESOLUTION:
Code changes have been made so that vxunexport is called only after the proper checks.
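Since the failure stems from a stale CVM (Cluster Volume Manager) protocol version after the upgrade, the cluster protocol version can be inspected, and upgraded if required, before retrying the utility. A hedged sketch using standard vxdctl subcommands, run on the master node:

# vxdctl protocolversion
# vxdctl upgrade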
* 3452727 (Tracking ID: 3279932)

SYMPTOM:
The vxdisksetup and vxdiskunsetup utilities fail on a disk that is part of a deported disk group (DG), even if the '-f' option is specified.

The vxdisksetup command fails with the following error:
VxVM vxedpart ERROR V-5-1-10089 partition modification failed : Device or resource busy

The vxdiskunsetup command fails with the following error:
VxVM vxdisk ERROR V-5-1-0 Device <device> appears to be owned by disk group <dgname>. Use -f option to force destroy.
VxVM vxdiskunsetup ERROR V-5-2-5052 <device>: Disk destroy failed.

DESCRIPTION:
The vxdisksetup and vxdiskunsetup utilities internally call the 'vxdisk' utility. Due to a defect in vxdisksetup and vxdiskunsetup, the vxdisk operation failed on a disk that is part of a deported DG, even if the '-f' option was requested by the user.

RESOLUTION:
Code changes have been made to the vxdisksetup and vxdiskunsetup utilities so that the operation succeeds when the '-f' option is specified.

* 3452811 (Tracking ID: 3445120)

SYMPTOM:
The default value of the tunable 'vol_min_lowmem_sz' is not consistent across all platforms.

DESCRIPTION:
The default value of the tunable 'vol_min_lowmem_sz' differed from platform to platform.

RESOLUTION:
The default value of the tunable 'vol_min_lowmem_sz' is set to 32MB on all platforms.

* 3453105 (Tracking ID: 3079819)

SYMPTOM:
In Cluster Volume Manager (CVM), configuration backup and configuration restore fail for FSS disk groups.

DESCRIPTION:
While taking a configuration backup of a disk group, and also while restoring the configuration, a SCSI inquiry is carried out on the disks belonging to the disk group. If a disk group has remote/lfailed/lmissing disks on a specific node of a cluster, the SCSI inquiry on such disks fails, as the node does not have connectivity to them. This results in failure of both the configuration backup and the configuration restore.

RESOLUTION:
The code has been changed to address the above-mentioned issue in an FSS environment, so that only the disks which are physically connected to a node are considered for configuration backup and restore.

* 3453163 (Tracking ID: 3331769)

SYMPTOM:
The latest disk group configuration is not restored; the disk group configuration corresponding to an old backup is restored instead.

DESCRIPTION:
When the disk group configuration is modified, the private region content is updated on the disk. When a configuration backup is taken for the first time, vxconfigbackup reads the data from the disk, which in turn populates the buffer cache. When another backup is taken shortly after modifying the disk group configuration, vxconfigbackup gets the data from the buffer cache. In such cases the buffer cache holds stale data, that is, the data from when the first backup was taken, so vxconfigbackup backs up the old configuration data again. vxconfigrestore then fails to restore the latest configuration when invoked on this backup.

RESOLUTION:
Code changes have been made so that vxconfigbackup always queries the required configuration data from the disk, bypassing the buffer cache.

* 3455455 (Tracking ID: 3409612)

SYMPTOM:
Running "vxtune reclaim_on_delete_start_time <time>" fails if the specified value is outside the range 22:00-03:59 (e.g., setting it to 04:00 or 19:30 fails).

DESCRIPTION:
The tunable reclaim_on_delete_start_time should accept any time value from 00:00 to 23:59, but because of a wrong regular expression used to parse the time, it could not be set to all values in that range.

RESOLUTION:
The regular expression has been updated to parse the time format correctly. Now all values in 00:00-23:59 can be set.
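With the corrected parsing, any time of day in HH:MM format should be accepted. The intended validation is equivalent to the following bash pattern, shown here as an illustrative sketch rather than the actual script code:

# Accept any HH:MM value in 00:00-23:59 before applying the tunable.
if [[ "$start_time" =~ ^([01][0-9]|2[0-3]):[0-5][0-9]$ ]]; then
    vxtune reclaim_on_delete_start_time "$start_time"
fi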
* 3456729 (Tracking ID: 3428025)

SYMPTOM:
When a heavy parallel I/O load is issued, a system running Symantec Replication Option (VVR) and configured as the VVR primary panics with the stack below:

schedule
vol_alloc
vol_zalloc
volsio_alloc_kc_types
vol_subdisk_iogen_base
volobject_iogen
vol_rv_iogen
volobject_iogen
vol_rv_batch_write_start
volkcontext_process
vol_rv_start_next_batch
vol_rv_batch_write_done
[...]
vol_rv_batch_write_done
volkcontext_process
vol_rv_start_next_batch
vol_rv_batch_write_done
volkcontext_process
vol_rv_start_next_batch
vol_rv_batch_kio
volkiostart
vol_linux_kio_start
vol_fsvm_strategy

DESCRIPTION:
A heavy parallel I/O load leads to I/O throttling in Symantec Replication Option (VVR). Improper throttle handling leads to a kernel stack overflow.

RESOLUTION:
The I/O throttle is now handled correctly, which avoids the stack overflow and the subsequent panic.

* 3458036 (Tracking ID: 3418830)

SYMPTOM:
Node boot-up hangs while starting vxconfigd if the '/etc/VRTSvcs/conf/sysname' file or the '/etc/vx/vxvm-hostprefix' file is not present. The user sees the following messages on the console, after which the boot-up hangs:

# Starting up VxVM ...
# VxVM general startup...

DESCRIPTION:
While generating a unique host prefix, the code called scanf instead of sscanf to fetch the prefix. Because scanf waits for user input, vxconfigd waited indefinitely during startup, which caused the hang while booting up the node.

RESOLUTION:
Code changes have been made to address this issue.

* 3458799 (Tracking ID: 3197987)

SYMPTOM:
vxconfigd dumps core when 'vxddladm assign names file=<filename>' is executed and the file has one or more invalid values for the enclosure vendor ID or product ID.

DESCRIPTION:
When the input file provided to 'vxddladm assign names file=<filename>' has an invalid vendor ID or product ID, the vxconfigd daemon is unable to find the corresponding enclosure being referred to, and makes an invalid memory reference. The following stack trace can be seen:

strncasecmp () from /lib/libc.so.6
ddl_load_namefile ()
req_ddl_set_names ()
request_loop ()
main ()

RESOLUTION:
As a fix, the vxconfigd daemon verifies the validity of the input vendor ID and product ID before making a memory reference to the corresponding enclosure in its internal data structures.

* 3470346 (Tracking ID: 3377383)

SYMPTOM:
vxconfigd crashes when a disk under DMP reports a device failure. After this, the following error is seen when a VxVM (Veritas Volume Manager) command is executed:

"VxVM vxdisk ERROR V-5-1-684 IPC failure: Configuration daemon is not accessible"

DESCRIPTION:
If a disk fails and reports a certain failure to DMP (Veritas Dynamic Multi-Pathing), vxconfigd crashes because that error is not handled properly.

RESOLUTION:
The code is modified to properly handle the device failures reported by a failed disk under DMP.

* 3484547 (Tracking ID: 3383625)

SYMPTOM:
When a cluster node that contributes storage rejoins the cluster, the local disks of the joiner are not reattached.

DESCRIPTION:
When a cluster node that contributes storage rejoins the cluster, the vxattachd daemon should identify the local disks of the joiner and reattach them. Due to a bug in the parsing logic, the local disks of the joiner were not identified, and hence were not made available to the other cluster nodes.

RESOLUTION:
Appropriate code changes have been made to identify the disks properly.
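If the automatic reattach does not occur, reattachment can be attempted manually after the storage-contributing node rejoins. A hedged sketch using standard VxVM recovery utilities (the disk group name is a placeholder, and behavior for FSS disk groups may vary by release):

# vxreattach
# vxrecover -g <fss_dgname> -sb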
* 3498795 (Tracking ID: 3394926)

SYMPTOM:
During cluster reconfiguration, VxVM (Veritas Volume Manager) commands hang intermittently with the stack below on a slave node in a CVM (Veritas Cluster Volume Manager) environment with the Flexible Storage Sharing (FSS) feature in use:

vol_commit_iowait_objects
volktcvm_kmsgwait_objects
volktcvm_iolock_wait
vol_ktrans_commit
volconfig_ioctl
volsioctl_real
vols_ioctl
vols_compat_ioctl

DESCRIPTION:
If a joining node attempts to perform error processing during a transaction and then gets aborted, VxVM commands may hang, because the I/O drain does not proceed on the slave node for an FSS disk group in CVM.

RESOLUTION:
Code changes have been made to handle this case.

Patch ID: VRTSlvmconv-6.1.0.100

* 3496079 (Tracking ID: 3496077)

SYMPTOM:
The vxvmconvert(1M) command fails while converting Logical Volume Manager (LVM) Volume Groups (VG) into a VxVM disk group, and displays the following error message:

"The specified Volume Group (VG name) was not found."

DESCRIPTION:
A race condition is observed between the 'blockdev' and 'vgdisplay' commands: if the 'blockdev' command has not yet flushed the buffered data onto the device, 'vgdisplay' fails to provide accurate output. As a result, VxVM assumes that the Volume Group (VG) is not present and displays the error message.

RESOLUTION:
The code is modified to retry the 'vgdisplay' command a finite number of times before failing, so that there is sufficient time for the buffers to be flushed.

Patch ID: VRTSllt-6.1.0.200

* 3458677 (Tracking ID: 3460985)

SYMPTOM:
The system panics or logs the following error message in the syslog:

kernel:BUG: soft lockup - CPU#8 stuck for 67s! [llt_rfreem:7125]

DESCRIPTION:
With the LLT-RDMA feature, Low Latency Transport (LLT) threads can remain busy for a long period of time without being scheduled off the CPU, depending on the workload. The Linux kernel may falsely detect this activity as an unresponsive task or a soft lockup, causing the panic. This issue is observed in LLT threads such as lltd, llt_rdlv, and llt_rfreem.

RESOLUTION:
The code is modified to avoid the false soft-lockup alarm. The LLT threads now schedule themselves off the CPU after a specified time, which is configured by default based on the watchdog_threshold/softlockup_threshold value in the system configuration. The default value can be changed using the lltconfig utility (lltconfig -T yieldthresh:value); this value decides when the LLT threads schedule themselves out.
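A hedged usage sketch of the tunable mentioned above (the value is a placeholder; choose it according to the system's watchdog/softlockup threshold):

# lltconfig -T yieldthresh:<value>

The current LLT timer values can typically be displayed with:

# lltconfig -T query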
Patch ID: VRTSllt-6.1.0.100

* 3376505 (Tracking ID: 3410309)

SYMPTOM:
The Low Latency Transport (LLT) driver fails to load and logs the following messages in the syslog when a mismatch is observed in the RDMA-specific symbols:

llt: disagrees about version of symbol rdma_connect
llt: Unknown symbol rdma_connect
llt: disagrees about version of symbol rdma_destroy_id
llt: Unknown symbol rdma_destroy_id

DESCRIPTION:
The LLT driver fails to load when an external OFED (OFA or MLNX_OFED) stack is installed on the system. The OFED stack replaces the native RDMA-related drivers (shipped with the OS) with the external OFED drivers. Since LLT is built against the native RDMA drivers, LLT fails with a symbol mismatch when the startup script tries to load it.

RESOLUTION:
The LLT startup script is modified to detect whether an external OFED stack is installed on the system. If the script detects an external OFED stack, it loads a non-RDMA version of the LLT driver (without the RDMA symbols). Since this driver does not contain RDMA-specific symbols, it loads successfully. However, this LLT driver does not have the RDMA functionality; in this case, LLT can be used either in Ethernet or in UDP mode.


INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
To install the patch, perform the following steps on at least one node in the cluster:

1. Copy the hotfix sfha-rhel5_x86_64-6.1.0.300-rpms.tar.gz to /tmp.

2. Untar sfha-rhel5_x86_64-6.1.0.300-rpms.tar.gz to /tmp/hf:
   # mkdir /tmp/hf
   # cd /tmp/hf
   # gunzip /tmp/sfha-rhel5_x86_64-6.1.0.300-rpms.tar.gz
   # tar xf /tmp/sfha-rhel5_x86_64-6.1.0.300-rpms.tar

3. Install the hotfix:
   # pwd
   /tmp/hf
   # ./installSFHA610HF300 [ ...]

Install the patch manually:
--------------------------
For VRTSvxvm:

o Before the upgrade:
  (a) Stop I/Os to all the VxVM volumes.
  (b) Unmount any file systems with VxVM volumes.
  (c) Stop applications using any VxVM volumes.
  (d) If your system has Veritas Operations Manager (VOM) configured, check whether the vxdclid daemon is running; if it is running, stop it.
      Command to check the status of the vxdclid daemon:
      # /opt/VRTSsfmh/etc/vxdcli.sh status
      Command to stop the vxdclid daemon:
      # /opt/VRTSsfmh/etc/vxdcli.sh stop

o Select the appropriate RPMs for your system, and upgrade to the new patch:
  # rpm -Uhv VRTSvxvm-6.1.0.100-RHEL5.x86_64.rpm

o After the upgrade:
  (a) If you stopped the vxdclid daemon before the upgrade, restart it using the following command:
      # /opt/VRTSsfmh/etc/vxdcli.sh start

For VRTSlvmconv:

o Select the appropriate RPMs for your system, and install the new patch:
  # rpm -ihv VRTSlvmconv-6.1.0.100-GA_RHEL5.i686.rpm

o For upgrading the patch, select the appropriate RPMs for your system, and upgrade to the new patch:
  # rpm -Uhv VRTSlvmconv-6.1.0.100-GA_RHEL5.i686.rpm

For VRTSllt:

To install the patch, perform the following steps on all nodes in the VCS cluster:
1. Stop the VCS stack on the cluster node.
2. Uninstall the LLT rpm:
   # rpm -ev VRTSllt --nodeps
3. Install the patch rpm.
4. Start the VCS stack on the cluster node.


REMOVING THE PATCH
------------------
For VRTSvxvm:

o If your system has Veritas Operations Manager (VOM) configured, check whether the vxdclid daemon is running; if it is running, stop it.
  Command to check the status of the vxdclid daemon:
  # /opt/VRTSsfmh/etc/vxdcli.sh status
  Command to stop the vxdclid daemon:
  # /opt/VRTSsfmh/etc/vxdcli.sh stop

o Execute the following command to remove the VxVM package from your system:
  # rpm -e VRTSvxvm

For VRTSlvmconv:

Execute the following command to remove the VRTSlvmconv package from your system:
  # rpm -e VRTSlvmconv

For VRTSllt:

To uninstall the patch, perform the following steps on all nodes in the VCS cluster:
1. Stop the VCS stack on the cluster node.
2. Uninstall the LLT patch rpm:
   # rpm -ev VRTSllt --nodeps
3. Install the base LLT RPM (6.1).
4. Start the VCS stack on the cluster node.


KNOWN ISSUES
------------
* Tracking ID: 3445982

SYMPTOM: A disk that is newly added with the same name as a failed or removed disk is not displayed among the possible replacements for the new disk.

WORKAROUND: After adding the new disk, first initialize the disk with vxdiskadm option 1, and then perform the disk replacement with option 5.

* Tracking ID: 3453072

SYMPTOM: A "blank remote disk" gets created on the other nodes in Cluster Volume Manager after exporting a local disk with no hostprefix value set.

WORKAROUND: Use the following command to set the "hostprefix" value explicitly on the node, and then unexport/export the local disk to avoid blank remote disk creation on the other nodes.

Command to set the hostprefix:
# vxdctl set hostprefix <prefix>
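A hedged example of applying this workaround (the prefix and disk name shown are placeholders, not values from this issue):

# vxdctl set hostprefix sys1
# vxdisk unexport disk01
# vxdisk export disk01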
* Tracking ID: 3461928

SYMPTOM: While restoring a Flexible Storage Sharing (FSS) disk group configuration that has cache objects configured, the following error messages may be displayed during the pre-commit phase of the restoration:

VxVM vxcache ERROR V-5-1-10128 Cache object meta-data update error
VxVM vxcache ERROR V-5-1-10128 Cache object meta-data update error
VxVM vxvol WARNING V-5-1-10364 Could not start cache object
VxVM vxvol ERROR V-5-1-11802 Volume volume_name cannot be started
VxVM vxvol ERROR V-5-1-13386 Cache object on which Volume volume_name is constructed is not enabled
VxVM vxvol ERROR V-5-1-13386 Cache object on which Volume volume_name is constructed is not enabled

The error messages are harmless and do not have any impact on the restoration. After the disk group configuration is committed, the cache object and the volume constructed on the cache object are enabled.

WORKAROUND: None.

* Tracking ID: 3463710

SYMPTOM: Plex(es) on a remote disk go to the DISABLED state because of a plex I/O error encountered just after a slave node reboot in Cluster Volume Manager. The IOFAIL flag can be seen on such plex(es). This I/O failure is transient.

WORKAROUND: The "vxrecover" command needs to be triggered manually on the master node of the cluster to bring the plex state back to ENABLED.

* Tracking ID: 3464707

SYMPTOM: If the private regions of all disks belonging to a disk group are corrupted, vxconfigd hangs temporarily while importing the disk group, because the disk re-online operation gets stuck validating the private regions of all the disks.

WORKAROUND: vxconfigd will come up after some time.

* Tracking ID: 3480154

SYMPTOM: The problem occurs on a server with Veritas Operations Manager (VOM) configured. At the time of a patch upgrade, VxVM stops the daemons and unloads the VxVM kernel modules. On a VOM-configured server, the vxdclid daemon uses one of the VxVM files (/dev/vx/info), which prevents VxVM kernel modules such as vxio, vxspec, and vxdmp from being unloaded.

WORKAROUND: Before the upgrade, stop the vxdclid daemon:
# /opt/VRTSsfmh/etc/vxdcli.sh stop

After the upgrade completes, start the vxdclid daemon using the following command:
# /opt/VRTSsfmh/etc/vxdcli.sh start


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE