* * * READ ME * * * * * * Veritas Volume Manager 5.0.1 RP3 * * * * * * P-patch 7 * * * Patch Date: 2016-11-09 This document provides the following information: * PATCH NAME * OPERATING SYSTEMS SUPPORTED BY THE PATCH * PACKAGES AFFECTED BY THE PATCH * BASE PRODUCT VERSIONS FOR THE PATCH * SUMMARY OF INCIDENTS FIXED BY THE PATCH * DETAILS OF INCIDENTS FIXED BY THE PATCH * INSTALLATION PRE-REQUISITES * INSTALLING THE PATCH * REMOVING THE PATCH PATCH NAME ---------- Veritas Volume Manager 5.0.1 RP3 P-patch 7 OPERATING SYSTEMS SUPPORTED BY THE PATCH ---------------------------------------- HP-UX 11i v3 (11.31) PACKAGES AFFECTED BY THE PATCH ------------------------------ VRTSvxvm BASE PRODUCT VERSIONS FOR THE PATCH ----------------------------------- * Veritas Storage Foundation 5.0.1 * Veritas Storage Foundation Cluster File System 5.0.1 * Veritas Storage Foundation for Oracle 5.0.1 * Veritas Storage Foundation for Oracle RAC 5.0.1 * Veritas Storage Foundation HA 5.0.1 * Veritas Volume Manager 5.0.1 * Veritas Volume Replicator 5.0.1 SUMMARY OF INCIDENTS FIXED BY THE PATCH --------------------------------------- Patch ID: PHCO_44138, PHKL_44139 * 3199563 (2710579) Data corruption is observed on a Cross-platform Data Sharing (CDS) disk, as a part of the operations like LUN resize, Disk FLUSH, Disk ONLINE, and so on. * 3407771 (3263105) The disk evacuation operation (vxevac()) fails for a volume with Data Change Object (DCO). * 3439670 (2253210) When a device tree is added after a LUN is removed or added, the command "vxdisk scandisks" hangs. * 3532405 (3047470) The device /dev/vx/esd is not recreated on reboot with the latest major number, if it is already present on the system. * 3596436 (3596425) During patch installation, the checkinstall script fails with error "FC-COMMON.FC-SNIA filesets with revision B.11.31.1311 or Higher is required for VxVM to work correctly.". 
Patch ID: PHCO_43910, PHKL_43909 * 3375267 (2054319) When there are a large number of un-initialized disks under VxVM, the system boot requires more than 3 hours. * 3410940 (3390959) The vxconfigd(1M) daemon hangs in the kernel while processing the I/O request. * 3466269 (3461383) The vxrlink(1M) command fails when the "vxrlink -g -a att " command is executed. * 3467626 (2515070) When the I/O fencing is enabled, the Cluster Volume Manager (CVM) slave node may fail to join the cluster node. * 3470966 (3438271) The vxconfigd(1M) daemon may hang when new LUNs are added. * 3483635 (3482001) The 'vxddladm addforeign' command renders the system unbootable, after the reboot, for a few cases. Patch ID: PHCO_43579, PHKL_43580 * 2234292 (2152830) A diskgroup (DG) import fails with a non-descriptive error message when multiple copies (clones) of the same device exist and the original devices are either offline or not available. * 2973525 (2973522) At cable connect on port1 of dual-port Fibre Channel Host Bus Adapters (FC HBA), paths via port2 are marked as SUSPECT. * 2982087 (2976130) Multithreading of the vxconfigd (1M) daemon for HP-UX 11i v3 causes the DMP database to be deleted as part of the device-discovery commands. * 2983903 (2907746) File Descriptor leaks are observed with the device-discovery command of VxVM. * 3028911 (2390998) System panicked during SAN reconfiguration because of the inconsistency in dmp device open count. * 3047804 (2969844) The device discovery failure should not cause the DMP database to be destroyed completely. * 3059067 (1820179) "vxdctl debug " command dumps core if vxconfigd log file was modified when vxconfigd starts with "logfile=" option. * 3059139 (2979824) The vxdiskadm(1M) utility bug results in the exclusion of the unintended paths. * 3068265 (2994677) When the 'vxdisk scandisks' or 'vxdctl enable' commands are run the system panics. 
* 3072892 (2352517) The system panics while excluding a controller from Veritas Volume Manager (VxVM) view.
* 3121041 (2310284) In the Veritas Volume Manager (VxVM) versions prior to 5.1SP1, the Cross-Platform Data Sharing (CDS) disk initialization of Logical Unit Number (LUN) size greater than 1 TB may lead to data corruption.
* 3130376 (3130361) Prevent disk initialization of size greater than 1 TB for disks with the CDS format.
* 3139305 (3139300) Memory leaks are observed in the device discovery code path of VxVM.
* 3315600 (3315534) The vxconfigd(1M) daemon dumps core during start-up from the /sbin/pre_init_rc file, after it is switched to native multi-pathing.
* 3318945 (3248281) When the "vxdisk scandisks" or "vxdctl enable" commands are run consecutively, the "VxVM vxdisk ERROR V-5-1-0 Device discovery failed." error is encountered.
* 3369341 (3325371) Panic occurs in the vol_multistepsio_read_source() function when snapshots are used.

Patch ID: PHCO_43185, PHKL_43186

* 2575150 (2617277) Man pages for the vxautoanalysis and vxautoconvert commands are missing from the base package.
* 2631371 (2431470) "vxdisk set" command operates on a wrong VxVM device and does not work with DA (Disk Access) name correctly.
* 2674273 (2647975) Serial Split Brain (SSB) condition caused Cluster Volume Manager (CVM) Master Takeover to fail.
* 2753970 (2753954) When a cable is disconnected from one port of a dual-port FC HBA, the paths via another port are marked as SUSPECT PATH.
* 2779347 (2779345) Data corruption seen. CDS backup label signature seen within PUBLIC region data. OR Creation of a volume failed on a disk indicating insufficient space available.
* 2846151 (2108993) "vxdisk list" command can't show newly added disks, dmp_do_reconfig() reports error.
* 2860459 (2257850) vxdiskadm leaks memory while performing operations related to enclosures.
* 2860462 (1874383) Memory leak in req_ddl_reconfig_fabric().
* 2860464 (1675599) Excluding and including a LUN in a loop triggers a huge memory leak for vxconfigd when EMC PowerPath is configured.
* 2872617 (2413763) Uninitialized memory read results in a vxconfigd core dump.
* 2872633 (2680482) Startup scripts use 'quit' instead of 'exit', causing empty directories in /tmp.
* 2872635 (2792748) Node join fails because of closing of wrong file descriptor.
* 2872916 (2801962) Growing a volume takes a significantly long time when the volume has a version 20 DCO attached to it.
* 2913666 (2000661) The enhanced noreonline needs to take care of diskgroup rename operation explicitly.
* 2913669 (1959513) The shared diskgroup import times increase drastically with more number of disks attached to the CVM cluster.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: PHCO_44138, PHKL_44139

* 3199563 (Tracking ID: 2710579)

SYMPTOM:
When data corruption is observed on a Cross-platform Data Sharing (CDS) disk, as a part of operations such as LUN resize, Disk FLUSH, Disk ONLINE, and so on, the following pattern is found in the data region of the disk:

cyl alt 2 hd sec

DESCRIPTION:
The CDS disk maintains a Sun Volume Table of Contents (VTOC) in block zero and a backup label at the end of the disk. The VTOC maintains the disk geometry information, such as the number of cylinders, tracks, and sectors per track. The backup label is a duplicate of the VTOC, and the backup label location is determined from the VTOC contents. As part of the resize, the VTOC is not updated to the new size, which results in a wrong calculation of the backup label location. If the wrongly calculated backup label location falls in the public data region instead of at the end of the disk as designed, data corruption occurs.

RESOLUTION:
The code is modified so that the writing of the backup label is suppressed, to prevent the data corruption.
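
The arithmetic behind this failure can be sketched as follows. This is a simplified model, not VxVM code: it assumes the backup label sits in the last sector implied by the VTOC geometry, and the function name, geometry values, and public-region bounds are all hypothetical.

```python
def backup_label_sector(ncyl, nhead, nsect):
    """Last sector implied by a (cylinders, heads, sectors-per-track) geometry.

    Simplifying assumption: the backup label occupies the final sector of the
    disk as described by the VTOC. Real label layouts are more involved.
    """
    return ncyl * nhead * nsect - 1


# Hypothetical geometry stamped in the VTOC before the LUN was resized.
stale_vtoc = dict(ncyl=1000, nhead=16, nsect=64)
# Hypothetical geometry after the resize (the VTOC was never updated to this).
resized = dict(ncyl=2000, nhead=16, nsect=64)

# Hypothetical public region on the resized disk, in sectors.
public_start, public_end = 2048, 2_000_000

stale_location = backup_label_sector(**stale_vtoc)
correct_location = backup_label_sector(**resized)

# With the stale VTOC, the computed label location falls inside the
# public data region instead of at the end of the resized disk.
label_hits_data = public_start <= stale_location <= public_end
print(stale_location, correct_location, label_hits_data)
```

In this model, a write to the stale location would overwrite user data, which is why suppressing the backup label write avoids the corruption.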

* 3407771 (Tracking ID: 3263105)

SYMPTOM:
The disk evacuation operation (vxevac()) fails for a volume with Data Change Object (DCO).

DESCRIPTION:
During the vxevac() operation, all sub-disks of a volume that reside on the specified source disk are moved to the destination disk. As part of the disk evacuation of the volume, the sub-disks of its corresponding Data Change Log (DCL) volume are also moved. The vxevac() operation then tries again to move the sub-disks of the DCL volume. This fails because the sub-disks are already moved.

RESOLUTION:
The code is modified to exclude the DCL volumes for which the disk evacuation is already completed as a part of the parent-volume disk evacuation.

* 3439670 (Tracking ID: 2253210)

SYMPTOM:
When a device tree is added after a LUN is removed or added, the command "vxdisk scandisks" hangs. The following stack trace is observed:

slpq_swtch_core()
inline real_sleep()
sleep()
vxvm_delay()
vol_disk_change_iopolicy()
volconfig_ioctl()
volsioctl_real()
volsioctl()
vols_ioctl()
spec_ioctl()
vno_ioctl()
ioctl()
syscall()

DESCRIPTION:
When an I/O from DMP fails with a specific error code, the pending I/O count on the DMP node, which was incremented earlier, is decremented. In this particular hang, the error code is not set properly because of the handling of EFI devices that do not have an s2 partition. As a result, the pending I/O count on the DMP node is not decremented. This leads to the "vxconfigd" hang, which in turn leads to the "vxdisk scandisks" hang.

RESOLUTION:
The code is modified to set the appropriate error code, which takes care of decrementing the pending I/O count.

* 3532405 (Tracking ID: 3047470)

SYMPTOM:
Cannot deport the disk group as the device /dev/vx/esd is not recreated on reboot with the latest major number. Consider the problem scenario as follows: The old device /dev/vx/esd has a major number which is now re-assigned to the vol driver.
As the device /dev/vx/esd has same major number as that of the vol driver, it may disallow a disk group deport with the following message, until the vxesd(1M) daemon is stopped: VxVM vxdg ERROR V-5-1-584 Disk group XXX: Some volumes in the disk group are in use DESCRIPTION: If the device /dev/vx/esd is present in the system with an old major number, then the mknod(1M) command in the startup script fails to recreate the device with the new major number. This leads to change in the functionality. RESOLUTION: The code is modified to delete the device under /dev/vx/esd, before the mknod (1M) command in the startup script, so that it gets recreated with the latest major number. * 3596436 (Tracking ID: 3596425) SYMPTOM: While upgrading from HP-UX September 2010 release to OE containing PHCO_43824, following error may be seen in /var/adm/sw/swm.log file: * Running "checkinstall" for "PHCO_43824, r=1.0". NOTE: Command output: - /var/opt/swm/tmp/swmZ3Hj3uh/SDcat2tGfRnd/catalog/PHCO_43824/pfiles/checkinstall[8]: 1009 01: Syntax error ERROR: FC-COMMON.FC-SNIA filesets with revision B.11.31.1311 or Higher is required for VxVM to work correctly. ERROR: The "checkinstall" for "PHCO_43824, r=1.0" failed (exit code "1"). The script location was "/var/opt/swm/tmp/swmZ3Hj3uh/SDcat2tGfRnd/catalog/ PHCO_43824/pfiles/checkinstall". DESCRIPTION: The checkinstall script was wrongly parsing the revision string for FC-COMMON product which was resulting into this syntax error. RESOLUTION: The checkinstall script has been modified to correctly parse the revision string. Patch ID: PHCO_43910, PHKL_43909 * 3375267 (Tracking ID: 2054319) SYMPTOM: When there are a large number of uninitialized disks under VxVM, the system boot requires more than 3 hours. DESCRIPTION: As part of the "vold" startup, VxVM creates multiple threads. Each thread is online with only one disk at a time. Since the disks are uninitialized, VxVM checks if the disks have any file system. 
The statvfsdev() function is called on the DMP device path to perform this activity. The statvfsdev() function calls a few IOCTLs, which fail. DMP checks the state of the path by performing an inquiry. Before finding the state of the path, DMP determines whether the disks are non-SCSI. The io_search() function is called on the instance number of the "lunpath" to determine if the disks are non-SCSI. Calls to the io_search() function in the multithreaded environment get serialized over the gio_spinlock() function. As a result, the online activity takes a long time to execute.

RESOLUTION:
The code is modified so that the io_search() function call that determines whether the disk is SCSI is skipped, because only SCSI devices are supported on HP-UX under VxVM control. This prevents contention among the threads.

* 3410940 (Tracking ID: 3390959)

SYMPTOM:
The vxconfigd(1M) daemon hangs in the kernel while processing the I/O request. The following stack trace is observed:

slpq_swtch_core()
sleep_pc()
biowait()
physio()
dmpread()
spec_rdwr()
vno_rw()
read()
syscall()

DESCRIPTION:
The vxconfigd(1M) daemon hangs while processing the I/O request. The "dmp_close_path" failure message is displayed in the syslog before the hang. Based on the probable cause analysis, the failure message is displayed in the syslog during the path closure, which is related to the hang observed. Also, if the I/O fails on a path, the "iocount" is not decremented properly.

RESOLUTION:
The code is modified to add some debug messages, to confirm the current probable cause analysis when this issue occurs. Also, if the I/O fails on a path, the "iocount" is now decremented properly.

* 3466269 (Tracking ID: 3461383)

SYMPTOM:
The vxrlink(1M) command fails when the "vxrlink -g -a att " command is executed.
On PA machines the following error message is displayed:

VxVM VVR vxrlink ERROR V-5-1-5276 Cannot open shared library/usr/lib/libvrascmd.sl, error: Can't dlopen() a library containing Thread Local Storage: /usr/lib/libvrascmd.sl

DESCRIPTION:
To make "vxconfigd" and other VxVM binaries thread safe, these binaries are now linked with HP's thread-safe "libIOmt" library. The vxrlink(1M) command opens a shared library, which is linked with the thread-safe "libIOmt" library. There is a limitation on HP-UX that a shared library that contains Thread Local Storage (TLS) cannot be loaded dynamically. This results in an error.

RESOLUTION:
The code is modified so that the library that is dynamically loaded through the vxrlink(1M) command is not linked with the "libIOmt" library, as the vxrlink(1M) command and the library do not invoke any routines from the "libIOmt" library.

* 3467626 (Tracking ID: 2515070)

SYMPTOM:
When the I/O fencing is enabled, the Cluster Volume Manager (CVM) slave node may fail to join the cluster node. The following error message is displayed:

In Veritas Cluster Server (VCS) engine log:
VCS ERROR V-16-20006-1005 (abc) CVMCluster:cvm_clus:monitor:node - state: out of cluster reason: SCSI-3 PR operation failed: retry to add a node failed

In syslog:
V-5-1-15908 Import failed for dg ebap01dg. Local node has data disk fencing enabled, but master does not have PGR key set

DESCRIPTION:
Whenever a fix transaction is performed during disk group (DG) import, the SCSI-3 Persistent Group Reservation (PGR) key is not uploaded from "vxconfigd" to the kernel DG record. This leads to a NULL PGR key in the kernel DG record. If "vxconfigd" is subsequently restarted, it reads the DG configuration record from the kernel. This leads to the PGR key being NULL in "vxconfigd" also. This configuration record is sent to the node when it wants to join the cluster.
The slave node fails to import the shared DG because of the missing PGR key, and therefore it fails to join the cluster node.

RESOLUTION:
The code is modified to copy the disk group PGR key from "vxconfigd" to the kernel when a new disk group record is loaded to the kernel.

* 3470966 (Tracking ID: 3438271)

SYMPTOM:
The VxVM commands may hang when new LUNs are added and device discovery is performed. Subsequent VxVM commands that request information from the vxconfigd(1M) daemon may also hang. The "vxconfigd" stack trace is as follows:

swtch_to_thread ()
slpq_swtch_core ()
sleep_pc ()
biowait_rp ()
biowait ()
dmp_indirect_io ()
gendmpioctl ()
dmpioctl ()
spec_ioctl ()
vno_ioctl ()
ioctl ()
syscall ()
syscallinit ()

DESCRIPTION:
When new LUNs are added, device discovery is performed, during which certain operations are requested from the vxconfigd(1M) daemon. If the LUN to be added is found to be non-SCSI or faulty, the vxconfigd(1M) daemon may hang.

RESOLUTION:
The code is modified to avoid the hang in the vxconfigd(1M) daemon and the subsequent VxVM commands.

* 3483635 (Tracking ID: 3482001)

SYMPTOM:
The 'vxddladm addforeign' command renders the system unbootable, after the reboot, for a few cases.

DESCRIPTION:
As part of the execution of the 'vxddladm addforeign' command, VxVM incorrectly identifies the specified disk as the root disk. As a result, it replaces all the entries pertaining to the 'root disk' with the entries of the specified disk, thus rendering the system unbootable.

RESOLUTION:
The code is modified to detect the root disk appropriately when it is specified as a part of the 'vxddladm addforeign' command.

Patch ID: PHCO_43579, PHKL_43580

* 2234292 (Tracking ID: 2152830)

SYMPTOM:
A diskgroup (DG) import fails with a non-descriptive error message when multiple copies (clones) of the same device exist and the original devices are either offline or not available.
For example:

# vxdg import mydg
VxVM vxdg ERROR V-5-1-10978 Disk group mydg: import failed: No valid disk found containing disk group

DESCRIPTION:
If the original devices are offline or unavailable, the vxdg(1M) command picks up cloned disks for import. DG import fails unless the clones are tagged and the tag is specified during the DG import. The import failure is expected, but the error message is non-descriptive and does not specify the corrective action to be taken by the user.

RESOLUTION:
The code is modified to give the correct error message when duplicate clones exist during import. Also, details of the duplicate clones are reported in the system log.

* 2973525 (Tracking ID: 2973522)

SYMPTOM:
At cable connect on one of the ports of a dual-port Fibre Channel Host Bus Adapter (FC HBA), paths that go through the other port are marked as SUSPECT. DMP does not issue I/O on such paths until the next restore daemon cycle confirms that the paths are functioning.

DESCRIPTION:
When a cable is connected at one of the ports of a dual-port FC HBA, an HBA Registered State Change Notification (RSCN) event occurs on the other port. When the RSCN event occurs, DMP marks the paths that go through that port as SUSPECT.

RESOLUTION:
The code is modified so that, when the RSCN event occurs, the paths that go through the other port are not marked as SUSPECT.

* 2982087 (Tracking ID: 2976130)

SYMPTOM:
The device-discovery commands such as "vxdisk scandisks" and "vxdctl enable" may cause the entire DMP database to be deleted. This causes VxVM I/O errors and causes file systems to be disabled. For instances where VxVM manages the root disk(s), a system hang occurs. In a Serviceguard/SGeRAC environment integrated with CVM and/or CFS, VxVM I/O failures would typically lead to a Serviceguard INIT and/or a CRS TOC (if the voting disks sit on VxVM volumes).
Syslog shows the removal of arrays from the DMP database as follows:

vmunix: NOTICE: VxVM vxdmp V-5-0-0 removed disk array 000292601518, datype = EMC

This is in addition to messages indicating that VxVM I/O errors occurred and that file systems are disabled.

DESCRIPTION:
VxVM's vxconfigd(1M) daemon uses HP's libIO(3X) APIs, such as the io_search() and io_search_array() functions, to claim devices that are attached to the host. Although vxconfigd(1M) is multithreaded, it uses a non-thread-safe version of the libIO(3X) APIs. A race condition may occur when multiple vxconfigd threads perform device discovery. This results in a NULL value being returned from the libIO(3X) API call. VxVM interprets the NULL return value as an indication that no devices are attached and proceeds to delete all the previously claimed devices from the DMP database.

RESOLUTION:
The vxconfigd(1M) daemon, as well as the event source daemon vxesd(1M), is now linked with HP's thread-safe libIO(3X) library. This prevents the race condition among multiple vxconfigd threads that perform device discovery. Please refer to HP's customer bulletin c03585923 for a list of other software components required for a complete

* 2983903 (Tracking ID: 2907746)

SYMPTOM:
At device discovery, the vxconfigd(1M) daemon allocates file descriptors for open instances of "/dev/config", but does not always close them after use. This results in a file descriptor leak over time.

DESCRIPTION:
Before any API of the "libIO" library is called, the io_init() function needs to be called. This function opens the "/dev/config" device file. Each io_init() function call should be paired with an io_end() function call, which closes the "/dev/config" device file. However, the io_end() function call is missing at some places in the device discovery code path. As a result, file descriptor leaks are observed with the device-discovery command of VxVM.
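
The pairing requirement described above can be illustrated with a small sketch. It is written in Python for brevity and uses hypothetical stand-ins (`io_init`, `io_end`, `libio_session`, `discover`) that only count open handles; they are not the libIO(3X) API and do not open "/dev/config".

```python
from contextlib import contextmanager

open_handles = 0  # stands in for open file descriptors on /dev/config


def io_init():
    """Hypothetical stand-in: 'opens' the device file."""
    global open_handles
    open_handles += 1


def io_end():
    """Hypothetical stand-in: 'closes' the device file."""
    global open_handles
    open_handles -= 1


@contextmanager
def libio_session():
    """Guarantee that every io_init() is matched by an io_end()."""
    io_init()
    try:
        yield
    finally:
        io_end()  # runs on normal exit, early return, and exceptions


def discover(fail):
    """Hypothetical discovery step that may fail part-way through."""
    with libio_session():
        if fail:
            raise RuntimeError("discovery failed")
        return "ok"


discover(fail=False)
try:
    discover(fail=True)
except RuntimeError:
    pass

print(open_handles)  # 0: no handle leaked on either path
```

The point of the sketch is that the release step is tied to the acquire step structurally, so an error path cannot skip it.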

RESOLUTION:
The code is modified to pair each io_init() function call with an io_end() function call in every possible code path.

* 3028911 (Tracking ID: 2390998)

SYMPTOM:
When the 'vxdctl enable' or 'vxdisk scandisks' command is run after configuration changes in SAN ports, the system panics with the following stack trace:

.disable_lock()
dmp_close_path()
dmp_do_cleanup()
dmp_decipher_instructions()
dmp_process_instruction_buffer()
dmp_reconfigure_db()
gendmpioctl()
vxdmpioctl()

DESCRIPTION:
After configuration changes in SAN ports, the configuration in VxVM also needs to be updated. In the reconfiguration process, VxVM may temporarily have both the old DMP path nodes and the new DMP path nodes, which have the same device number, in order to migrate the old ones to the new ones. VxVM maintains two types of open count to avoid platform dependency. However, when the old DMP path nodes are opened or closed while the migration is in progress, VxVM calculates the open counts incorrectly: it updates one open count in the new node and the other open count in the old node. This results in inconsistent open counts on the node and causes a panic when the open counts are checked.

RESOLUTION:
The code is modified to maintain the open counts on the same DMP path node database correctly while performing a DMP device open or close.

* 3047804 (Tracking ID: 2969844)

SYMPTOM:
The DMP database gets destroyed if the discovery fails for some reason. The "ddl.log" file shows numerous entries as follows:

DESTROY_DMPNODE: 0x3000010 dmpnode is to be destroyed/freed
DESTROY_DMPNODE: 0x3000d30 dmpnode is to be destroyed/freed

Numerous vxio errors are seen in the syslog as all VxVM I/Os fail afterwards.

DESCRIPTION:
VxVM deletes the old device database before it makes the new device database. If the discovery process fails for some reason, this results in a null DMP database.

RESOLUTION:
The code is modified to take a backup of the old device database before doing the new discovery. If the discovery fails, the old database is restored and an appropriate message is displayed on the console.

* 3059067 (Tracking ID: 1820179)

SYMPTOM:
The "vxdctl debug " (1M) command dumps core and the following stack trace is displayed:

strcpy.strcpy()
xfree_internal()
msg_logfile_disable()
req_vold_debug()
request_loop()
main()

DESCRIPTION:
The vxconfigd(1M) command uses static memory for storing the logfile information when it is started with logfile=. This logfile can be changed with the vxdctl(1M) command. The "vxdctl debug" (1M) command uses dynamically allocated memory for storing the path information. If the default debug level is changed using the vxdctl(1M) command, the memory allocated for the path information is freed on the assumption that it was created by the vxdctl(1M) debug command. This results in a core dump.

RESOLUTION:
The code is modified to allocate dynamic memory for storing the path information.

* 3059139 (Tracking ID: 2979824)

SYMPTOM:
While excluding the controller using the vxdiskadm(1M) utility, unintended paths get excluded.

DESCRIPTION:
The issue occurs due to a logical error related to the grep command, when the hardware path of the controller to be excluded is retrieved. In some cases, the vxdiskadm(1M) utility takes the wrong hardware path for the controller that is excluded, and hence excludes unintended paths. For example, suppose that there are two controllers, c189 and c18, that c189 is listed above c18 in the command, and that c18 is to be excluded. The hardware path of controller c189 is then passed to the function, which ends up excluding the wrong controller.

RESOLUTION:
The script is modified so that the vxdiskadm(1M) utility now takes the hardware path of the intended controller only, and the unintended paths do not get excluded.
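
The kind of matching error described for the controllers c189 and c18 can be sketched as follows. The sketch is in Python rather than the shell used by vxdiskadm(1M), and the listing format, hardware paths, and function names are hypothetical; it only contrasts a substring match with an exact match on the controller name.

```python
# Hypothetical controller listing: (controller name, hardware path).
# c189 is listed above c18, as in the scenario described above.
listing = [
    ("c189", "0/4/1/0.1"),
    ("c18", "0/3/1/0.2"),
]


def hwpath_substring(ctlr):
    """Buggy lookup: first entry whose name merely contains ctlr."""
    for name, hwpath in listing:
        if ctlr in name:
            return hwpath
    return None


def hwpath_exact(ctlr):
    """Fixed lookup: the name must match ctlr exactly."""
    for name, hwpath in listing:
        if name == ctlr:
            return hwpath
    return None


print(hwpath_substring("c18"))  # picks up c189's path
print(hwpath_exact("c18"))      # c18's own path
```

Because "c18" is a prefix of "c189", any first-match substring search returns the wrong entry whenever c189 is listed first; an exact or anchored comparison avoids this.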

* 3068265 (Tracking ID: 2994677)

SYMPTOM:
When the 'vxdisk scandisks' or 'vxdctl enable' commands are run, the system panics with the following stack trace:

panic_save_regs_switchstack+0x110 ()
panic+0x410 ()
bad_kern_reference+0xa0 ()
$cold_pfault+0x530 ()
vm_hndlr+0x12f0 ()
bubbleup+0x880 ()
dmp_decode_add_new_path+0x430 ()
dmp_decipher_instructions+0x490 ()
dmp_process_instruction_buffer+0x340 ()
dmp_reconfigure_db+0xc0 ()
gendmpioctl+0x920 ()
dmpioctl+0x100 ()
spec_ioctl+0xf0 ()
vno_ioctl+0x350 ()
ioctl+0x410 ()
syscall+0x5b0 ()

DESCRIPTION:
The problem occurs because the structure that causes the panic was not checked for a NULL value.

RESOLUTION:
The code is modified to check that the structure has a valid value, to prevent the system panic.

* 3072892 (Tracking ID: 2352517)

SYMPTOM:
Excluding a controller from Veritas Volume Manager (VxVM) using the "vxdmpadm exclude ctlr=" command causes the system to panic with the following stack trace:

gen_common_adaptiveminq_select_path
dmp_select_path
gendmpstrategy
voldiskiostart
vol_subdisksio_start
volkcontext_process
volkiostart
vxiostrategy
vx_bread_bp
vx_getblk_cmn
vx_getblk
vx_getmap
vx_getemap
vx_do_extfree
vx_extfree
vx_te_trunc_data
vx_te_trunc
vx_trunc_typed
vx_trunc_tran2
vx_trunc_tran
vx_trunc
vx_inactive_remove
vx_inactive_tran
vx_local_inactive_list
vx_inactive_list
vx_workitem_process
vx_worklist_process
vx_worklist_thread
thread_start

DESCRIPTION:
While excluding a controller from the VxVM view, all the paths must also be excluded. The panic occurs because the controller is excluded before the paths belonging to that controller are excluded. While excluding a path, the controller of that path, which is NULL, is accessed.

RESOLUTION:
The code is modified to exclude all the paths belonging to a controller before excluding the controller.
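
The ordering constraint in this fix can be sketched with a toy model. The classes and function names below are hypothetical and written in Python for illustration; they are not DMP data structures. The sketch only shows why each path must be excluded while its controller reference is still valid.

```python
class Controller:
    def __init__(self, name):
        self.name = name
        self.paths = []


class Path:
    def __init__(self, name, ctlr):
        self.name = name
        self.ctlr = ctlr  # back-reference used while excluding the path
        ctlr.paths.append(self)


def exclude_path(path, log):
    # Dereferences path.ctlr; this is where a missing controller would fault.
    log.append(f"exclude path {path.name} on {path.ctlr.name}")
    path.ctlr = None


def exclude_controller(ctlr, log):
    # Exclude every path first, while the controller is still valid.
    for path in list(ctlr.paths):
        exclude_path(path, log)
    ctlr.paths.clear()
    log.append(f"exclude controller {ctlr.name}")


log = []
c5 = Controller("c5")
Path("c5t0d0", c5)
Path("c5t0d1", c5)
exclude_controller(c5, log)
print(log)
```

If the controller were torn down first and each path's back-reference cleared, `exclude_path` would dereference `None`, which is the toy equivalent of the NULL access described above.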

* 3121041 (Tracking ID: 2310284)

SYMPTOM:
In the Veritas Volume Manager (VxVM) versions prior to 5.1SP1, the Cross-Platform Data Sharing (CDS) disk initialization of Logical Unit Number (LUN) size greater than 1 TB may lead to data corruption.

DESCRIPTION:
The Veritas Volume Manager (VxVM) versions prior to 5.1SP1 allow CDS-disk initialization of LUN size greater than 1 TB. A CDS disk of size greater than 1 TB can cause data corruption because the backup labels and the label IDs are written in the data or the public regions. VxVM should not allow CDS-disk initialization for a LUN size that is greater than 1 TB, to support CDS compatibility across platforms.

RESOLUTION:
The code is modified to restrict the vxdisksetup(1M) command from performing the CDS-disk initialization for a CDS disk of size greater than 1 TB. The changes in the vxdisk(1M) and vxresize(1M) commands would be available in the subsequent patch.

* 3130376 (Tracking ID: 3130361)

SYMPTOM:
Disks greater than 1 TB can be initialized with the Cross-Platform Data Sharing (CDS) format.

DESCRIPTION:
CDS formatted disks use the Sun Microsystems Label (SMI) for partitioning. The SMI partition table is capable of storing a partition size of up to 1 TB only. Earlier VxVM releases did not prevent initialization of disks greater than 1 TB with the CDS format.

RESOLUTION:
The code is modified so that VxVM explicitly prevents initialization of disks greater than 1 TB with the CDS format. Other VxVM utilities like 'vxvmconvert', 'vxcdsconvert', and 'DLE (vxdisk resize)' either explicitly fail the operation, if the environment involves disks greater than 1 TB, or use the HPDISK format wherever possible.

* 3139305 (Tracking ID: 3139300)

SYMPTOM:
At device discovery, the vxconfigd(1M) daemon allocates memory but does not release it after use, causing a user memory leak.
The Resident Memory Size (RSS) of the vxconfigd(1M) daemon thus keeps growing and may reach maxdsiz(5) in the extreme case that causes the vxconfigd(1M) daemon to abort. DESCRIPTION: At some places in the device discovery code path, the buffer is not freed. This results in memory leaks. RESOLUTION: The code is modified to free the buffers. * 3315600 (Tracking ID: 3315534) SYMPTOM: The vxconfigd(1M) daemon dumps core during system start-up on the IA machine when the boot is performed from VxVM ROOT device under Native Multi-Pathing (NMP) control. This causes boot failure as the root disk group 'rootdg' fails to get imported. With the vxconfigd(1M) daemon's core file the following stack trace is observed: dg_creat_tempdb mode_set setup_mode startup main DESCRIPTION: When the disk group 'rootdg' which resides on disks that belong to NMP on IA machines is imported, the vxconfigd(1M) daemon does not select the correct DSF for the BOOT device. Thereby, it ignores the partition component. This causes I/O errors when the root disk group is imported. RESOLUTION: The code is modified to use the correct public and private disk partitions during boot-up. * 3318945 (Tracking ID: 3248281) SYMPTOM: When the "vxdisk scandisks" or "vxdctl enable" commands are run consecutively, an error is displayed as following: VxVM vxdisk ERROR V-5-1-0 Device discovery failed. DESCRIPTION: The device discovery failure occurs because in some cases the variable that is passed to the OS specific function is not set properly. RESOLUTION: The code is modified to set the correct variable before the variable is passed to the OS specific function. * 3369341 (Tracking ID: 3325371) SYMPTOM: Panic occurs in the vol_multistepsio_read_source() function when VxVM's FastResync feature is used. 
The stack trace observed is as following: vol_multistepsio_read_source() vol_multistepsio_start() volkcontext_process() vol_rv_write2_start() voliod_iohandle() voliod_loop() kernel_thread() DESCRIPTION: When a volume is resized, Data Change Object (DCO) also needs to be resized. However, the old accumulator contents are not copied into the new accumulator. Thereby, the respective regions are marked as invalid. Subsequent I/O on these regions triggers the panic. RESOLUTION: The code is modified to appropriately copy the accumulator contents during the resize operation. Patch ID: PHCO_43185, PHKL_43186 * 2575150 (Tracking ID: 2617277) SYMPTOM: Man pages missing for the vxautoanalysis and vxautoconvert commands. DESCRIPTION: The man pages for the vxautoanalysis and vxautoconvert commands are missing from the base package. RESOLUTION: Added the man pages for vxautoanalysis(1M) and vxautoconvert(1M) commands. * 2631371 (Tracking ID: 2431470) SYMPTOM: 1. "vxpfto" command sets PFTO(Powerfail Timeout) value on a wrong VxVM device when it passes DM(Disk Media) name to the "vxdisk set" command with -g option. 2. "vxdisk set" command does not work when DA name is specified either -g option specified or not. Ex.) # vxdisk set [DA name] clone=off VxVM vxdisk ERROR V-5-1-5455 Operation requires a disk group # vxdisk -g [DG name] set [DA name] clone=off VxVM vxdisk ERROR V-5-1-0 Device [Da name] not in configuration or associated with DG [DG name] DESCRIPTION: 1. "vxpfto" command invokes "vxdisk set" command to set the PFTO value. It shall accept both DM and DA names for device specification. However DM and DA names can have conflicts such that even within the same disk group, the same name can refer to different devices - one as a DA name and another as a DM name. "vxpfto" command uses a DM name with -g option when invoking the "vxdisk set" command but it will choose a matching DA name before a DM name. This causes incorrect device to be acted upon. 
Both DM and DA names can be specified for the "vxdisk set" command with the -g option; however, by design, the DM name is given preference when the -g option is specified.
2. The "vxdisk set" command should accept a DA name for device specification. Without the -g option, the command should work only when a DA name is specified. However, it does not work because the disk group name is not extracted correctly from the DA record, hence the first error. With the -g option, the specified DA name is wrongly treated as a matching DM name, hence the second error.

RESOLUTION:
The code is modified so that the "vxdisk set" command works correctly on a DA name without the -g option and on both DM and DA names with the -g option. The DM name is given preference when the -g option is specified. This also resolves the "vxpfto" command issue.

* 2674273 (Tracking ID: 2647975)

SYMPTOM:
A Serial Split Brain (SSB) condition causes the Cluster Volume Manager (CVM) Master Takeover to fail. The following vxconfigd debug output was seen when the issue occurred:

VxVM vxconfigd NOTICE V-5-1-7899 CVM_VOLD_CHANGE command received
V-5-1-0 Preempting CM NID 1
VxVM vxconfigd NOTICE V-5-1-9576 Split Brain. da id is 0.5, while dm id is 0.4 for dm cvmdgA-01
VxVM vxconfigd WARNING V-5-1-8060 master: could not delete shared disk groups
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgA: Disabled by errors
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgB: Disabled by errors
...
VxVM vxconfigd ERROR V-5-1-11467 kernel_fail_join() : Reconfiguration interrupted: Reason is transition to role failed (12, 1)
VxVM vxconfigd NOTICE V-5-1-7901 CVM_VOLD_STOP command received

DESCRIPTION:
When the new CVM master detects a Serial Split Brain (SSB) condition on Veritas Volume Manager (VxVM) versions 5.0 and 5.1, the default CVM behaviour causes the new CVM master to leave the cluster, which results in cluster-wide downtime.
RESOLUTION:
The code is modified so that when SSB is detected in a diskgroup, CVM disables only that particular diskgroup and keeps the other diskgroups imported during the CVM Master Takeover. With the fix applied, the new CVM master does not leave the cluster.

* 2753970 (Tracking ID: 2753954)

SYMPTOM:
When a cable is disconnected from one port of a dual-port FC HBA, only the paths that go through that port should be marked as SUSPECT. However, the paths that go through the other port are also marked as SUSPECT.

DESCRIPTION:
Disconnecting a cable from an HBA port generates an FC event. When the event is generated, the paths of all ports of the corresponding HBA are marked as SUSPECT.

RESOLUTION:
The code is modified to mark only the paths that go through the port on which the FC event is generated.

* 2779347 (Tracking ID: 2779345)

SYMPTOM:
1. Creation of a volume on a disk fails, indicating that insufficient space is available.
2. Data corruption is seen. The CDS backup label signature is seen within the PUBLIC region data.

DESCRIPTION:
A Cross-platform Data Sharing (CDS) disk uses the Solaris VTOC as its platform block. When a disk is initialized as CDS, the geometry obtained from SCSI MODE-SENSE or from the fake geometry algorithm is written to the VTOC. After operations such as BCV cloning or a firmware upgrade, the geometry obtained from MODE-SENSE or the fake geometry algorithm can differ from the stamped geometry. If the disk size obtained using the geometry stored on the disk label is less than the disk size obtained from MODE-SENSE/fake geometry, an operation such as creating a volume might fail due to insufficient space. If the disk size using the fake geometry is greater than the disk size obtained from the MODE-SENSE geometry, data corruption might occur if the CDS backup label is written by an operation such as "vxdisk flush".

RESOLUTION:
The geometry on the disk is now honored; that is, the geometry used during initialization is used for operations such as "vxdisk flush".
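The size mismatch described for incident 2779347 can be illustrated with a small shell sketch. This is not VxVM code: the function names and the sample geometry values are hypothetical, and the sketch assumes only that the capacity in sectors is cylinders x heads x sectors per track.

```sh
#!/bin/sh
# Hypothetical illustration only; none of these functions are part of VxVM.

# geometry_sectors CYLINDERS HEADS SECTORS_PER_TRACK
# Prints the capacity, in sectors, implied by a disk geometry.
geometry_sectors() {
    echo $(( $1 * $2 * $3 ))
}

# compare_geometry LABEL_SECTORS PROBED_SECTORS
# Classifies the size implied by the geometry stamped in the disk label
# against the size implied by a freshly probed (MODE-SENSE or fake) geometry.
compare_geometry() {
    if [ "$1" -eq "$2" ]; then
        echo "match"
    elif [ "$1" -lt "$2" ]; then
        echo "label-smaller"
    else
        echo "label-larger"
    fi
}

# Sample values (assumptions): the label was stamped with one geometry and
# a later probe, for example after a firmware upgrade, reports another.
label=$(geometry_sectors 1000 16 63)
probed=$(geometry_sectors 1024 16 63)
echo "label=$label probed=$probed result=$(compare_geometry "$label" "$probed")"
```

With these sample values the label geometry implies the smaller disk, which is the case in which the incident reports that volume creation might fail for lack of space.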
* 2846151 (Tracking ID: 2108993)

SYMPTOM:
When an existing path or LUN is removed and a new LUN is added at the same time, running 'vxdisk scandisks' causes dmp_do_reconfig() to report an error, and device discovery is aborted. The following error message is displayed:

VxVM vxconfigd DEBUG V-5-1-0 dmp_do_reconfig: DMP_RECONFIGURE_DB failed: No such file or directory

The 'vxdisk list' command does not show the newly added disks.

DESCRIPTION:
The DMP (Dynamic Multi-Pathing) code did not handle the scenario in which a newly added device reuses the device numbers of removed devices.

RESOLUTION:
The DMP driver is modified to handle this case.

* 2860459 (Tracking ID: 2257850)

SYMPTOM:
A memory leak is observed when vxdiskadm accesses information about an enclosure.

DESCRIPTION:
The memory allocated locally for a data structure that keeps information about the array-specific attributes is not freed.

RESOLUTION:
The code is modified to free this memory.

* 2860462 (Tracking ID: 1874383)

SYMPTOM:
The vxconfigd daemon leaks memory during DMP device reconfiguration.

DESCRIPTION:
In the vxconfigd code, the memory allocated for the data structures related to device reconfiguration is not freed during device reconfiguration, which leads to a memory leak.

RESOLUTION:
The memory is released after its scope is over.

* 2860464 (Tracking ID: 1675599)

SYMPTOM:
The vxconfigd daemon leaks memory while a Third Party Driver (TPD) controlled LUN is excluded and included in a loop. As a result, vxconfigd loses its license information and the following error is seen in the system log:

"License has expired or is not available for operation"

DESCRIPTION:
In the vxconfigd code, the memory allocated for various data structures related to the device discovery layer is not freed, which leads to a memory leak.

RESOLUTION:
The memory is released after its scope is over.
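Leaks such as those in incidents 2860459, 2860462, and 2860464 typically show up as a process size that keeps growing while the triggering operation is repeated. The following shell sketch shows one way to check a series of size samples for that pattern. It is an illustration, not a VxVM tool: the function name and the sample values are hypothetical, and how the samples are collected (for example with ps(1)) is left as an assumption.

```sh
#!/bin/sh
# Hypothetical helper; not part of VxVM.

# is_growing SAMPLE...
# Returns 0 if there are at least two samples and each sample is strictly
# larger than the one before it; returns 1 otherwise.
is_growing() {
    [ "$#" -ge 2 ] || return 1
    prev=$1
    shift
    for cur in "$@"; do
        [ "$cur" -gt "$prev" ] || return 1
        prev=$cur
    done
    return 0
}

# Sample sizes (assumed values), taken after successive repetitions of the
# operation that is suspected of leaking.
if is_growing 41200 41960 42712 43480; then
    echo "size grows on every sample: possible leak"
else
    echo "no steady growth"
fi
```

Strict growth on every sample is only a hint; a process can also grow for legitimate reasons, so a result like this is a reason to investigate rather than proof of a leak.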
* 2872617 (Tracking ID: 2413763)

SYMPTOM:
The vxconfigd daemon dumps core with the following stack:

ddl_fill_dmp_info
ddl_init_dmp_tree
ddl_fetch_dmp_tree
ddl_find_devices_in_system
find_devices_in_system
mode_set
setup_mode
startup
main
__libc_start_main
_start

DESCRIPTION:
The Dynamic Multi-Pathing (DMP) node buffer declared in the Device Discovery Layer (DDL) is not initialized. Because the node buffer is local to the function, it must be initialized explicitly before another buffer is copied into it.

RESOLUTION:
The node buffer is initialized using memset() to address the core dump.

* 2872633 (Tracking ID: 2680482)

SYMPTOM:
After system startup, many randomly named directories, such as vx.$RANDOM.$RANDOM.$RANDOM.$$, are left in /tmp/ and are not cleaned up.

DESCRIPTION:
The startup scripts should call quit(), which performs the cleanup when errors are detected. Instead, the scripts called exit() directly, which left some of the randomly named directories behind.

RESOLUTION:
The scripts are restored to call quit() instead of calling exit() directly.

* 2872635 (Tracking ID: 2792748)

SYMPTOM:
In an HP-UX cluster environment, the slave join fails with the following error message in syslog:

VxVM vxconfigd ERROR V-5-1-5784 cluster_establish:kernel interrupted vold on overlapping reconfig.

DESCRIPTION:
During the join, the slave node performs a disk group import. As part of the import, the file descriptor pertaining to "Port u" is closed because the return value of open() is assigned incorrectly. Hence, the subsequent write to the same port returns EBADF.

RESOLUTION:
The code is modified to avoid closing the wrong file descriptor.

* 2872916 (Tracking ID: 2801962)

SYMPTOM:
Operations that grow a volume, including 'vxresize' and 'vxassist growby/growto', take significantly longer when the volume has a version 20 DCO (Data Change Object) attached than when the volume has no DCO attached.
DESCRIPTION:
When a volume with a DCO is grown, the existing map in the DCO must be copied and the map must be updated to track the grown regions. For each region in the map, the algorithm searched for the page that contains that region in order to update the map. The number of regions and the number of pages that contain them are both proportional to the volume size, so the search cost is amplified and becomes noticeable primarily when the volume size is of the order of terabytes. In the reported instance, it took more than 12 minutes to grow a 2.7 TB volume by 50 GB.

RESOLUTION:
The code is enhanced to find the regions that are contained within a page, which avoids looking up the page separately for each of those regions.

* 2913666 (Tracking ID: 2000661)

SYMPTOM:
The enhanced noreonline option must handle the diskgroup rename operation explicitly.

DESCRIPTION:
While renaming a shared diskgroup, the master node updates the on-disk information to reflect the rename before the actual import. With NOREONLINE propagated to the slaves, the slaves do not refresh the in-core information that belongs to the disks involved in the import operation. When the import operation on the slaves tries to verify the (new) diskgroup information sent by the master against the information available in-core, the two do not match.

RESOLUTION:
The shared diskgroup import operation explicitly re-onlines the disks selected for import, so that the latest information (if updated by the master) is available to the slave nodes even if NOREONLINE is specified.

* 2913669 (Tracking ID: 1959513)

SYMPTOM:
Shared diskgroup import times increase drastically as more disks are attached to the CVM cluster.

DESCRIPTION:
As part of a regular shared diskgroup import, first the master node and then all slave nodes re-online all the disks attached to the cluster that are not part of any imported diskgroup.
As more disks are attached to the cluster, the re-online takes longer, which also increases the import time. The diskgroup import command supports the -o noreonline option to skip the re-online of disks on the master node; however, the option is not propagated to the slave nodes. Hence, import times remain high even when this option is specified.

RESOLUTION:
The option to skip the re-online of disks during import is now propagated to the slave nodes. The re-online is therefore skipped on the entire cluster, which significantly reduces shared diskgroup import times.

INSTALLING THE PATCH
--------------------
a) VxVM 5.0.1 (GA) version 5.0.31.5 or version 5.0.31.6 must be installed before applying these patches.
b) All prerequisite/corequisite patches must be installed. The kernel patch requires a system reboot for both installation and removal.
c) To install the patch, enter the following command:
# swinstall -x autoreboot=true -s <patch_directory> PHCO_44138 PHKL_44139
If the patch is not registered, register it using the following command:
# swreg -l depot <patch_directory>
where <patch_directory> is the absolute path where the patch resides.
d) After installing the patches, run swverify to make sure that they are installed correctly:
$ swverify PHCO_44138 PHKL_44139

REMOVING THE PATCH
------------------
a) To remove the patch, enter the following command:
# swremove -x autoreboot=true PHCO_44138 PHKL_44139

SPECIAL INSTRUCTIONS
--------------------
NONE

OTHERS
------
NONE