                          * * * READ ME * * *
             * * * Veritas Storage Foundation HA 6.0.5 * * *
                        * * * Patch 6.0.5.100 * * *
                         Patch Date: 2014-11-20


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas Storage Foundation HA 6.0.5 Patch 6.0.5.100


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL6 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSaslapm
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Symantec VirtualStore 6.0.1
   * Veritas Dynamic Multi-Pathing 6.0.1
   * Veritas Storage Foundation 6.0.1
   * Veritas Storage Foundation Cluster File System HA 6.0.1
   * Veritas Storage Foundation for Oracle RAC 6.0.1
   * Veritas Storage Foundation HA 6.0.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-6.0.500.100

* 3496715 (3281004) With the DMP minimum queue I/O policy and a large number of CPUs, a couple of issues are observed.
* 3501358 (3399323) The reconfiguration of the Dynamic Multi-Pathing (DMP) database fails.
* 3502662 (3380481) When you select a removed disk during the "5 Replace a failed or removed disk" operation, the vxdiskadm(1M) command displays an error message.
* 3521727 (3521726) The system panics because an IOHINT structure is freed twice.
* 3526501 (3526500) Disk I/O failures occur with DMP I/O timeout error messages when the DMP (Dynamic Multi-Pathing) I/O statistics daemon is not running.
* 3531332 (3077582) A Veritas Volume Manager (VxVM) volume may become inaccessible, causing read/write operations to fail.
* 3552411 (3482026) The vxattachd(1M) daemon reattaches the plexes of a manually detached site.
* 3600161 (3599977) During replica connection, referencing a port that was already deleted in another thread caused a system panic.
* 3612801 (3596330) The 'vxsnap refresh' operation fails with the "Transaction aborted waiting for io drain" error.
* 3621240 (3621232) The vradmin ibc command cannot be started or executed on the VVR (Veritas Volume Replicator) Secondary.
* 3622069 (3513392) A reference to a replication port that was already deleted caused a panic.
* 3632969 (3631230) VRTSvxvm patch versions 6.0.5 and 6.1.1 or earlier do not work with the RHEL6.6 update.
* 3638039 (3625890) Resizing a CDS disk with vxdisk fails with the "Invalid attribute specification" error.

Patch ID: VRTSaslapm-6.0.500.300

* 3665733 (3665727) The array I/O policy is set to Single-active for SF 6.1.1 with RHEL6.6.

NOTE: Please refer to the product-level readme files for incidents fixed in previous patches.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: VRTSvxvm-6.0.500.100

* 3496715 (Tracking ID: 3281004)

SYMPTOM:
With the DMP minimum queue I/O policy and a large number of CPUs, the following issues have been observed since the VxVM 5.1 SP1 release:
1. CPU usage is high.
2. I/O throughput drops when there are many concurrent I/Os.

DESCRIPTION:
Earlier, the minimum queue I/O policy considered only the host controller I/O load to select the least-loaded path. In VxVM 5.1 SP1, it was extended to also consider the I/O load of the underlying paths of the selected host-based controllers.
However, this resulted in performance issues because of lock contention between the I/O processing functions and the DMP statistics daemon.

RESOLUTION:
The code is modified so that the I/O load of the host controller paths is no longer considered, which avoids the lock contention.

* 3501358 (Tracking ID: 3399323)

SYMPTOM:
The reconfiguration of the Dynamic Multi-Pathing (DMP) database fails with the following error:
VxVM vxconfigd DEBUG V-5-1-0 dmp_do_reconfig: DMP_RECONFIGURE_DB failed: 2

DESCRIPTION:
During the DMP database reconfiguration process, controller information is not removed from the DMP user-land database even though it is removed from the DMP kernel database. This creates an inconsistency between the user-land and kernel DMP databases, and subsequent DMP reconfigurations fail with the above error.

RESOLUTION:
The code is modified to properly remove the controller information from the user-land DMP database.

* 3502662 (Tracking ID: 3380481)

SYMPTOM:
When the vxdiskadm(1M) command selects a removed disk during the "5 Replace a failed or removed disk" operation, the vxdiskadm(1M) command displays the following error message:
"/usr/lib/vxvm/voladm.d/lib/vxadm_syslib.sh: line 2091: return: -1: invalid option"

DESCRIPTION:
Starting with bash version 4.0, bash does not accept negative values passed to the return builtin. When VxVM scripts return negative values to bash, the above error message is displayed.

RESOLUTION:
The code is modified so that VxVM scripts do not return negative values to bash.

* 3521727 (Tracking ID: 3521726)

SYMPTOM:
When the Symantec Replication Option is used, a system panic occurs while freeing memory, with the following stack trace on AIX:

pvthread+011500 STACK:
[0001BF60]abend_trap+000000 ()
[000C9F78]xmfree+000098 ()
[04FC2120]vol_tbmemfree+0000B0 ()
[04FC2214]vol_memfreesio_start+00001C ()
[04FCEC64]voliod_iohandle+000050 ()
[04FCF080]voliod_loop+0002D0 ()
[04FC629C]vol_kernel_thread_init+000024 ()
[0025783C]threadentry+00005C ()

DESCRIPTION:
In certain scenarios, when a write I/O is throttled or unwound in VVR, the memory associated with one of the related data structures is freed. When the I/O is restarted, the same memory is illegally accessed and freed a second time, which causes the system panic.

RESOLUTION:
Code changes have been made to fix the illegal memory access.

* 3526501 (Tracking ID: 3526500)

SYMPTOM:
Disk I/O failures occur with DMP I/O timeout error messages when the DMP (Dynamic Multi-Pathing) I/O statistics daemon is not running. The following are the timeout error messages:

VxVM vxdmp V-5-3-0 I/O failed on path 65/0x40 after 1 retries for disk 201/0x70
VxVM vxdmp V-5-3-0 Reached DMP Threshold IO TimeOut (100 secs) I/O with start 3e861909fa0 and end 3e86190a388 time
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 201/0x70

DESCRIPTION:
When an I/O is submitted to DMP, DMP sets the start time on the I/O buffer. The value of the start time depends on whether the DMP I/O statistics daemon is running. When the I/O is returned as an error from SCSI to DMP, instead of retrying the I/O on alternate paths, DMP fails it with a 300-second timeout error even though the I/O has been in flight for only a few milliseconds. This miscalculation of the DMP timeout happens only when the DMP I/O statistics daemon is not running.

RESOLUTION:
The code is modified to calculate an appropriate DMP I/O timeout value when the DMP I/O statistics daemon is not running.
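NOTE: The DMP I/O statistics daemon referenced in the incident above is controlled with the vxdmpadm iostat subcommands. The following sequence is an illustrative sketch for checking and enabling statistics collection; it is not part of the fix itself:

   # vxdmpadm iostat start      (start DMP I/O statistics collection)
   # vxdmpadm iostat show all   (display I/O statistics for all paths)
   # vxdmpadm iostat stop       (stop DMP I/O statistics collection)
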
* 3531332 (Tracking ID: 3077582)

SYMPTOM:
A Veritas Volume Manager (VxVM) volume may become inaccessible, causing read/write operations to fail with the following error:

# dd if=/dev/vx/dsk/<dg>/<volume> of=/dev/null count=10
dd read error: No such device
0+0 records in
0+0 records out

DESCRIPTION:
If I/Os to the disks time out because of hardware failures such as a weak Storage Area Network (SAN) cable link or a Host Bus Adapter (HBA) failure, VxVM assumes that the disk is faulty or slow and sets the failio flag on the disk. Because of this flag, all subsequent I/Os fail with the "No such device" error.

RESOLUTION:
The code is modified so that vxdisk now provides a way to clear the failio flag. To check whether the failio flag is set on a disk, use the vxkprint(1M) utility (under /etc/vx/diag.d). To reset the failio flag, execute the "vxdisk set <disk_name> failio=off" command, or deport and re-import the disk group that holds the affected disks.

* 3552411 (Tracking ID: 3482026)

SYMPTOM:
The vxattachd(1M) daemon reattaches the plexes of a manually detached site.

DESCRIPTION:
The vxattachd daemon reattaches plexes for a manually detached site, that is, a site whose state is OFFLINE. There was no check to differentiate between a manually detached site and a site detached due to I/O failure, so the vxattachd(1M) daemon brought the plexes online for manually detached sites as well.

RESOLUTION:
The code is modified to differentiate between a manually detached site and a site detached due to I/O failure.

* 3600161 (Tracking ID: 3599977)

SYMPTOM:
The system panics with the following stack trace:

[00009514].simple_lock+000014
[004E4C48]soereceive+001AA8
[00504770]soreceive+000010
[00014F50].kernel_add_gate_cstack+000030
[04F6992C]kmsg_sys_rcv+000070
[04F89448]nmcom_get_next_mblk+0000A4
[04F63618]nmcom_get_next_msg+0000F0
[04F64434]nmcom_wait_msg_tcp+00015C
[04F7A71C]nmcom_server_proc_tcp+00006C
[04F7C7AC]nmcom_server_proc_enter+0000A4
[04D88AF4]vxvm_start_thread_enter+00005C

DESCRIPTION:
During replica connection, after the port is created but before its reference count is increased to protect it from deletion, another thread deletes the port. When the replica connection thread proceeds, it references the port that was already deleted, which causes a NULL pointer dereference and the panic.

RESOLUTION:
Code changes have been made to add locking around the check and modification of the reference count associated with the port.

* 3612801 (Tracking ID: 3596330)

SYMPTOM:
The 'vxsnap refresh' operation fails with the following symptoms.

Errors occur on the DR (Disaster Recovery) site of VVR (Veritas Volume Replicator):
o vxio: [ID 160489 kern.notice] NOTICE: VxVM vxio V-5-3-1576 commit: Timedout waiting for rvg [RVG] to quiesce, iocount [PENDING_COUNT] msg 0
o vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-8011 Internal transaction failed: Transaction aborted waiting for io drain

At the same time, the following errors occur on the Primary site of VVR:
vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink [RLINK] disconnecting due to ack timeout on update message

DESCRIPTION:
VM (Volume Manager) transactions on the DR site are aborted because pending I/Os cannot be drained in the stipulated time, leading to the failure of FMR (Fast Mirror Resync) 'snap' operations. These I/Os cannot be drained because of I/O throttling. A race condition, in conjunction with timing in VVR, causes the throttling condition to be left uncleared.

RESOLUTION:
Code changes have been made to fix the race condition and ensure that the throttling state is cleared at the appropriate time.
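NOTE: For readers unfamiliar with the operation named in the incident above, a typical invocation of the refresh operation is shown below. The disk group, snapshot volume, and source volume names are illustrative placeholders only, not values taken from this incident:

   # vxsnap -g datadg refresh snapvol source=datavol   (resynchronize the snapshot volume from its source volume)
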
* 3621240 (Tracking ID: 3621232)

SYMPTOM:
When an IBC (In-Band Control) procedure is run through the vradmin ibc command, vradmind (the VVR daemon) on the VVR Secondary may go into the disconnected state. Subsequent IBC procedures or vradmin ibc commands then cannot be started or executed on the VVR Secondary, and the following messages are output on the VVR Primary:

VxVM VVR vradmin ERROR V-5-52-532 Secondary is undergoing a state transition. Please re-try the command after some time.
VxVM VVR vradmin ERROR V-5-52-802 Cannot start command execution on Secondary.

DESCRIPTION:
When an IBC procedure reaches the command-finish state, vradmind on the VVR Secondary may go into the disconnected state without this being detected by vradmind on the Primary. In that situation, vradmind on the Primary does not send the handshake request that would take the Secondary from the disconnected state back to the running state. vradmind on the Secondary therefore stalls in the disconnected state, and subsequent vradmin ibc commands cannot be executed there, even though vradmind appears normal and in the running state on the VVR Primary.

RESOLUTION:
Code changes have been made so that vradmind on the VVR Primary is notified when vradmind on the VVR Secondary goes into the disconnected state, allowing the Primary to send a handshake request that takes the Secondary out of the disconnected state.

* 3622069 (Tracking ID: 3513392)

SYMPTOM:
The Secondary panics when it is rebooted while heavy I/Os are in progress on the Primary:

PID: 18862  TASK: ffff8810275ff500  CPU: 0  COMMAND: "vxiod"
 #0 [ffff880ff3de3960] machine_kexec at ffffffff81035b7b
 #1 [ffff880ff3de39c0] crash_kexec at ffffffff810c0db2
 #2 [ffff880ff3de3a90] oops_end at ffffffff815111d0
 #3 [ffff880ff3de3ac0] no_context at ffffffff81046bfb
 #4 [ffff880ff3de3b10] __bad_area_nosemaphore at ffffffff81046e85
 #5 [ffff880ff3de3b60] bad_area_nosemaphore at ffffffff81046f53
 #6 [ffff880ff3de3b70] __do_page_fault at ffffffff810476b1
 #7 [ffff880ff3de3c90] do_page_fault at ffffffff8151311e
 #8 [ffff880ff3de3cc0] page_fault at ffffffff815104d5
 #9 [ffff880ff3de3d78] volrp_sendsio_start at ffffffffa0af07e3 [vxio]
#10 [ffff880ff3de3e08] voliod_iohandle at ffffffffa09991be [vxio]
#11 [ffff880ff3de3e38] voliod_loop at ffffffffa0999419 [vxio]
#12 [ffff880ff3de3f48] kernel_thread at ffffffff8100c0ca

DESCRIPTION:
If the replication stage I/Os are started after serialization of the replica volume, the replication port can be deleted and set to NULL while the replica connection changes are being handled. Because the code did not check whether the replication port was still valid before referencing it, this causes the panic.

RESOLUTION:
Code changes have been made to abort the stage I/O if the replication port is NULL.

* 3632969 (Tracking ID: 3631230)

SYMPTOM:
VRTSvxvm patch versions 6.0.5 and 6.1.1 do not work with the RHEL6.6 update:

# rpm -ivh VRTSvxvm-6.1.1.000-GA_RHEL6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:VRTSvxvm               ########################################### [100%]
Installing file /etc/init.d/vxvm-boot
creating VxVM device nodes under /dev
WARNING: No modules found for 2.6.32-494.el6.x86_64, using compatible modules for 2.6.32-71.el6.x86_64.
FATAL: Error inserting vxio (/lib/modules/2.6.32-494.el6.x86_64/veritas/vxvm/vxio.ko): Unknown symbol in module, or unknown parameter (see dmesg)
ERROR: modprobe error for vxio. See documentation.
warning: %post(VRTSvxvm-6.1.1.000-GA_RHEL6.x86_64) scriptlet failed, exit status 1
#

Or, after the OS update, the system log file has the following messages logged:
vxio: disagrees about version of symbol poll_freewait
vxio: Unknown symbol poll_freewait
vxio: disagrees about version of symbol poll_initwait
vxio: Unknown symbol poll_initwait

DESCRIPTION:
Installation of VRTSvxvm patch versions 6.0.5 and 6.1.1 fails on RHEL6.6 because of changes in the poll_initwait() and poll_freewait() kernel interfaces.

RESOLUTION:
The VxVM package has been recompiled in the RHEL6.6 build environment.

* 3638039 (Tracking ID: 3625890)

SYMPTOM:
The following error message is reported after running the vxdisk resize command:
"VxVM vxdisk ERROR V-5-1-8643 Device ibm_shark0_9: resize failed: Invalid attribute specification"

DESCRIPTION:
For CDS (Cross-platform Data Sharing) VTOC (Volume Table Of Contents) disks, two cylinders are reserved for special usage. When a disk is expanded to a particular size on the storage side, VxVM (Veritas Volume Manager) may calculate the cylinder count as 2, which causes vxdisk resize to fail with "Invalid attribute specification".

RESOLUTION:
Code changes have been made to avoid the failure when resizing a CDS VTOC disk.

Patch ID: VRTSaslapm-6.0.500.300

* 3665733 (Tracking ID: 3665727)

SYMPTOM:
The vxdmpadm listapm output does not list any APMs other than the default ones:

[root@rpms]# vxdmpadm listapm
Filename           APM Name         APM Version  Array Types       State
================================================================================
dmpjbod.ko         dmpjbod          1            Disk              Active
dmpjbod.ko         dmpjbod          1            APdisk            Active
dmpalua.ko         dmpalua          1            ALUA              Not-Active
dmpaaa.ko          dmpaaa           1            A/A-A             Not-Active
dmpapg.ko          dmpapg           1            A/PG              Not-Active
dmpapg.ko          dmpapg           1            A/PG-C            Not-Active
dmpaa.ko           dmpaa            1            A/A               Active
dmpap.ko           dmpap            1            A/P               Active
dmpap.ko           dmpap            1            A/P-C             Active
dmpapf.ko          dmpapf           1            A/PF-VERITAS      Not-Active
dmpapf.ko          dmpapf           1            A/PF-T3PLUS       Not-Active
[root@rpms]#

DESCRIPTION:
To support the RHEL6.6 update, the DMP module was recompiled with the latest RHEL6.6 kernel version. During post-installation of the package, the APM modules fail to load because of a kernel-version mismatch between the DMP module and the additional APM modules.

RESOLUTION:
The ASLAPM package has been recompiled with the RHEL6.6 kernel.


INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-------------------------------------------------------------
To install the patch, perform the following steps on at least one node in the cluster:

1. Stop applications that use any VxVM volumes.
2. Copy the patch sfha-rhel6.6_x86_64-6.0.5.100-rpms.tar.gz to /tmp.
3. Untar sfha-rhel6.6_x86_64-6.0.5.100-rpms.tar.gz to /tmp/hf:
     # mkdir /tmp/hf
     # cd /tmp/hf
     # gunzip /tmp/sfha-rhel6.6_x86_64-6.0.5.100-rpms.tar.gz
     # tar xf /tmp/sfha-rhel6.6_x86_64-6.0.5.100-rpms.tar
4. Install the hotfix:
     # pwd
     /tmp/hf
     # ./installSFHA605P1 [ ...]

You can also install this patch together with the 6.0.1 GA release and the 6.0.5 patch release:
     # ./installSFHA605P1 -base_path [<601 path>] -mr_path [<605 path>] [ ...]
where -mr_path should point to the 6.0.5 image directory and -base_path to the 6.0.1 image directory.

Install the patch manually:
---------------------------
o Before the upgrade:
  (a) Stop I/Os to all VxVM volumes.
  (b) Unmount any file systems residing on VxVM volumes.
  (c) Stop applications that use any VxVM volumes.
o Select the appropriate RPMs for your system and upgrade to the new patch:
  # rpm -Uhv VRTSvxvm-6.0.500.100-RHEL6.x86_64.rpm VRTSaslapm-6.0.500.300-GA_RHEL6.x86_64.rpm

An illustrative post-installation check appears at the end of this file.


REMOVING THE PATCH
------------------
# rpm -e


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE
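
NOTE: The following post-installation check is an illustrative sketch only and is not part of the official installation procedure. It confirms the installed patch package versions and that the VxVM kernel modules and configuration daemon are running:

   # rpm -qa | grep -E 'VRTSvxvm|VRTSaslapm'   (confirm the installed patch package versions)
   # lsmod | grep vx                           (confirm that the VxVM kernel modules are loaded)
   # vxdctl mode                               (confirm that vxconfigd is running in enabled mode)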