* * * READ ME * * *
* * * Veritas Volume Manager 7.2 * * *
* * * Patch 300 * * *
Patch Date: 2017-08-10


This document provides the following information:

* PATCH NAME
* OPERATING SYSTEMS SUPPORTED BY THE PATCH
* PACKAGES AFFECTED BY THE PATCH
* BASE PRODUCT VERSIONS FOR THE PATCH
* SUMMARY OF INCIDENTS FIXED BY THE PATCH
* DETAILS OF INCIDENTS FIXED BY THE PATCH
* INSTALLATION PRE-REQUISITES
* INSTALLING THE PATCH
* REMOVING THE PATCH


PATCH NAME
----------
Veritas Volume Manager 7.2 Patch 300


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
Solaris 11 SPARC


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
* Veritas InfoScale Foundation 7.2
* Veritas InfoScale Storage 7.2
* Veritas InfoScale Enterprise 7.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 7.2.0.300

* 3920915 (3892787) Enabling DMP (Dynamic Multipathing) native support should not import the exported zpools present on the system.
* 3920928 (3915953) Enabling dmp_native_support takes a long time to complete.
* 3920932 (3915523) A local disk from another node that belongs to a private DG (disk group) is exported to the current node when a private DG is imported on that node.
* 3920975 (3893150) The VxDMP 'vxdmpadm native ls' command sometimes does not report the pool name of imported disks.
* 3922804 (3918356) zpools are imported automatically when DMP (Dynamic Multipathing) native support is set to on, which may lead to zpool corruption.
* 3925379 (3919902) The 'vxdmpadm setattr iopolicy' switch command can fail, and standby paths are not honored by some I/O policies.

Patch ID: 7.2.0.200

* 3909992 (3898069) System panic may happen in the dmp_process_stats routine.
* 3910000 (3893756) 'vxconfigd' holds a task device for a long time; after the kernel counter wraps around, this may create a boundary issue.
* 3910426 (3868533) I/O hang because of a deadlock situation.
* 3910586 (3852146) A shared disk group (DG) fails to import when the "-c" and "-o noreonline" options are specified together.
* 3910590 (3878030) Enhance the VxVM DR tool to clean up OS and VxDMP device trees without user interaction.
* 3910591 (3867236) Application I/O hang because of a race between the Master Pause SIO (Staging I/O) and the RVWRITE1 SIO.
* 3910592 (3864063) Application I/O hang because of a race between the Master Pause SIO (Staging I/O) and the Error Handler SIO.
* 3910593 (3879324) The VxVM DR tool fails to handle the busy-device problem when LUNs are removed from the OS.
* 3912532 (3853144) A stale plex of a VxVM mirror volume is incorrectly marked as "Enable Active" after it comes back.
* 3913119 (3902769) System may panic while running vxstat in a CVM (Clustered Volume Manager) environment.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: 7.2.0.300

* 3920915 (Tracking ID: 3892787)

SYMPTOM:
Enabling DMP native support should not import the exported zpools present on the system.

DESCRIPTION:
When DMP native support is enabled through "vxdmpadm settune dmp_native_support=on", it tries to migrate and import the zpools that are currently exported on the host. This is not the expected behavior: pools that were exported before DMP native support was enabled should not be imported.

RESOLUTION:
Code changes have been made so that zpools which were exported on the host before enabling DMP native support are not imported.
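For reference, the corrected behavior can be verified with standard commands. The sequence below is only a minimal illustration; the pool name 'tank' is a placeholder for any pool that was exported before native support is enabled:

# zpool import
  (lists pools that are visible but currently exported; 'tank' appears here)
# vxdmpadm settune dmp_native_support=on
# zpool list
  (with this fix, 'tank' stays exported and does not appear among the imported pools)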
* 3920928 (Tracking ID: 3915953)

SYMPTOM:
Enabling dmp_native_support with 'vxdmpadm settune dmp_native_support=on' takes too long to complete.

DESCRIPTION:
When dmp_native_support is enabled, all zpools are imported so that they come under DMP control, and so that later imports with native support on continue to use DMP devices. The command used for this import was taking a long time; the script has been modified to reduce the time.

RESOLUTION:
Instead of searching the whole /dev/vx/dmp directory to import the zpools, they are now imported by using their specific attributes.

* 3920932 (Tracking ID: 3915523)

SYMPTOM:
A local disk from another node that belongs to a private DG is exported to the current node when a private DG is imported on that node.

DESCRIPTION:
When a DG is imported, all the disks belonging to the DG are automatically exported to the current node to make sure that the DG gets imported; this is done so that local disks behave the same way as SAN disks. Because all disks in the DG are exported, disks on another node that belong to a different private DG with the same DG name also get exported to the current node. This leads to the wrong disk being selected while the DG is imported.

RESOLUTION:
Instead of the DG name, the DGID (disk group ID) is used to decide whether a disk needs to be exported.

* 3920975 (Tracking ID: 3893150)

SYMPTOM:
The VxDMP (Veritas Dynamic Multi-Pathing) 'vxdmpadm native ls' command sometimes does not report the pool name of imported disks.

DESCRIPTION:
When a Solaris pool is imported with extra options such as -d or -R, the paths shown in 'zpool status' can be full disk paths. The 'vxdmpadm native ls' command does not handle this situation and therefore fails to report the pool name.

RESOLUTION:
Code changes have been made to correctly handle full disk paths so that the pool name is reported.

* 3922804 (Tracking ID: 3918356)

SYMPTOM:
zpools are imported automatically when DMP (Dynamic Multipathing) native support is set to on, which may lead to zpool corruption.

DESCRIPTION:
When DMP native support is set to on, all zpools are imported using DMP devices so that subsequent imports of the same zpool automatically use the DMP device. In a clustered environment, if the import of the same zpool is triggered on two different nodes at the same time, it can lead to zpool corruption. A way needs to be provided so that zpools are not imported automatically.

RESOLUTION:
Changes have been made to provide a way for customers to prevent the automatic import of zpools, if required, by setting the variable auto_import_exported_pools to off in the file /var/adm/vx/native_input, as shown below:

bash:~# cat /var/adm/vx/native_input
auto_import_exported_pools=off

* 3925379 (Tracking ID: 3919902)

SYMPTOM:
The VxVM (Veritas Volume Manager) 'vxdmpadm setattr iopolicy' switch command may not work. When the issue happens, 'vxdmpadm setattr iopolicy' finishes without any error, but a subsequent 'vxdmpadm getattr' command shows that the iopolicy is not updated:

# vxdmpadm getattr enclosure emc-vplex0 iopolicy
ENCLR_NAME     DEFAULT        CURRENT
============================================
emc-vplex0     Balanced       Balanced
# vxdmpadm setattr arraytype VPLEX-A/A iopolicy=minimumq
# vxdmpadm getattr enclosure emc-vplex0 iopolicy
ENCLR_NAME     DEFAULT        CURRENT
============================================
emc-vplex0     Balanced       Balanced

Also, standby paths are not honored by some iopolicies (for example, the balanced iopolicy); read/write I/Os are seen against standby paths in 'vxdmpadm iostat' output.
DESCRIPTION:
The array's iopolicy field becomes stale when the 'vxdmpadm setattr arraytype iopolicy' command is used; when the iopolicy is later set back to the stale value, the change does not actually take effect. Also, when paths are evaluated for issuing I/Os, the standby flag is not taken into consideration, so standby paths are used for read/write I/Os.

RESOLUTION:
Code changes have been made to address these issues.

Patch ID: 7.2.0.200

* 3909992 (Tracking ID: 3898069)

SYMPTOM:
A system panic may happen in the dmp_process_stats routine with the following stack:

dmp_process_stats+0x471/0x7b0
dmp_daemons_loop+0x247/0x550
kthread+0xb4/0xc0
ret_from_fork+0x58/0x90

DESCRIPTION:
When the pending I/Os per DMP path are aggregated over all CPUs, an out-of-bounds access occurs because of a wrong index into the statistics table, which can cause a system panic.

RESOLUTION:
Code changes have been made to correct the wrong index.

* 3910000 (Tracking ID: 3893756)

SYMPTOM:
Under certain circumstances, after vxconfigd has been running for a long time, a task may be left dangling in the system. It can be seen by issuing 'vxtask -l list'.

DESCRIPTION:
- voltask_dump() gets a task ID by calling 'vol_task_dump' in the kernel (ioctl) as the minor number of the task device.
- The task ID (or minor number) increases by 1 when a new task is registered.
- Task IDs start from 160 and wrap around when they reach 65536. A global counter 'vxtask_next_minor' indicates the next task ID.
- When vxconfigd opens a task device by calling voltask_dump() and holds it, it also gets a task ID (say 165). From then on, a vnode with this minor number (major=273, minor=165) exists in the kernel.
- As time goes by, the task ID increases, reaches 65536, then wraps around and starts from 160 again.
- When the task ID reaches 165 again through a CLI command (say 'vxdisk -o thin,fssize list'), its task device gets the same major and minor number (165) as vxconfigd's.
- At the same time, vxconfigd is still holding this vnode. vxdisk does not know this; it opens the task device and registers a task structure in the kernel hash table, which adds a reference to the same vnode that vxconfigd is holding. The reference count of the common snode is now 2.
- When vxdisk (fsusage_collect_stats_task) has done its job, it calls voltask_complete->close()->spec_close(), trying to remove this task (165). But the OS function spec_close() (from specfs) gets in the way: it checks the reference count of the common snode (vnode->v_data->snode->s_commonvp->v_data->common snode). spec_close() finds that the value of s_count is 2, so it only drops the reference by one and returns success to the caller, without calling the actual closing function 'volsclose()'.
- Because volsclose() is not called by spec_close(), its subsequent functions are not called either: volsclose_real()->voltask_close()->vxtask_rm_task(). Among these, vxtask_rm_task() does the actual job of removing a task from the kernel hash table.
- After calling close(), fsusage_collect_stats_task returns and the vxdisk command exits. From this point on, the task is left dangling in the kernel hash table until vxconfigd exits.

RESOLUTION:
The source has been changed so that vxconfigd does not hold the task device.

* 3910426 (Tracking ID: 3868533)

SYMPTOM:
An I/O hang happens when starting replication.
The VxIO daemon hangs with a stack like the following:

vx_cfs_getemap at ffffffffa035e159 [vxfs]
vx_get_freeexts_ioctl at ffffffffa0361972 [vxfs]
vxportalunlockedkioctl at ffffffffa06ed5ab [vxportal]
vxportalkioctl at ffffffffa06ed66d [vxportal]
vol_ru_start at ffffffffa0b72366 [vxio]
voliod_iohandle at ffffffffa09f0d8d [vxio]
voliod_loop at ffffffffa09f0fe9 [vxio]

DESCRIPTION:
While performing DCM replay with the SmartMove feature enabled, the VxIO kernel needs to issue an ioctl to the VxFS kernel to get the file system free region. To complete this ioctl, the VxFS kernel needs to clone the map by issuing I/O to the VxIO kernel. If an RLINK disconnection happens at just that time, the RV is serialized to complete the disconnection. Because the RV is serialized, all I/Os, including the clone-map I/O from VxFS, are queued to rv_restartq, hence the deadlock.

RESOLUTION:
Code changes have been made to handle the deadlock situation.

* 3910586 (Tracking ID: 3852146)

SYMPTOM:
In a CVM cluster, when importing a shared disk group specifying both the -c and -o noreonline options, the following error may be returned:

VxVM vxdg ERROR V-5-1-10978 Disk group : import failed: Disk for disk group not found.

DESCRIPTION:
The -c option updates the disk ID and disk group ID in the private region of the disks in the disk group being imported. The updated information is not yet seen by the slave because the disks have not been re-onlined (given that the noreonline option is specified). As a result, the slave cannot identify the disk(s) based on the updated information sent from the master, causing the import to fail with the error "Disk for disk group not found".

RESOLUTION:
The code is modified so that the "-c" and "-o noreonline" options work together.

* 3910590 (Tracking ID: 3878030)

SYMPTOM:
Enhance the VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool to clean up OS and VxDMP (Veritas Dynamic Multi-Pathing) device trees without user interaction.

DESCRIPTION:
When users add or remove LUNs, stale entries in the OS or VxDMP device trees can prevent VxVM from discovering the changed LUNs correctly. Under certain conditions this even causes the VxVM vxconfigd process to dump core, and users have to reboot the system to restart vxconfigd. VxVM provides a DR tool to help users add or remove LUNs properly, but it requires user input during operations.

RESOLUTION:
The VxVM DR tool has been enhanced to accept an '-o refresh' option, which cleans up the OS and VxDMP device trees without user interaction.

* 3910591 (Tracking ID: 3867236)

SYMPTOM:
An application I/O hang happens after issuing the Master Pause command.

DESCRIPTION:
The flag VOL_RIFLAG_REQUEST_PENDING in the VVR (Veritas Volume Replicator) kernel is not cleared because of a race between the Master Pause SIO (Staging I/O) and the RVWRITE1 SIO, which prevents the RU (Replication Update) SIO from proceeding and causes the I/O hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* 3910592 (Tracking ID: 3864063)

SYMPTOM:
An application I/O hang happens after issuing the Master Pause command.

DESCRIPTION:
Some flags (VOL_RIFLAG_DISCONNECTING or VOL_RIFLAG_REQUEST_PENDING) in the VVR (Veritas Volume Replicator) kernel are not cleared because of a race between the Master Pause SIO (Staging I/O) and the Error Handler SIO, which prevents the RU (Replication Update) SIO from proceeding and causes the I/O hang.

RESOLUTION:
Code changes have been made to handle the race condition.
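For context, the Master Pause operation referred to in the two incidents above is normally driven from the VVR command line. The following is only a minimal illustration, assuming an existing replication setup; the disk group name 'dg1' and RVG name 'rvg1' are placeholders:

# vradmin -g dg1 pauserep rvg1
# vradmin -g dg1 resumerep rvg1

With these fixes, application I/O to the replicated volumes should continue to make progress across the pause and resume.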
* 3910593 (Tracking ID: 3879324)

SYMPTOM:
The VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool fails to handle the busy-device problem while LUNs are removed from the OS.

DESCRIPTION:
OS devices may still be busy after being removed from the OS. This causes the 'luxadm -e offline' operation to fail and leaves stale entries in the 'vxdisk list' output, such as:

emc0_65535   auto   -   -   error
emc0_65536   auto   -   -   error

RESOLUTION:
Code changes have been made to address the busy-device issue.

* 3912532 (Tracking ID: 3853144)

SYMPTOM:
A stale plex of a VxVM (Veritas Volume Manager) mirror volume is incorrectly marked as "Enable Active" after it comes back, which prevents the stale plex from being resynchronized from the up-to-date plexes. This can cause data corruption if the stale plex happens to be the preferred or selected plex, or if the read policy "round" is set for the volume.

DESCRIPTION:
When a volume plex is detached abruptly while vxconfigd is unavailable, VxVM kernel logging records the detach activity along with its detach transaction ID for future resync or recovery. Because of a code defect, the wrong detach transaction ID can be selected under certain situations.

RESOLUTION:
Code changes have been made to correctly select the detach transaction ID.

* 3913119 (Tracking ID: 3902769)

SYMPTOM:
While running vxstat in a CVM (Clustered Volume Manager) environment, the system may panic with the following stack:

machine_kexec
__crash_kexec
crash_kexec
oops_end
no_context
do_page_fault
page_fault
[exception RIP: vol_cvm_io_stats_common+143]
__wake_up_common
__wake_up_sync_key
unix_destruct_scm
__alloc_pages_nodemask
volinfo_ioctl
volsioctl_real
vols_ioctl at vols_unlocked_ioctl
do_vfs_ioctl
sys_ioctl
entry_SYSCALL_64_fastpath

DESCRIPTION:
The system panic occurs because, while fetching I/O statistics from the VxVM (Veritas Volume Manager) kernel, an illegal address is accessed in I/O statistics data that has not yet been populated.

RESOLUTION:
The code is fixed to correctly populate the address in the I/O statistics data before the statistics are fetched.


INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch, perform the following steps on at least one node in the cluster:

1. Copy the patch vm-sol11_sparc-Patch-7.2.0.300.tar.gz to /tmp
2. Untar vm-sol11_sparc-Patch-7.2.0.300.tar.gz to /tmp/hf
   # mkdir /tmp/hf
   # cd /tmp/hf
   # gunzip /tmp/vm-sol11_sparc-Patch-7.2.0.300.tar.gz
   # tar xf /tmp/vm-sol11_sparc-Patch-7.2.0.300.tar
3. Install the hotfix (please note that the installation of this P-Patch will cause downtime):
   # pwd
   /tmp/hf
   # ./installVRTSvxvm720P300 [ ...]

You can also install this patch together with the 7.2 base release using Install Bundles:
1. Download this patch and extract it to a directory.
2. Change to the Veritas InfoScale 7.2 directory and invoke the installer script with the -patch_path option, where -patch_path should point to the patch directory:
   # ./installer -patch_path [] [ ...]

Install the patch manually:
--------------------------
o Before-the-upgrade :-
  (a) Stop applications using any VxVM volumes.
  (b) Stop I/Os to all the VxVM volumes.
  (c) Unmount any file systems with VxVM volumes.
  (d) In case of multiple boot environments, boot using the BE you wish to install the patch on (see the example after these steps).

For the Solaris 11 release, refer to the man pages for instructions on using the install and uninstall options of the 'pkg' command provided with Solaris.
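As an illustration for step (d) above, on Solaris 11 the boot environments can be listed and the target one activated with the beadm utility before rebooting. This is only a sketch; the boot environment name 'solaris-be1' is a placeholder, and the beadm man page should be consulted for details:

   # beadm list
   # beadm activate solaris-be1
   # init 6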
Any other special or non-generic installation instructions are described below under special instructions.

The following example installs the patch on a standalone machine:

example# pkg install --accept -g /patch_location/VRTSvxvm.p5p VRTSvxvm

After 'pkg install', please follow the mandatory configuration steps mentioned in the special instructions below.


REMOVING THE PATCH
------------------
For Solaris 11.1 or later, if DMP native support is enabled, DMP controls the ZFS root pool. Turn off native support before removing the patch:

# vxdmpadm settune dmp_native_support=off

Note: If you do not disable native support, the system cannot be restarted after you remove DMP.

The following example removes the patch from a standalone system:

example# pkg uninstall VRTSvxvm

Note: Uninstalling the patch removes the entire package. If you need an earlier version of the package, install it from the original source media.


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE