vm-aix-Patch-7.2.0.200

 Basic information
Release type: Patch
Release date: 2017-06-08
OS update support: None
Technote: None
Documentation: None
Download size: 100.02 MB
Checksum: 3668068841

 Applies to one or more of the following products:
InfoScale Enterprise 7.2 On AIX 7.1
InfoScale Foundation 7.2 On AIX 7.1
InfoScale Storage 7.2 On AIX 7.1

 Obsolete patches, incompatibilities, superseded patches, or other requirements:
None.

 Fixes the following incidents:
3909992, 3910000, 3910426, 3910586, 3910590, 3910591, 3910592, 3910593, 3912532

 Patch ID:
VRTSvxvm.bff

Readme file
                          * * * READ ME * * *
                 * * * Veritas Volume Manager 7.2 * * *
                         * * * Patch 200 * * *
                         Patch Date: 2017-05-29


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas Volume Manager 7.2 Patch 200


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
AIX


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas InfoScale Foundation 7.2
   * Veritas InfoScale Storage 7.2
   * Veritas InfoScale Enterprise 7.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 7.2.0.200
* 3909992 (3898069) System panic may happen in dmp_process_stats routine.
* 3910000 (3893756) 'vxconfigd' holds a task device for a long time; after the kernel counter wraps around, this may create a boundary issue.
* 3910426 (3868533) IO hang happens because of a deadlock situation.
* 3910586 (3852146) Shared DiskGroup(DG) fails to import when "-c" and "-o noreonline" options 
are specified together.
* 3910590 (3878030) Enhance VxVM DR tool to clean up OS and VxDMP device trees without user 
interaction.
* 3910591 (3867236) Application IO hang happens because of a race between Master Pause SIO(Staging IO) 
and RVWRITE1 SIO.
* 3910592 (3864063) Application IO hang happens because of a race between Master Pause SIO(Staging IO) 
and Error Handler SIO.
* 3910593 (3879324) VxVM DR tool fails to handle busy device problem while LUNs are removed from OS.
* 3912532 (3853144) VxVM mirror volume's stale plex is incorrectly marked as "Enable Active" after 
it comes back.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: 7.2.0.200

* 3909992 (Tracking ID: 3898069)

SYMPTOM:
System panic may happen in dmp_process_stats routine with the following stack:

dmp_process_stats+0x471/0x7b0 
dmp_daemons_loop+0x247/0x550 
kthread+0xb4/0xc0
ret_from_fork+0x58/0x90

DESCRIPTION:
When the pending IOs per DMP path are aggregated over all CPUs, a wrong index 
into the statistics table causes an out-of-bounds access, which can cause a 
system panic.

RESOLUTION:
Code changes have been done to correct the wrong index.
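
The class of defect described above can be illustrated with a small sketch. It
is not the DMP kernel code: the table layout (one row per CPU, one column per
path), the function names, and the swapped-index mistake are all assumptions
chosen for illustration, since the readme only says that a wrong index was used.
In Python the bad access raises IndexError; in kernel code the same mistake
would read or write outside the table.

```python
from typing import List


def aggregate_pending(per_cpu: List[List[int]], num_paths: int) -> List[int]:
    """Sum pending I/O counts per path across all CPUs.

    per_cpu[cpu][path] is the hypothetical layout: one row per CPU,
    one column per DMP path.
    """
    totals = [0] * num_paths
    for cpu_row in per_cpu:
        for path in range(num_paths):
            totals[path] += cpu_row[path]
    return totals


def aggregate_pending_swapped(per_cpu: List[List[int]],
                              num_cpus: int, num_paths: int) -> List[int]:
    """Same loop with the two indices swapped (hypothetical bug).

    When num_paths > num_cpus, per_cpu[path] runs past the last row.
    """
    totals = [0] * num_paths
    for cpu in range(num_cpus):
        for path in range(num_paths):
            totals[path] += per_cpu[path][cpu]  # wrong: row index is a path number
    return totals


if __name__ == "__main__":
    stats = [[1, 2, 3], [4, 5, 6]]  # 2 CPUs, 3 paths
    print(aggregate_pending(stats, num_paths=3))
    try:
        aggregate_pending_swapped(stats, num_cpus=2, num_paths=3)
    except IndexError as exc:
        print("out-of-bounds access:", exc)
```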

* 3910000 (Tracking ID: 3893756)

SYMPTOM:
Under certain circumstances, after vxconfigd has been running for a long time, a task might be left dangling in the system. It can be seen by issuing 'vxtask -l list'.

DESCRIPTION:
- voltask_dump() gets a task id by calling 'vol_task_dump' in the kernel (ioctl); the task id is the minor number of the taskdev.
- The task id (or minor number) increases by 1 when a new task is registered.
- The task id starts from 160 and wraps around when it reaches 65536. A global counter 'vxtask_next_minor' indicates the next task id.
- When vxconfigd opens a taskdev by calling voltask_dump() and holds it, it gets a task id too (say 165). From then on, a vnode with
  this minor number (major=273, minor=165) exists in the kernel.
- As time goes by, the task id increases until it reaches 65536, then wraps around and starts from 160 again.
- When the task id reaches 165 again for a CLI command (say 'vxdisk -othin,fssize list'), that command's taskdev gets the same major
  and minor number (165) as vxconfigd's.
- vxconfigd is still holding this vnode. vxdisk does not know this; it opens the taskdev and registers a task structure in the
  kernel hash table. This adds a reference to the same vnode that vxconfigd is holding, so the reference count of the common snode is now 2.
- When vxdisk (fsusage_collect_stats_task) has done its job, it calls voltask_complete->close()->spec_close() to remove this task
  (165). However, the OS function spec_close() (from specfs) checks the reference count of the common snode
  (vnode->v_data->snode->s_commonvp->v_data->common snode). It finds that s_count is 2, so it only drops the reference by one
  and returns success to the caller without calling the actual close function 'volsclose()'.
- Because spec_close() does not call volsclose(), its subsequent functions are not called either: volsclose_real()->voltask_close()
  ->vxtask_rm_task(). Among those, vxtask_rm_task() does the actual job of removing a task from the kernel hash table.
- After calling close(), fsusage_collect_stats_task returns and the vxdisk command exits. From this point on, the task is dangling
  in the kernel hash table until vxconfigd exits.
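
The sequence above can be replayed with a small user-space simulation. This is
only a sketch: the TaskDevSim class, its method names, and the choice that
vxconfigd's held open does not itself register a task are assumptions made for
illustration. Only the constants 160 and 65536, the example id 165, and the
"last reference runs the driver close" behaviour come from the description.

```python
FIRST_MINOR = 160      # first task id, per the description
MINOR_LIMIT = 65536    # counter wraps back to FIRST_MINOR on reaching this


class TaskDevSim:
    """Toy model of the task id counter and the shared device reference count."""

    def __init__(self):
        self.next_minor = FIRST_MINOR
        self.refs = {}       # minor -> open references on the common snode
        self.tasks = set()   # minors present in the (simulated) task hash table

    def alloc_minor(self):
        minor = self.next_minor
        self.next_minor += 1
        if self.next_minor >= MINOR_LIMIT:
            self.next_minor = FIRST_MINOR
        return minor

    def open(self, minor, register_task):
        self.refs[minor] = self.refs.get(minor, 0) + 1
        if register_task:
            self.tasks.add(minor)

    def close(self, minor):
        """Mimic spec_close(): the driver close runs only for the last reference."""
        self.refs[minor] -= 1
        if self.refs[minor] > 0:
            return False              # reference dropped, driver close skipped
        del self.refs[minor]
        self.tasks.discard(minor)     # stands in for vxtask_rm_task()
        return True


def simulate():
    sim = TaskDevSim()
    for _ in range(5):                # earlier tasks take ids 160..164
        sim.alloc_minor()
    held = sim.alloc_minor()          # vxconfigd gets 165 and keeps it open
    sim.open(held, register_task=False)

    while sim.next_minor != held:     # ids keep being handed out until the wrap
        sim.alloc_minor()

    reused = sim.alloc_minor()        # the CLI command is handed 165 again
    sim.open(reused, register_task=True)
    driver_close_ran = sim.close(reused)
    dangling = reused in sim.tasks    # task is still in the table

    sim.close(held)                   # vxconfigd exits and drops the last reference
    return {
        "held_minor": held,
        "reused_minor": reused,
        "driver_close_ran": driver_close_ran,
        "dangling_after_cli_close": dangling,
        "dangling_after_vxconfigd_exit": reused in sim.tasks,
    }


if __name__ == "__main__":
    print(simulate())
```

In this model the stale entry disappears only when the long-lived holder closes
its reference, which matches the observation that the task dangles until
vxconfigd exits.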

RESOLUTION:
The source has been changed so that vxconfigd no longer holds a task device.

* 3910426 (Tracking ID: 3868533)

SYMPTOM:
IO hang happens when starting replication. The VxIO daemon hangs with a stack 
like the following:

vx_cfs_getemap at ffffffffa035e159 [vxfs]
vx_get_freeexts_ioctl at ffffffffa0361972 [vxfs]
vxportalunlockedkioctl at ffffffffa06ed5ab [vxportal]
vxportalkioctl at ffffffffa06ed66d [vxportal]
vol_ru_start at ffffffffa0b72366 [vxio]
voliod_iohandle at ffffffffa09f0d8d [vxio]
voliod_loop at ffffffffa09f0fe9 [vxio]

DESCRIPTION:
While performing DCM replay with the Smart Move feature enabled, the VxIO 
kernel needs to issue an IOCTL to the VxFS kernel to get the file system free 
regions. To complete this IOCTL, the VxFS kernel needs to clone the map by 
issuing IO to the VxIO kernel. If an RLINK disconnection happens at just this 
time, the RV is serialized to complete the disconnection. Because the RV is 
serialized, all IOs, including the clone map IO from VxFS, are queued to 
rv_restartq, hence the deadlock.

RESOLUTION:
Code changes have been made to handle the deadlock situation.

* 3910586 (Tracking ID: 3852146)

SYMPTOM:
In a CVM cluster, when importing a shared diskgroup specifying both -c and -o
noreonline options, the following error may be returned: 
VxVM vxdg ERROR V-5-1-10978 Disk group <dgname>: import failed: Disk for disk
group not found.

DESCRIPTION:
The -c option will update the disk ID and disk group ID on the private region
of the disks in the disk group being imported. Such updated information is not
yet seen by the slave because the disks have not been re-onlined (given that
noreonline option is specified). As a result, the slave cannot identify the
disk(s) based on the updated information sent from the master, causing the
import to fail with the error "Disk for disk group not found".

RESOLUTION:
The code is modified to handle the working of the "-c" and "-o noreonline"
options together.

* 3910590 (Tracking ID: 3878030)

SYMPTOM:
Enhance VxVM(Veritas Volume Manager) DR(Dynamic Reconfiguration) tool to 
clean up OS and VxDMP(Veritas Dynamic Multi-Pathing) device trees without 
user interaction.

DESCRIPTION:
When users add or remove LUNs, stale entries in the OS or VxDMP device trees 
can prevent VxVM from discovering changed LUNs correctly. Under certain 
conditions this even causes the VxVM vxconfigd process to dump core, and users 
have to reboot the system to restart vxconfigd.
VxVM has a DR tool to help users add or remove LUNs properly, but it requires 
user input during operations.

RESOLUTION:
The VxVM DR tool has been enhanced to accept the '-o refresh' option, which 
cleans up OS and VxDMP device trees without user interaction.

* 3910591 (Tracking ID: 3867236)

SYMPTOM:
Application IO hang happens after issuing Master Pause command.

DESCRIPTION:
The flag VOL_RIFLAG_REQUEST_PENDING in the VVR (Veritas Volume Replicator) kernel 
is not cleared because of a race between the Master Pause SIO and the RVWRITE1 SIO. 
As a result, the RU (Replication Update) SIO fails to proceed, causing the IO hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* 3910592 (Tracking ID: 3864063)

SYMPTOM:
Application IO hang happens after issuing Master Pause command.

DESCRIPTION:
Some flags (VOL_RIFLAG_DISCONNECTING or VOL_RIFLAG_REQUEST_PENDING) in the VVR 
(Veritas Volume Replicator) kernel are not cleared because of a race between the 
Master Pause SIO and the Error Handler SIO. As a result, the RU (Replication 
Update) SIO fails to proceed, causing the IO hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* 3910593 (Tracking ID: 3879324)

SYMPTOM:
VxVM(Veritas Volume Manager) DR(Dynamic Reconfiguration) tool fails to 
handle busy device problem while LUNs are removed from OS.

DESCRIPTION:
OS devices may still be busy after they are removed from the OS. This causes 
the 'luxadm -e offline <disk>' operation to fail and leaves stale entries in 
the 'vxdisk list' output, like:
emc0_65535   auto            -            -            error
emc0_65536   auto            -            -            error

RESOLUTION:
Code changes have been made to address the busy device issue.

* 3912532 (Tracking ID: 3853144)

SYMPTOM:
VxVM(Veritas Volume Manager) mirror volume's stale plex is incorrectly marked as 
"Enable Active" after it comes back, which prevents resync of the stale plex 
from up-to-date ones. It can cause data corruption if the stale plex happens to 
be the preferred or selected plex, or if read policy "round" is set for the volume.

DESCRIPTION:
When a volume plex is detached abruptly while vxconfigd is unavailable, VxVM 
kernel logging records the detach activity along with its detach transaction id 
for future resync or recovery. Because of a code defect, the wrong detach 
transaction id can be selected under certain conditions.

RESOLUTION:
Code changes have been done to correctly select the detach transaction id.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch, perform the following steps on at least one node in the cluster:
1. Copy the patch vm-aix-Patch-7.2.0.200.tar.gz to /tmp
2. Untar vm-aix-Patch-7.2.0.200.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/vm-aix-Patch-7.2.0.200.tar.gz
    # tar xf /tmp/vm-aix-Patch-7.2.0.200.tar
3. Install the hotfix (note that the installation of this P-Patch will cause downtime):
    # pwd
    /tmp/hf
    # ./installVRTSvxvm720P200 [<host1> <host2>...]

You can also install this patch together with the 7.2 base release using Install Bundles:
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.2 directory and invoke the installer script
   with -patch_path option where -patch_path should point to the patch directory
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
If the currently installed VRTSvxvm is below 7.2.0.000 level,
upgrade VRTSvxvm to 7.2.0.000 level before installing this patch.
AIX maintenance levels and APARs can be downloaded from the IBM web site:
 http://techsupport.services.ibm.com
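
The level check before a manual install can be sketched as a simple
comparison of dotted version strings. This is an illustration only: the
function names and the returned action labels are hypothetical, and the
installed level is assumed to have been read by hand (for example from the
output of 'lslpp -l VRTSvxvm') and passed in as a string. The base level
7.2.0.000 and the patch level 7.2.0.200 are the values named in this readme.

```python
def parse_level(level: str) -> tuple:
    """Turn a dotted fileset level such as '7.2.0.200' into a tuple of ints."""
    return tuple(int(part) for part in level.split("."))


BASE_LEVEL = parse_level("7.2.0.000")    # level required before patching
PATCH_LEVEL = parse_level("7.2.0.200")   # level delivered by this patch


def next_action(installed: str) -> str:
    """Return a hypothetical label for what to do at the given installed level."""
    current = parse_level(installed)
    if current < BASE_LEVEL:
        return "upgrade-to-base"          # bring VRTSvxvm to 7.2.0.000 first
    if current < PATCH_LEVEL:
        return "install-patch"            # base is present, patch can be applied
    return "already-at-or-above-patch"


if __name__ == "__main__":
    for level in ("7.1.0.300", "7.2.0.000", "7.2.0.200"):
        print(level, "->", next_action(level))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes
such as treating "7.10" as lower than "7.2".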
1. Since the patch process will configure the new kernel extensions, do the following:
        a) Stop I/Os to all the VxVM volumes.
        b) Ensure that no VxVM volumes are in use or open or mounted before starting the installation procedure.
        c) Stop applications using any VxVM volumes.
2. Check whether root support or DMP native support is enabled. If it is enabled, it will be retained after patch upgrade.
# vxdmpadm gettune dmp_native_support
If the current value is 'on', DMP native support is enabled on this machine.
# vxdmpadm native list vgname=rootvg
If the output is a list of hdisks, root support is enabled on this machine.
3. Proceed with patch installation as mentioned below
    a. Before applying this VxVM 7.2.0.200 patch, stop the VEA Server's vxsvc process:
        # /opt/VRTSob/bin/vxsvcctrl stop
    b. If your system has Veritas Operations Manager (VOM) configured, check whether the vxdclid daemon is running. If it is running, stop it.
       Command to check the status of the vxdclid daemon:
       # /opt/VRTSsfmh/etc/vxdcli.sh status
       Command to stop the vxdclid daemon:
       # /opt/VRTSsfmh/etc/vxdcli.sh stop
    c. To apply this patch, use following command:
       # installp -ag -d ./VRTSvxvm.bff VRTSvxvm
    d. To apply and commit this patch, use following command:
       # installp -acg -d ./VRTSvxvm.bff VRTSvxvm
NOTE: Refer to the installp(1M) man page for details on the APPLY and COMMIT states of the package/patch.
    e. Reboot the system to complete the patch upgrade.
        # reboot
    f. If you stopped the vxdclid daemon before the upgrade, restart it using the following command:
       # /opt/VRTSsfmh/etc/vxdcli.sh start


REMOVING THE PATCH
------------------
1. Check whether root support or DMP native support is enabled:
      # vxdmpadm gettune dmp_native_support
If the current value is "on", DMP native support is enabled on this machine.
      # vxdmpadm native list vgname=rootvg
If the output is a list of hdisks, root support is enabled on this machine.
If disabled: go to step 3.
If enabled: go to step 2.
2. If root support or DMP native support is enabled:
        a. It is essential to disable DMP native support.
        Run the following command to disable DMP native support as well as root support
              # vxdmpadm settune dmp_native_support=off
        b. If only root support is enabled, run the following command to disable root support
              # vxdmpadm native disable vgname=rootvg
        c. Reboot the system
              # reboot
3.
   a. Before backing out patch, stop the VEA server's vxsvc process:
           # /opt/VRTSob/bin/vxsvcctrl stop
   b. If your system has Veritas Operations Manager (VOM) configured, check whether the vxdclid daemon is running. If it is running, stop it.
      Command to check the status of the vxdclid daemon:
      # /opt/VRTSsfmh/etc/vxdcli.sh status
      Command to stop the vxdclid daemon:
      # /opt/VRTSsfmh/etc/vxdcli.sh stop
   c. To reject the patch if it is in "APPLIED" state, use the following command and re-enable DMP support:
      # installp -r VRTSvxvm 7.2.0.200
   d. Reboot the system:
      # reboot
   e. If you stopped the vxdclid daemon before backing out the patch, restart it using the following command:
      # /opt/VRTSsfmh/etc/vxdcli.sh start


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE