vm-aix-Patch-6.0.5.300

 Basic information
Release type: Patch
Release date: 2015-09-18
OS update support: None
Technote: None
Documentation: None
Popularity: 1401 viewed    downloaded
Download size: 78.32 MB
Checksum: 1670021742

 Applies to one or more of the following products:
Dynamic Multi-Pathing 6.0.1 On AIX 6.1
Dynamic Multi-Pathing 6.0.1 On AIX 7.1
Storage Foundation 6.0.1 On AIX 6.1
Storage Foundation 6.0.1 On AIX 7.1
Storage Foundation Cluster File System 6.0.1 On AIX 6.1
Storage Foundation Cluster File System 6.0.1 On AIX 7.1
Storage Foundation for Oracle RAC 6.0.1 On AIX 6.1
Storage Foundation for Oracle RAC 6.0.1 On AIX 7.1
Storage Foundation HA 6.0.1 On AIX 6.1
Storage Foundation HA 6.0.1 On AIX 7.1

 Obsolete patches, incompatibilities, superseded patches, or other requirements:
None.

 Fixes the following incidents:
3496715, 3501358, 3521727, 3526501, 3531332, 3539518, 3540777, 3552411, 3560581, 3600161, 3603811, 3612801, 3621240, 3622069, 3638039, 3648603, 3654163, 3654191, 3654215, 3654228, 3657236, 3690795, 3713320, 3737823, 3774137, 3781745, 3788751, 3799822, 3800394, 3800396, 3800449, 3800452, 3800788, 3801225, 3805938, 3806808, 3807761, 3816233, 3826918

 Patch ID:
VRTSvxvm-06.00.0500.0300

Readme file
                          * * * READ ME * * *
                * * * Veritas Volume Manager 6.0.5 * * *
                      * * * Patch 6.0.5.300 * * *
                         Patch Date: 2015-09-14


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas Volume Manager 6.0.5 Patch 6.0.5.300


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
AIX 6.1 ppc
AIX 7.1 ppc


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas Dynamic Multi-Pathing 6.0.1
   * Veritas Storage Foundation 6.0.1
   * Veritas Storage Foundation Cluster File System HA 6.0.1
   * Veritas Storage Foundation for Oracle RAC 6.0.1
   * Veritas Storage Foundation HA 6.0.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 6.0.500.300
* 3496715 (3281004) For the DMP minimum queue I/O policy with a large number of CPUs, a couple of issues 
are observed.
* 3501358 (3399323) The reconfiguration of Dynamic Multipathing (DMP) database fails.
* 3521727 (3521726) The system panics due to a double free of IOHINT.
* 3526501 (3526500) Disk IO failures occur with DMP IO timeout error messages when the DMP (Dynamic Multi-pathing) IO statistics daemon is not running.
* 3531332 (3077582) A Veritas Volume Manager (VxVM) volume may become inaccessible causing the read/write operations to fail.
* 3539518 (3486920) DMP should set the ODM attribute reserve_policy=no_reserve when 
disabling MPIO using the vxmpio utility.
* 3540777 (3539548) After adding a removed MPIO disk back, the 'vxdisk list' or 'vxdmpadm listctlr all' 
commands may show a duplicate entry for the DMP node in the error state.
* 3552411 (3482026) The vxattachd(1M) daemon reattaches plexes of a manually detached site.
* 3560581 (3560576) DMP-related error messages are logged in syslog for devices in the Write Disabled (WD)/Not 
Ready (NR) state.
* 3600161 (3599977) During a replica connection, referencing a port that is already deleted in another thread causes a system panic.
* 3603811 (3594158) The spinlock and unspinlock are referenced to different objects when interleaving with a kernel transaction.
* 3612801 (3596330) The 'vxsnap refresh' operation fails with the `Transaction aborted waiting for IO 
drain` error.
* 3621240 (3621232) The vradmin ibc command cannot be started or executed on the Veritas Volume Replicator (VVR) secondary node.
* 3622069 (3513392) A reference to a replication port that is already deleted causes a panic.
* 3638039 (3625890) vxdisk resize operation on CDS disks fails with an error message of "Invalid
attribute specification"
* 3648603 (3564260) VVR commands are unresponsive when replication is paused and resumed in a loop.
* 3654163 (2916877) vxconfigd hangs on a node leaving the cluster.
* 3654191 (3120716) System panic can occur while booting from mirror of rootvg when
dmp_native_support is enabled.
* 3654215 (3350077) VxVM does not shrink an actively used VxVM secondary paging 
device properly.
* 3654228 (3564204) Some of the paging space characteristics cannot be changed if you use the System
Management Interface Tool (SMIT) with  Change /Show characteristics of VxVM
page space option.
* 3657236 (3657235) With multiple Veritas Volume Manager(VxVM) paging spaces
configured on a system, smit menu "Change /Show characteristics of VxVM page
space" shows incorrect characteristics for some of the paging spaces.
* 3690795 (2573229) On RHEL6, the server panics when Dynamic Multi-Pathing (DMP) executes 
PERSISTENT RESERVE IN command with REPORT CAPABILITIES service action on 
powerpath controlled device.
* 3713320 (3596282) Snap operations fail with error "Failed to allocate a new map due to no free 
map available in DCO".
* 3737823 (3736502) A memory leak occurs when a transaction aborts.
* 3774137 (3565212) IO failure is seen during controller giveback operations 
on Netapp Arrays in ALUA mode.
* 3781745 (3169854) Enabling VxVM(Veritas Volume Manager) DMP(Dynamic Multi-Pathing) native support 
fails on some AIX TL levels due to boot image size limit.
* 3788751 (3788644) Reuse raw device number when checking for available raw devices.
* 3799822 (3573262) System panic during space optimized snapshot operations  
on recent UltraSPARC architectures.
* 3800394 (3672759) The vxconfigd(1M) daemon may core dump when DMP database is corrupted.
* 3800396 (3749557) The system hangs because of high memory usage by VxVM.
* 3800449 (3726110) On systems with high number of CPU's, Dynamic Multipathing (DMP) devices may perform 
considerably slower than OS device 
paths.
* 3800452 (3437852) The system panics when Symantec Replicator Option goes to
PASSTHRU mode.
* 3800788 (3648719) The server panics while adding or removing LUNs or HBAs.
* 3801225 (3662392) In the Cluster Volume Manager (CVM) environment, if I/Os are getting executed 
on slave node, corruption can happen when the vxdisk resize(1M) command is 
executing on the master node.
* 3805938 (3790136) A file system hang is observed due to I/Os hung in Dirty Region Logging (DRL).
* 3806808 (3645370) vxevac command fails to evacuate disks with Dirty Region Log(DRL) plexes.
* 3807761 (3729078) VVR(Veritas Volume Replication) secondary site panic occurs during patch 
installation because of flag overlap issue.
* 3816233 (3686698) The vxconfigd daemon hangs due to a deadlock between two threads.
* 3826918 (3819670) poll() returning -1 with errno EINTR should be handled correctly in 
vol_admintask_wait().


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: 6.0.500.300

* 3496715 (Tracking ID: 3281004)

SYMPTOM:
For DMP minimum queue I/O policy with large number of CPUs, the following 
issues are observed since the VxVM 5.1 SP1 release: 
1. CPU usage is high. 
2. I/O throughput is down if there are many concurrent I/Os.

DESCRIPTION:
The earlier minimum queue I/O policy used to consider the host controller 
I/O load to select the least loaded path. In the VxVM 5.1 SP1 version, an addition 
was made to also consider the I/O load of the underlying paths of the selected host 
based controllers. However, this resulted in the performance issues, as there 
were lock contentions between the I/O processing functions and the DMP statistics 
daemon.

RESOLUTION:
The code is modified so that the I/O load of the host controller paths is not 
considered, in order to avoid the lock contention.

* 3501358 (Tracking ID: 3399323)

SYMPTOM:
The reconfiguration of Dynamic Multipathing (DMP) database fails with the below error: VxVM vxconfigd DEBUG  V-5-1-0 dmp_do_reconfig: DMP_RECONFIGURE_DB failed: 2

DESCRIPTION:
As part of the DMP database reconfiguration process, controller information from DMP user-land database is not removed even though it is removed from DMP kernel database. This creates inconsistency between the user-land and kernel-land DMP database. Because of this, subsequent DMP reconfiguration fails with above error.

RESOLUTION:
The code changes have been made to properly remove the controller information from the user-land DMP database.

* 3521727 (Tracking ID: 3521726)

SYMPTOM:
When using the Symantec Replication Option, a system panic happens while freeing
memory, with the following stack trace on AIX:

pvthread+011500 STACK:
[0001BF60]abend_trap+000000 ()
[000C9F78]xmfree+000098 ()
[04FC2120]vol_tbmemfree+0000B0 ()
[04FC2214]vol_memfreesio_start+00001C ()
[04FCEC64]voliod_iohandle+000050 ()
[04FCF080]voliod_loop+0002D0 ()
[04FC629C]vol_kernel_thread_init+000024 ()
[0025783C]threadentry+00005C ()

DESCRIPTION:
In certain scenarios, when a write I/O gets throttled or unwound in VVR, the
memory related to one of the internal data structures is freed. When this I/O is
restarted, the same memory is illegally accessed and freed again even though it
was already freed. This causes the system panic.

RESOLUTION:
Code changes have been done to fix the illegal memory access issue.

* 3526501 (Tracking ID: 3526500)

SYMPTOM:
Disk IO failures occur with DMP IO timeout error messages when DMP (Dynamic Multi-pathing) IO statistics daemon is not running. Following are the timeout error messages:

VxVM vxdmp V-5-3-0 I/O failed on path 65/0x40 after 1 retries for disk 201/0x70
VxVM vxdmp V-5-3-0 Reached DMP Threshold IO TimeOut (100 secs) I/O with start 
3e861909fa0 and end 3e86190a388 time
VxVM vxdmp V-5-0-0 [Error] i/o error occurred (errno=0x206) on dmpnode 201/0x70

DESCRIPTION:
When an I/O is submitted to DMP, it sets the start time on the I/O buffer. The value of the start time depends on whether the DMP I/O statistics daemon is running or not. When the I/O is returned as an error from SCSI to DMP, instead of retrying the I/O on alternate paths, DMP fails that I/O with a 300-second timeout error, even though the I/O has elapsed only a few milliseconds in its execution. This miscalculation of the DMP timeout happens only when the DMP I/O statistics daemon is not running.

RESOLUTION:
The code is modified to calculate an appropriate DMP I/O timeout value when the DMP I/O statistics daemon is not running.

* 3531332 (Tracking ID: 3077582)

SYMPTOM:
A Veritas Volume Manager (VxVM) volume may become inaccessible causing the read/write operations to fail with the following error:
# dd if=/dev/vx/dsk/<dg>/<volume> of=/dev/null count=10
dd read error: No such device
0+0 records in
0+0 records out

DESCRIPTION:
If I/Os to the disks timeout due to some hardware failures like weak Storage Area Network (SAN) cable link or Host Bus Adapter (HBA) failure, VxVM assumes that the disk is faulty or slow and it sets the failio flag on the disk. Due to this flag, all the subsequent I/Os fail with the No such device error.

RESOLUTION:
The code is modified such that vxdisk now provides a way to clear the failio flag. To check whether the failio flag is set on the disks, use the vxkprint(1M) utility (under /etc/vx/diag.d). To reset the failio flag, execute the vxdisk set <disk_name> failio=off command, or deport and import the disk group that holds these disks.
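
For reference, a sketch of the commands mentioned above (the disk name is a placeholder):
  # /etc/vx/diag.d/vxkprint
  # vxdisk set <disk_name> failio=off
The first command reports whether the failio flag is set; the second clears it.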

* 3539518 (Tracking ID: 3486920)

SYMPTOM:
After disabling MPIO (AIX Multipath I/O), and thus enabling DMP (Veritas Dynamic 
Multi-Pathing) support, using the vxmpio command, the 'reserve_policy' attribute for 
devices is not set, due to which I/O commands fail on some paths.
#lsattr -El <disk_name> -a reserve_policy
lsattr: 0514-528 The "reserve_policy" attribute does not exist in the predefined 
device configuration database.

DESCRIPTION:
After disabling MPIO using the vxmpio command for the hdisk devices, the 
'reserve_policy' attribute is not set. If AIX does not find the 
'reserve_policy' attribute for a device, then the device is opened in 
'SINGLE PATH RESERVE' mode. This causes a SCSI-2 reservation 
to be set through one of the paths, and hence I/O commands fail on the other paths.

RESOLUTION:
The 'reserve_policy' PdAt ODM entry is now added explicitly when disabling MPIO 
using the vxmpio command.

* 3540777 (Tracking ID: 3539548)

SYMPTOM:
Adding an MPIO (Multipath I/O) disk that had been removed earlier may result in 
the following two issues:
1. The 'vxdisk list' command shows a duplicate entry for the DMP (Dynamic Multi-Pathing) 
node in the error state.
2. The 'vxdmpadm listctlr all' command shows duplicate controller names.

DESCRIPTION:
1. Under certain circumstances, a deleted MPIO disk record is left in the 
/etc/vx/disk.info file with its device number as -1, but its DMP node name is 
reassigned to another MPIO disk. When the deleted disk is added back, it is 
assigned the same name, without validating for a conflict in the name. 
2. When some devices are removed and added back to the system, a new controller 
is added for each and every path that is discovered. This leads to 
duplicate controller entries in the DMP database.

RESOLUTION:
1. Code is modified to properly remove all stale information about any disk 
before updating MPIO disk names. 
2. Code changes have been made to add the controller for selected paths only.
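
After applying this patch and re-adding the disk, the absence of duplicate entries can be re-checked with the commands from the symptom above (shown here only as a reference check):
  # vxdisk list
  # vxdmpadm listctlr all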

* 3552411 (Tracking ID: 3482026)

SYMPTOM:
The vxattachd(1M) daemon reattaches plexes of manually detached site.

DESCRIPTION:
The vxattachd daemon reattaches plexes for a manually detached site, that is, a site whose state is OFFLINE. There was no check to differentiate between a manually detached site and a site that was detached due to an I/O failure. Hence, the vxattachd(1M) daemon brings the plexes online for a manually detached site as well.

RESOLUTION:
The code is modified to differentiate between manually detached site and the site detached due to IO failure.

* 3560581 (Tracking ID: 3560576)

SYMPTOM:
The vxdisk scandisks and vxdctl enable commands log the following message in 
syslog for devices in the Write Disabled (WD)/Not Ready (NR) state.

"VxVM vxdmp V-5-3-0 dmp_indirect_ioctl: Open failed on path <major/minor> 
with ret 47"

DESCRIPTION:
The vxdisk scandisks and vxdctl enable commands open the devices. If the open fails, 
then Dynamic Multi-Pathing (DMP) logs an error message. For devices in the Write 
Disabled (WD)/Not Ready (NR) state, the open is expected to fail. Hence, these 
error messages get logged.

RESOLUTION:
Code changes are done to suppress these error messages at the default DMP log 
level.
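
For reference, the DMP log level can be inspected and, if needed for debugging, changed through the dmp_log_level tunable (an illustrative sketch; the tunable name and value range should be verified against your VxVM documentation):
  # vxdmpadm gettune dmp_log_level
  # vxdmpadm settune dmp_log_level=<value>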

* 3600161 (Tracking ID: 3599977)

SYMPTOM:
During a replica connection, referencing a port that is already deleted in another thread causes a system panic with a similar stack trace as below:
.simple_lock()
soereceive()
soreceive()
.kernel_add_gate_cstack()
kmsg_sys_rcv()
nmcom_get_next_mblk()
nmcom_get_next_msg()
nmcom_wait_msg_tcp()
nmcom_server_proc_tcp()
nmcom_server_proc_enter()
vxvm_start_thread_enter()

DESCRIPTION:
During a replica connection, a port is created before increasing the count. This is to protect the port from getting deleted. However, another thread deletes the port before the count is increased and after the port is created. 
While the replica connection thread proceeds, it refers to the port that is already deleted, which causes a NULL pointer reference and a system panic.

RESOLUTION:
The code is modified to prevent asynchronous access to the count that is associated with the port by means of locks.

* 3603811 (Tracking ID: 3594158)

SYMPTOM:
The system panics on a VVR secondary node with the following stack trace:
.simple_lock()
soereceive()
soreceive()
.kernel_add_gate_cstack()
kmsg_sys_rcv()
nmcom_get_next_mblk()
nmcom_get_next_msg()
nmcom_wait_msg_tcp()
nmcom_server_proc_tcp()
nmcom_server_proc_enter()
vxvm_start_thread_enter()

DESCRIPTION:
You may issue a spinlock or unspinlock to the replica to check whether to use a checksum in the received packet. During the lock or unlock operation, if there is a transaction that is being processed with the replica, which rebuilds the replica object in the kernel, then there is a possibility that the replica referenced in spinlock is different than the one which the replica has referenced in unspinlock (especially when the replica is referenced through several pointers). As a result, the system panics.

RESOLUTION:
The code is modified to set the flag in the port attribute to indicate whether to use the checksum during a port creation. Hence, for each packet that is received, you only need to check the flag in the port attribute rather than referencing it to the replica object. As part of the change, the spinlock or unspinlock statements are also removed.

* 3612801 (Tracking ID: 3596330)

SYMPTOM:
The 'vxsnap refresh' operation fails with the following indications:

Errors occur from DR (Disaster Recovery) Site of VVR (Veritas 
Volume Replicator):

o	vxio: [ID 160489 kern.notice] NOTICE: VxVM vxio V-5-3-1576 commit: 
Timedout waiting for rvg [RVG] to quiesce, iocount [PENDING_COUNT] msg 0
o	vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-8011 Internal 
transaction failed: Transaction aborted waiting for io drain

At the same time, the following errors occur on the Primary Site of VVR:

vxio: [ID 218356 kern.warning] WARNING: VxVM VVR vxio V-5-0-267 Rlink 
[RLINK] disconnecting due to ack timeout on update message

DESCRIPTION:
VM (Volume Manager) transactions on the DR site get aborted because pending I/Os could 
not be drained in the stipulated time, leading to failure of FMR (Fast Mirror 
Resync) 'snap' operations. These I/Os could not be drained because of I/O 
throttling. A race condition in VVR, in conjunction with timing, causes the 
throttling state to not be cleared.

RESOLUTION:
Code changes have been done to fix the race condition, which ensures clearance 
of the throttling state at the appropriate time.

* 3621240 (Tracking ID: 3621232)

SYMPTOM:
When the vradmin ibc command is executed to initiate the In-band Control (IBC) procedure, the vradmind (VVR daemon) on the VVR secondary node goes into the disconnected state. As a result, subsequent IBC procedures or vradmin ibc commands cannot be started or executed on the VVR secondary node, and a message similar to the following appears on the VVR primary node:
    
VxVM VVR vradmin ERROR V-5-52-532 Secondary is undergoing a state transition. Please re-try the command after some time.
VxVM VVR vradmin ERROR V-5-52-802 Cannot start command execution on Secondary.

DESCRIPTION:
When the IBC procedure reaches the command finish state, the vradmind on the VVR secondary node goes into a disconnected state, which the vradmind on the primary node fails to realize. 
In such a scenario, the vradmind on the primary node refrains from sending a handshake request to the secondary node, which could change the secondary node's state from disconnected to running. As a result, the vradmind on the secondary node continues to be in the disconnected state, and the vradmin ibc command fails to run on the VVR secondary node despite being in the running state on the VVR primary node.

RESOLUTION:
The code is modified to make sure that the vradmind on the VVR primary node is notified when the vradmind on the VVR secondary node goes into the disconnected state. As a result, it can send out a handshake request to take the secondary node out of the disconnected state.

* 3622069 (Tracking ID: 3513392)

SYMPTOM:
The secondary node panics when it is rebooted while heavy I/Os are in progress on the primary, with a stack trace similar to the following:

PID: 18862  TASK: ffff8810275ff500  CPU: 0   COMMAND: "vxiod"
#0 [ffff880ff3de3960] machine_kexec at ffffffff81035b7b
#1 [ffff880ff3de39c0] crash_kexec at ffffffff810c0db2
#2 [ffff880ff3de3a90] oops_end at ffffffff815111d0
#3 [ffff880ff3de3ac0] no_context at ffffffff81046bfb
#4 [ffff880ff3de3b10] __bad_area_nosemaphore at ffffffff81046e85
#5 [ffff880ff3de3b60] bad_area_nosemaphore at ffffffff81046f53
#6 [ffff880ff3de3b70] __do_page_fault at ffffffff810476b1
#7 [ffff880ff3de3c90] do_page_fault at ffffffff8151311e
#8 [ffff880ff3de3cc0] page_fault at ffffffff815104d5
#9 [ffff880ff3de3d78] volrp_sendsio_start at ffffffffa0af07e3 [vxio]
#10 [ffff880ff3de3e08] voliod_iohandle at ffffffffa09991be [vxio]
#11 [ffff880ff3de3e38] voliod_loop at ffffffffa0999419 [vxio]
#12 [ffff880ff3de3f48] kernel_thread at ffffffff8100c0ca

DESCRIPTION:
If the replication stage I/Os are started after serialization of the replica volume, 
the replication port could be deleted and set to NULL while handling the replica 
connection changes. This causes the panic because the code did not check whether the 
replication port is still valid before referencing it.

RESOLUTION:
Code changes have been done to abort the stage IO if replication port is NULL.

* 3638039 (Tracking ID: 3625890)

SYMPTOM:
After running the vxdisk resize command, the following message is displayed: 
"VxVM vxdisk ERROR V-5-1-8643 Device <disk name> resize failed: Invalid 
attribute specification"

DESCRIPTION:
CDS (Cross-platform Data Sharing) VTOC (Volume Table of Contents) disks reserve two
cylinders for special usage. When expanding a disk to a particular size on the
storage side, VxVM (Veritas Volume Manager) may calculate the cylinder number as 2,
which causes the vxdisk resize to fail with the error
message "Invalid attribute specification".

RESOLUTION:
The code is modified to avoid the failure of resizing a CDS VTOC disk.

* 3648603 (Tracking ID: 3564260)

SYMPTOM:
VVR commands are unresponsive when replication is paused and resumed in a loop.

DESCRIPTION:
While Veritas Volume Replicator (VVR) is in the process of sending updates, pausing replication is deferred until acknowledgements of the updates are received or until an error occurs. If, for some reason, the acknowledgements get delayed or the delivery fails, the pause operation continues to be deferred, resulting in unresponsiveness.

RESOLUTION:
The code is modified to resolve the issue that caused unresponsiveness.
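
For reference, replication is typically paused and resumed with commands of the following form (an illustrative sketch; the disk group and RVG names are placeholders):
  # vradmin -g <diskgroup> pauserep <local_rvgname>
  # vradmin -g <diskgroup> resumerep <local_rvgname>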

* 3654163 (Tracking ID: 2916877)

SYMPTOM:
The vxconfigd daemon hangs if a node leaves the cluster while I/O error handling is in
progress. The stack observed is as follows:
volcvm_iodrain_dg 
volcvmdg_abort_complete
volcvm_abort_sio_start 
voliod_loop
vol_kernel_thread_init

DESCRIPTION:
A bug in the DCO error handling code can lead to an infinite loop if a node leaves the
cluster while I/O error handling is in progress. This causes vxconfigd to hang
and stop responding to VxVM commands such as vxprint, vxdisk, 
and so on.

RESOLUTION:
The DCO error handling code has been changed so that I/O errors are handled
correctly and the hang is avoided.

* 3654191 (Tracking ID: 3120716)

SYMPTOM:
System panic can occur while booting from mirror of rootvg when
dmp_native_support is enabled.

DESCRIPTION:
In the early boot stage, ldata memory allocation within the Dynamic Multi-Pathing
(DMP) driver might fail in interrupt context. The ldata memory allocation logic in DMP was
not handling the failure situation correctly, and retries were not done to
satisfy the memory allocation request. This resulted in a panic because the I/O could not be
serviced.

RESOLUTION:
Code changes are done to properly handle ldata memory allocation failure in
interrupt context within DMP.

* 3654215 (Tracking ID: 3350077)

SYMPTOM:
VxVM fails to shrink an actively used VxVM secondary paging device 
with the following error message:
"VxVM vxvm ERROR V-5-2-0 Cannot shrink the pagespace"

DESCRIPTION:
While shrinking an actively used VxVM secondary paging device, 
VxVM creates a temporary smaller paging space, based on the input size to 
decrease (input_size_to_decrease). After this, VxVM deactivates the currently 
used secondary paging space and activates the newly created temporary smaller 
paging space as the new secondary paging device. The size of the temporary 
paging space created should equal total_size minus 
input_size_to_decrease. In fact, however, the size used is the value of 
input_size_to_decrease, which may be smaller than the current in-use size of the 
secondary paging space to be shrunk. As a result, the in-use VxVM paging 
space cannot be deactivated. This causes VxVM to fail while shrinking the 
secondary paging space.

RESOLUTION:
The code is modified to create a temporary paging space of an 
appropriate size, so that shrink operations succeed.
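
For reference, paging spaces and their sizes can be listed with the standard AIX command, and a decrease can be requested with chps (an illustrative sketch; the block count and paging space name are placeholders):
  # lsps -a
  # chps -d <no_of_blocks> <VxVM-PSname>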

* 3654228 (Tracking ID: 3564204)

SYMPTOM:
If you edit the characteristics of a VxVM page space to activate it on every system
reboot by using the AIX SMIT 'Change/Show
characteristics of VxVM page space' option, the operation fails with the following error:

VxVM vxvm ERROR V-5-2-0 Please specify VxVM-pagespace
USAGE: chps -t vxvm [ -c ChksumSize] [ -s no_of_blocks | -d no_of_blocks ] [ -a { y | n } ] VxVM-PSname
chps: Error executing the helper executable vxvm.
chps: Cannot change paging space swapvol1.

DESCRIPTION:
When you use AIX SMIT with the 'Change/Show characteristics of VxVM page space'
option to change the characteristics of a VxVM paging space so that it will be
activated on every reboot, the operation fails. This is because incorrect Object
Data Manager (ODM) entries get added to the SMIT ODM database. This causes
the wrong command to be executed, resulting in the failure.

RESOLUTION:
The code has been changed to add ODM entry in SMIT ODM database so that
correct command gets executed.

* 3657236 (Tracking ID: 3657235)

SYMPTOM:
When a system is configured with multiple VxVM paging spaces, the smit
menu "Change/Show characteristics of VxVM page space" displays the characteristics
of only one of the paging spaces, irrespective of the given input. For the other paging
spaces, the output displays wrong characteristics, and such paging spaces cannot
be modified either.

DESCRIPTION:
When multiple VxVM paging spaces are configured on the
system, because of incorrect ODM entries added to the smit ODM database, the smit
menu only displays the characteristics of one of the paging spaces, regardless of
the input given. Also, there can be multiple stale ODM entries for VxVM paging
spaces, because the entries are not deleted as part of package removal. Hence,
in some cases, a wrong stale ODM entry is used by the smit menu, which causes the issue.

RESOLUTION:
The code is modified to parse the input correctly in order to
show the characteristics of VxVM paging spaces correctly based on the input
provided and to delete the stale ODM entries present on the system.

* 3690795 (Tracking ID: 2573229)

SYMPTOM:
On RHEL6, the server panics when Dynamic Multi-Pathing (DMP) executes 
PERSISTENT RESERVE IN command with REPORT CAPABILITIES service action on 
powerpath controlled device. The following stack trace is displayed:

enqueue_entity at ffffffff81068f09
enqueue_task_fair at ffffffff81069384
enqueue_task at ffffffff81059216
activate_task at ffffffff81059253
pull_task at ffffffff81065401
load_balance_fair at ffffffff810657b7
thread_return at ffffffff81527d30
schedule_timeout at ffffffff815287b5
wait_for_common at ffffffff81528433
wait_for_completion at ffffffff8152854d
blk_execute_rq at ffffffff8126d9dc
emcp_scsi_cmd_ioctl at ffffffffa04920a2 [emcp]
PowerPlatformBottomDispatch at ffffffffa0492eb8 [emcp]
PowerSyncIoBottomDispatch at ffffffffa04930b8 [emcp]
PowerBottomDispatchPirp at ffffffffa049348c [emcp]
PowerDispatchX at ffffffffa049390d [emcp]
MpxSendScsiCmd at ffffffffa061853e [emcpmpx]
ClariionKLam_groupReserveRelease at ffffffffa061e495 [emcpmpx]
MpxDefaultRegister at ffffffffa061df0a [emcpmpx]
MpxTestPath at ffffffffa06227b5 [emcpmpx]
MpxExtraTry at ffffffffa06234ab [emcpmpx]
MpxTestDaemonCalloutGuts at ffffffffa062402f [emcpmpx]
MpxIodone at ffffffffa0624621 [emcpmpx]
MpxDispatchGuts at ffffffffa0625534 [emcpmpx]
MpxDispatch at ffffffffa06256a8 [emcpmpx]
PowerDispatchX at ffffffffa0493921 [emcp]
GpxDispatch at ffffffffa0644775 [emcpgpx]
PowerDispatchX at ffffffffa0493921 [emcp]
GpxDispatchDown at ffffffffa06447ae [emcpgpx]
VluDispatch at ffffffffa068b025 [emcpvlumd]
GpxDispatch at ffffffffa0644752 [emcpgpx]
PowerDispatchX at ffffffffa0493921 [emcp]
GpxDispatchDown at ffffffffa06447ae [emcpgpx]
XcryptDispatchGuts at ffffffffa0660b45 [emcpxcrypt]
XcryptDispatch at ffffffffa0660c09 [emcpxcrypt]
GpxDispatch at ffffffffa0644752 [emcpgpx]
PowerDispatchX at ffffffffa0493921 [emcp]
GpxDispatch at ffffffffa0644775 [emcpgpx]
PowerDispatchX at ffffffffa0493921 [emcp]
PowerSyncIoTopDispatch at ffffffffa04978b9 [emcp]
emcp_send_pirp at ffffffffa04979b9 [emcp]
emcp_pseudo_blk_ioctl at ffffffffa04982dc [emcp]
__blkdev_driver_ioctl at ffffffff8126f627
blkdev_ioctl at ffffffff8126faad
block_ioctl at ffffffff811c46cc
dmp_ioctl_by_bdev at ffffffffa074767b [vxdmp]
dmp_kernel_scsi_ioctl at ffffffffa0747982 [vxdmp]
dmp_scsi_ioctl at ffffffffa0786d42 [vxdmp]
dmp_send_scsireq at ffffffffa078770f [vxdmp]
dmp_do_scsi_gen at ffffffffa077d46b [vxdmp]
dmp_pr_check_aptpl at ffffffffa07834dd [vxdmp]
dmp_make_mp_node at ffffffffa0782c89 [vxdmp]
dmp_decode_add_disk at ffffffffa075164e [vxdmp]
dmp_decipher_instructions at ffffffffa07521c7 [vxdmp]
dmp_process_instruction_buffer at ffffffffa075244e [vxdmp]
dmp_reconfigure_db at ffffffffa076f40e [vxdmp]
gendmpioctl at ffffffffa0752a12 [vxdmp]
dmpioctl at ffffffffa0754615 [vxdmp]
dmp_ioctl at ffffffffa07784eb [vxdmp]
dmp_compat_ioctl at ffffffffa0778566 [vxdmp]
compat_blkdev_ioctl at ffffffff8128031d
compat_sys_ioctl at ffffffff811e0bfd
sysenter_dispatch at ffffffff81050c20

DESCRIPTION:
Dynamic Multi-Pathing (DMP) uses PERSISTENT RESERVE IN command with the REPORT
CAPABILITIES service action to discover target capabilities. On RHEL6, system
panics unexpectedly when Dynamic Multi-Pathing (DMP) executes PERSISTENT 
RESERVE IN command with REPORT CAPABILITIES service action on powerpath 
controlled device coming from EMC Clarion/VNX array. This bug has been reported 
to EMC powperpath engineering.

RESOLUTION:
The Dynamic Multi-Pathing (DMP) code is modified to execute PERSISTENT RESERVE 
IN command with the REPORT CAPABILITIES service action to discover target 
capabilities only on non-third party controlled devices.

* 3713320 (Tracking ID: 3596282)

SYMPTOM:
FMR (Fast Mirror Resync) operations fail with error "Failed to allocate a new 
map due to no free map available in DCO".
 
"vxio: [ID 609550 kern.warning] WARNING: VxVM vxio V-5-3-1721
voldco_allocate_toc_entry: Failed to allocate a new map due to no free map
available in DCO of [volume]"

It often leads to disabling of the snapshot.

DESCRIPTION:
For instant space-optimized snapshots, stale maps are left behind for DCO (Data 
Change Object) objects at the time of creation of cache objects. So, over 
time, if space-optimized snapshots are created that use a new cache object, 
stale maps accumulate, which eventually consumes all the available DCO 
space, resulting in the error.

RESOLUTION:
Code changes have been done to ensure no stale entries are left behind.

* 3737823 (Tracking ID: 3736502)

SYMPTOM:
When FMR is configured in a VVR environment, 'vxsnap refresh' fails with the below 
error message:
"VxVM VVR vxsnap ERROR V-5-1-10128 DCO experienced IO errors during the
operation. Re-run the operation after ensuring that DCO is accessible".
Also, multiple messages about connection/disconnection of the replication 
link (rlink) are seen.

DESCRIPTION:
Inherently triggered rlink connection/disconnection causes transaction 
retries. During a transaction, memory is allocated for Data Change Object (DCO) 
maps and is not cleared when the transaction aborts.
This leads to a memory leak and eventually to exhaustion of maps.

RESOLUTION:
The fix has been added to clear the allocated DCO maps when transaction 
aborts.

* 3774137 (Tracking ID: 3565212)

SYMPTOM:
While performing controller giveback operations on NetApp ALUA arrays, the 
below messages are observed in /etc/vx/dmpevents.log

[Date]: I/O error occured on Path <path> belonging to Dmpnode <dmpnode>
[Date]: I/O analysis done as DMP_PATH_BUSY on Path <path> belonging to 
Dmpnode 
<dmpnode>
[Date]: I/O analysis done as DMP_IOTIMEOUT on Path <path> belonging to 
Dmpnode 
<dmpnode>

DESCRIPTION:
During the asymmetric access state transition, DMP puts the buffer pointer 
in the delay queue based on the flags observed in the logs. This delay 
resulted in a timeout, and thereby the file system went into the disabled state.

RESOLUTION:
DMP code is modified to perform immediate retries instead of putting the 
buffer pointer in the delay queue for transition in progress case.

* 3781745 (Tracking ID: 3169854)

SYMPTOM:
On some AIX TL levels, when enabling DMP native support, it reports following 
errors:
	VxVM vxdmpadm ERROR V-5-1-16951 boot image size exceeding the system 
limit, trying to recover
	VxVM vxdmpadm ERROR V-5-1-16952 system recovery successful, but DMP 
support for LVM bootability could not be enabled
It can also stop AIX bootup procedure, displaying below message on the console:
	PReP-BOOT : Unable to load full PReP image

DESCRIPTION:
With some AIX TL levels, the default boot image size increases. After enabling 
DMP native support, the boot image size exceeds the AIX limit of 32 MB; hence the 
errors occur and the system cannot boot up.

RESOLUTION:
Code changes have been done to significantly decrease DMP component size in AIX 
boot image when enabling DMP native support.
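
After applying this patch, DMP native support can be enabled and verified as follows (an illustrative sketch using the tunable and command shown elsewhere in this document):
  # vxdmpadm settune dmp_native_support=on
  # vxdmpadm native list vgname=rootvg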

* 3788751 (Tracking ID: 3788644)

SYMPTOM:
When DMP (Dynamic Multi-Pathing) native support is enabled for an Oracle ASM 
environment, constantly adding and removing DMP devices can cause an error 
like:
/etc/vx/bin/vxdmpraw enable oracle dba 775 emc0_3f84
VxVM vxdmpraw INFO V-5-2-6157
Device enabled : emc0_3f84
Error setting raw device (Invalid argument)

DESCRIPTION:
There is a limitation (8192) on the maximum raw device number N (exclusive) of 
/dev/raw/rawN. This limitation is defined in the boot configuration file. When 
binding a raw device to a dmpnode, /dev/raw/rawN is used to bind the dmpnode. The 
rawN number is calculated by a one-way incremental process. So even if the device 
is unbound later on, the "released" rawN number is not reused in the next binding. 
When the rawN number increases beyond the maximum limitation, the error is reported.

RESOLUTION:
Code has been changed to always use the smallest available rawN number instead of 
calculating by one-way incremental process.

* 3799822 (Tracking ID: 3573262)

SYMPTOM:
On recent UltraSPARC-T4 architectures, a panic is observed with the 
topmost stack frame pointing to bcopy during snapshot operations involving 
space-optimized snapshots.

<trap>SPARC-T4:bcopy_more()
SPARC-T4:bcopy() 
vxio:vol_cvol_bplus_delete()
vxio:vol_cvol_dshadow1_done()
vxio:voliod_iohandle()
vxio:voliod_loop()

DESCRIPTION:
The bcopy kernel library routine on Solaris was optimized to take advantage 
of recent UltraSPARC-T4 architectures. But it has some known issues for 
large-size copies in some patch versions of Solaris 10. The use of bcopy was causing 
in-core corruption of cache object metadata. The corruption later led to 
a system panic.

RESOLUTION:
The code is modified to use word by word copy of the buffer 
instead of bcopy kernel library routine.

* 3800394 (Tracking ID: 3672759)

SYMPTOM:
When the DMP database is corrupted, the vxconfigd(1M) daemon may core dump with the following stack trace:
  ddl_change_dmpnode_state ()
  ddl_data_corruption_msgs ()
  ddl_reconfigure_all ()
  ddl_find_devices_in_system ()
  find_devices_in_system ()
  req_change_state ()
  request_loop ()
  main ()

DESCRIPTION:
The issue is observed because the corrupted DMP database is not properly destroyed.

RESOLUTION:
The code is modified to remove the corrupted DMP database.

* 3800396 (Tracking ID: 3749557)

SYMPTOM:
The system hangs and becomes unresponsive because of heavy memory consumption by 
VxVM.

DESCRIPTION:
In the Dirty Region Logging (DRL) update code path, an erroneous condition was
present that led to an infinite loop that keeps consuming memory. This leads
to consumption of large amounts of memory, making the system unresponsive.

RESOLUTION:
The code has been fixed to avoid the infinite loop, hence preventing the hang
caused by high memory usage.

* 3800449 (Tracking ID: 3726110)

SYMPTOM:
On systems with high number of CPU's, Dynamic Multipathing (DMP) devices may perform 
considerably slower than OS device 
paths.

DESCRIPTION:
In configurations with a high number of CPUs, the I/O statistics related functionality
in DMP takes more CPU time because DMP statistics are collected on a per-CPU basis.
This statistics collection happens in the DMP I/O code path, hence it reduces the I/O
performance. Because of this, DMP devices perform slower than OS device paths.

RESOLUTION:
Code changes are made to remove some of the statistics collection functionality from
the DMP I/O code path. Along with this, the following tunables need to be turned off. 
1. Turn off idle LUN probing. 
# vxdmpadm settune dmp_probe_idle_lun=off
2. Turn off the statistics gathering functionality.  
# vxdmpadm iostat stop

Notes: 
1. Please apply this patch if the system configuration has a large number of CPUs and
if DMP is performing considerably slower than OS device paths. For normal
systems, this issue is not applicable.

* 3800452 (Tracking ID: 3437852)

SYMPTOM:
The system panics when  Symantec Replicator Option goes to PASSTHRU
mode. Panic stack trace might look like:

vol_rp_halt()
vol_rp_state_trans()
vol_rv_replica_reconfigure()
vol_rv_error_handle()
vol_rv_errorhandler_callback()
vol_klog_start()
voliod_iohandle()
voliod_loop()

DESCRIPTION:
When the Storage Replicator Log (SRL) gets faulted for any reason, VVR
goes into the PASSTHRU mode. At this time, a few updates are erroneously freed.
When these updates are accessed later during processing, access to these
updates results in a panic because the updates are already freed.

RESOLUTION:
The code changes have been made not to free the updates erroneously.

* 3800788 (Tracking ID: 3648719)

SYMPTOM:
The server panics with the following stack trace while adding or removing LUNs or HBAs: 
dmp_decode_add_path()
dmp_decipher_instructions()
dmp_process_instruction_buffer()
dmp_reconfigure_db()
gendmpioctl()
vxdmpioctl()

DESCRIPTION:
While deleting a dmpnode, Dynamic Multi-Pathing (DMP) releases the memory associated with the dmpnode structure. 
In case the dmpnode doesn't get deleted for some reason, and if any other tasks access the freed memory of this dmpnode, then the server panics.

RESOLUTION:
The code is modified to prevent tasks from accessing the memory that is freed when the dmpnode is deleted. The change also fixes a memory leak issue in the buffer allocation code path.

* 3801225 (Tracking ID: 3662392)

SYMPTOM:
In the CVM environment, if I/Os are getting executed on slave node, corruption 
can happen when the vxdisk resize(1M) command is executing on the master 
node.

DESCRIPTION:
During the first stage of the resize transaction, the master node re-adjusts the 
disk offsets and public/private partition device numbers.
On a slave node, the public/private partition device numbers are not adjusted 
properly. Because of this, the partition starting offset is added twice, 
which causes the corruption. The window during which the public/private 
partition device numbers are adjusted is small; corruption is observed only 
if I/O occurs during this window. 
After the resize operation completes its execution, no further corruption 
happens.

RESOLUTION:
The code has been changed to add partition starting offset properly to an I/O 
on slave node during execution of a resize command.

* 3805938 (Tracking ID: 3790136)

SYMPTOM:
A file system hang can sometimes be observed due to I/Os hung in the DRL.

DESCRIPTION:
Some I/Os might hang in the DRL of a mirrored volume due to incorrect 
calculation of the outstanding I/Os on the volume and the number of active I/Os 
currently in progress on the DRL. The value of the outstanding I/Os on the 
volume can get modified incorrectly, preventing the I/Os on the DRL from 
progressing further, which in turn results in a hang-like scenario.

RESOLUTION:
Code changes have been done to avoid incorrect modification of value of 
outstanding IO's on volume and prevent the hang.

* 3806808 (Tracking ID: 3645370)

SYMPTOM:
After running the vxevac command, if the user tries to rollback or commit the evacuation for a disk containing DRL plex, the action fails with the following errors:

/etc/vx/bin/vxevac -g testdg  commit testdg02 testdg03
VxVM vxsd ERROR V-5-1-10127 deleting plex %1:
        Record is associated
VxVM vxassist ERROR V-5-1-324 fsgen/vxsd killed by signal 11, core dumped
VxVM vxassist ERROR V-5-1-12178 Could not commit subdisk testdg02-01 in 
volume testvol
VxVM vxevac ERROR V-5-2-3537 Aborting disk evacuation

/etc/vx/bin/vxevac -g testdg rollback testdg02 testdg03
VxVM vxsd ERROR V-5-1-10127 deleting plex %1:
        Record is associated
VxVM vxassist ERROR V-5-1-324 fsgen/vxsd killed by signal 11, core dumped
VxVM vxassist ERROR V-5-1-12178 Could not rollback subdisk testdg02-01 in 
volume
testvol
VxVM vxevac ERROR V-5-2-3537 Aborting disk evacuation

DESCRIPTION:
When the user uses the vxevac command, new plexes are created on the target disks. Later,  during the commit or roll back operation, VxVM deletes the plexes on the source or the target disks. 
For deleting a plex, VxVM should delete its sub disks first, otherwise the plex deletion fails with the following error message:
VxVM vxsd ERROR V-5-1-10127 deleting plex %1:
        Record is associated 
The error is displayed because the code does not handle the deletion of subdisks of plexes marked for DRL (dirty region logging) correctly.

RESOLUTION:
The code is modified to handle evacuation of disks with DRL plexes correctly.

* 3807761 (Tracking ID: 3729078)

SYMPTOM:
In a VVR environment, a panic may occur after SF (Storage Foundation) patch 
installation or uninstallation on the secondary site.

DESCRIPTION:
The VXIO kernel reset invoked by SF patch installation removes all disk group 
objects that do not have the preserve flag set. Because the preserve flag overlaps 
with the RVG (Replicated Volume Group) logging flag, the RVG object is not removed, 
but its rlink object is removed, resulting in a system panic when VVR starts.

RESOLUTION:
Code changes have been made to fix this issue.

* 3816233 (Tracking ID: 3686698)

SYMPTOM:
The vxconfigd daemon hangs due to a deadlock between two threads.

DESCRIPTION:
Two threads were waiting for the same lock, causing a deadlock between 
them. This blocks all vx commands. 
The untimeout function does not return until the pending callback (which 
is set through the timeout function) is cancelled or the pending callback has 
completed its execution (if it has already started). Therefore, locks acquired 
by the callback routine should not be held across a call to the untimeout 
routine, or a deadlock may result.

Thread 1: 
    untimeout_generic()   
    untimeout()
    voldio()
    volsioctl_real()
    fop_ioctl()
    ioctl()
    syscall_trap32()
 
Thread 2:
    mutex_vector_enter()
    voldsio_timeout()
    callout_list_expire()
    callout_expire()
    callout_execute()
    taskq_thread()
    thread_start()

RESOLUTION:
Code changes have been made to call untimeout outside the lock 
taken by callback handler.

* 3826918 (Tracking ID: 3819670)

SYMPTOM:
When running smartmove with "vxevac", if the user puts the command into the 
background by typing Ctrl-Z and then the bg command, the data movement is 
terminated.

DESCRIPTION:
When data movement is initiated from a user command like "vxevac", the data 
movement is submitted as a task in the kernel, and the select() primitive is used 
on the task file descriptor to wait for task completion events. 
However, when Ctrl-Z plus bg is typed, the select() returns -1 with errno EINTR, 
which the code logic interprets as a user termination action. Hence the data 
movement is terminated. The correct behavior is to retry the select() and keep 
waiting for task completion events.

RESOLUTION:
Code changes have been made so that when select() returns with errno EINTR, the 
code checks whether the task is finished; if not finished, the select() is retried.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch vm-aix-Patch-6.0.5.300.tar.gz to /tmp
2. Untar vm-aix-Patch-6.0.5.300.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/vm-aix-Patch-6.0.5.300.tar.gz
    # tar xf /tmp/vm-aix-Patch-6.0.5.300.tar
3. Install the hotfix
    # pwd 
    /tmp/hf
    # ./installVM605P3 [<host1> <host2>...]

You can also install this patch together with the 6.0.1 GA release and the 6.0.5 patch release:
    # ./installVM605P3 -base_path [<601 path>] -mr_path [<605 path>] [<host1> <host2>...]
where -mr_path should point to the 6.0.5 image directory and -base_path to the 6.0.1 image directory.

Install the patch manually:
--------------------------
If the currently installed VRTSvxvm is below 6.0.500.0 level,
upgrade VRTSvxvm to 6.0.500.0 level before installing this patch.
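
For example, the currently installed VRTSvxvm level can be checked with the standard AIX command (shown here for reference):
    # lslpp -l VRTSvxvm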

AIX maintenance levels and APARs can be downloaded from the IBM web site:

 http://techsupport.services.ibm.com

1. Since the patch process will configure the new kernel extensions,
        a) Stop I/Os to all the VxVM volumes.
        b) Ensure that no VxVM volumes are in use or open or mounted before starting the installation procedure.
        c) Stop applications using any VxVM volumes.

2. Check whether root support or DMP native support is enabled. If it is enabled, it will be retained after patch upgrade.

# vxdmpadm gettune dmp_native_support


If the current value is "on", DMP native support is enabled on this machine.

# vxdmpadm native list vgname=rootvg

If the output is a list of hdisks, root support is enabled on this machine.

3.
a. Before applying this VxVM 6.0.500.300 patch, stop the VEA Server's vxsvc process:
     # /opt/VRTSob/bin/vxsvcctrl stop

b. To apply this patch, use the following command:
      # installp -ag -d ./VRTSvxvm.bff VRTSvxvm

c. To apply and commit this patch, use the following command:
     # installp -acg -d ./VRTSvxvm.bff VRTSvxvm
NOTE: Please refer to the installp(1M) man page for a clear understanding of the APPLY and COMMIT states of the package/patch.
d. Reboot the system to complete the patch upgrade.
     # reboot

e. Confirm that the point patch is installed:
# lslpp -hac VRTSvxvm | tail -1
f. If root support or DMP native support was enabled in step 2, verify whether it is retained after completing the patch upgrade:
# vxdmpadm gettune dmp_native_support
# vxdmpadm native list vgname=rootvg


REMOVING THE PATCH
------------------
Please refer to the Release Notes for uninstall instructions.
1. Check whether root support or DMP native support is enabled or not:

      # vxdmpadm gettune dmp_native_support

If the current value is "on", DMP native support is enabled on this machine.

      # vxdmpadm native list vgname=rootvg

If the output is a list of hdisks, root support is enabled on this machine.

If disabled: go to step 3.
If enabled: go to step 2.

2. If root support or DMP native support is enabled:

        a. It is essential to disable DMP native support.
        Run the following command to disable DMP native support as well as root support
              # vxdmpadm settune dmp_native_support=off

        b. If only root support is enabled, run the following command to disable root support
              # vxdmpadm native disable vgname=rootvg

        c. Reboot the system
              # reboot

3.
   a. Before backing out patch, stop the VEA server's vxsvc process:
              # /opt/VRTSob/bin/vxsvcctrl stop

    b. To reject the patch if it is in "APPLIED" state, use the following command and re-enable DMP support
              # installp -r VRTSvxvm 6.0.500.300

    c.   # reboot


SPECIAL INSTRUCTIONS
--------------------
VRTSvxvm 6.0.5.100 and VRTSvxvm 6.0.5.200 are Linux-specific patches. This is the first patch for AIX on top of VRTSvxvm 6.0.5.


OTHERS
------
NONE