Basic information
Release type: Hot Fix
Release date: 2011-06-09
OS update support: None
Technote: None
Documentation: None
Popularity: 958 viewed    downloaded
Download size: 4.64 MB
Checksum: 592145686

 Applies to one or more of the following products:
Storage Foundation 5.0MP4 On RHEL4 x86-64
Storage Foundation Cluster File System 5.0MP4 On RHEL4 x86-64
Storage Foundation Cluster File System for Oracle RAC 5.0MP4 On RHEL4 x86-64
Storage Foundation for DB2 5.0MP4 On RHEL4 x86-64
Storage Foundation for Oracle 5.0MP4 On RHEL4 x86-64
Storage Foundation for Oracle RAC 5.0MP4 On RHEL4 x86-64
Storage Foundation HA 5.0MP4 On RHEL4 x86-64
Volume Manager 5.0MP4 On RHEL4 x86-64
Volume Replicator 5.0MP4 On RHEL4 x86-64

 Obsolete patches, incompatibilities, superseded patches, or other requirements:

This patch requires the following patch:
sfha-rhel4_x86_64-5.0MP4RP1 (released 2011-05-16)

 Fixes the following incidents:
1862427, 2076328, 2125708, 2192613, 2232056, 2317714, 2326872, 2327594, 2328343, 2353399, 2353420, 2357819, 2367680, 2367794, 2369790, 2369860, 2374225, 2374724, 2380442, 2383159, 2385695, 2390433, 2397636, 2398349

 Patch ID:

Readme file
                          * * * READ ME * * *
             * * * Veritas Volume Manager 5.0 MP4 RP1 * * *
                         * * * Hot Fix 1 * * *
                         Patch Date: 2011.06.09

This document provides the following information:


Veritas Volume Manager 5.0 MP4 RP1 Hot Fix 1


   * Veritas Volume Manager 5.0 MP4 RP1
   * Veritas Storage Foundation for Oracle RAC 5.0 MP4 RP1
   * Veritas Storage Foundation Cluster File System 5.0 MP4 RP1
   * Veritas Volume Replicator 5.0 MP4 RP1
   * Veritas Storage Foundation 5.0 MP4 RP1
   * Veritas Storage Foundation High Availability 5.0 MP4 RP1
   * Veritas Storage Foundation for DB2 5.0 MP4 RP1
   * Veritas Storage Foundation for Oracle 5.0 MP4 RP1
   * Veritas Storage Foundation Cluster File System for Oracle RAC 5.0 MP4 RP1

RHEL4 x86-64
RHEL5 x86-64
RHEL5 ppc64
SLES9 x86-64
SLES10 x86-64
SLES10 ppc64
SLES11 x86-64
SLES11 ppc64

This patch fixes the following Symantec incidents:

Patch ID:

* 1862427 (Tracking ID: 1507567)

I/Os hang on the master and slave nodes when all storage connectivity is lost from the
master node, with the Local Detach Policy set and FMR enabled on the mirrored volumes.

This problem is caused by the LDP cluster-wide check running on the DCO volume
itself because it is a mirror. This needs to be fixed because fmr3 is the default
log type. Essentially, the master cannot allow local faults on the DCO volume;
this needs to be treated as a global failure.

The fix prevents the cluster-wide check from running if the volume is a DCO
volume. With this change, the I/Os complete as expected on the slave nodes and
fail on the master as expected.

* 2076328 (Tracking ID: 2052203)

Immediately after vxconfigd is restarted on the master, the error
"Slave not communicating" is seen when commands that result in transactions are
executed on the master node.

All slaves need to re-establish their connection with the master vxconfigd after
it is restarted. If any command is executed before this happens, the above error
is returned.

The fix is to retry the command (as this is a retryable error) with some delay.
During this time, the slaves are expected to re-establish the connection with
the master.

* 2125708 (Tracking ID: 2218470)

When "service --status-all" or "service vxvm-recover status" is executed,
duplicate instances of the Veritas Volume Manager (VxVM) daemons (vxattachd,
vxcached, vxrelocd, vxvvrsecdgd and vxconfigbackupd) are invoked.

The startup scripts of VxVM for Linux are not Linux Standard Base (LSB)
compliant. Hence, executing the following command starts up a new instance.

# vxvm-recover status

The VxVM startup scripts are modified to be LSB compliant.

* 2192613 (Tracking ID: 2192612)

The HP XP1024 ASL claims some LUNs that belong to the HP EVA array.

The HP XP ASL (libvxxp1281024) claims LUNs based on the LUN serial number,
ignoring the PID check. This leads to claiming some LUNs that do not belong
to the XP array.

The HP XP ASL is modified to verify the PID and claim a LUN only if the check passes.

* 2232056 (Tracking ID: 2230377)

Differences-based synchronization (diff sync) fails for volume/RVG sizes
greater than 1 TB.

The diff sync fails to calculate the correct RVG/volume size for objects
over 1 TB, causing the diff sync to loop or fail.

Hotfix binaries of vxrsyncd have been generated with the code fixes.

* 2317714 (Tracking ID: 2317703)

When the vxesd daemon is invoked by device attach and removal operations in a loop,
it leaves open file descriptors with the vxconfigd daemon.

The issue is caused by multiple vxesd daemon threads trying to establish
contact with the vxconfigd daemon at the same time and losing track of the
file descriptor through which the communication channel was established.

The fix is to maintain a single file descriptor with a thread-safe reference
counter, so that the various threads of vxesd do not establish multiple
communication channels between vxesd and vxconfigd.

* 2326872 (Tracking ID: 1676771)

In a CVM (Cluster Volume Manager) environment, I/O from a slave node on
space-optimized snapshots can hang if the storage of the underlying cache
volume is disconnected.

I/Os to space-optimized volumes require new translations on the cache object.
In CVM, the master node takes care of allocating all cache object translations.
When storage gets disconnected while a translation is being allocated on the
master node, the error processing code incorrectly updates the pending iocount
and other resource information, which leads to an I/O hang on the slave node.

The error processing code is modified to update the pending iocount and other
resources correctly.

* 2327594 (Tracking ID: 2233611)

The Hitachi ASL (libvxhdsusp) does not claim the VSP (R700) array. The ASL also
performs SCSI inquiries on pages 0xE0 and 0xE3 without checking whether these
pages are supported by the array, which can lead to warning messages in the
array logs.

The Hitachi ASL (libvxhdsusp) needs to be modified to recognize the VSP (R700)
array, and to inquire on SCSI pages 0xE0 and 0xE3 only if these pages are
supported by the array.

Modified the ASL as follows:
    -	Claim the Hitachi VSP array, which is capable of Thin Provisioning.
    -	Renamed the arrays from "TagmaStore" to "Hitachi".
    -	Inquire on SCSI pages 0xE0 and 0xE3 only if these pages are supported
by the array.

* 2328343 (Tracking ID: 2147922)

Thin reclamation for the EMC VMAX array (flarecode level 5875) is not supported
by the ASL.

Thin reclamation support is not present in the EMC VMAX ASL, libvxemc.sl
(ASL_VERSION = vm-5.0-rev-3).

Thin reclamation support for the EMC VMAX array (flarecode level 5875 and later)
is added to the ASL, libvxemc.sl. The new ASL_VERSION is "vm-5.0-rev-4".

* 2353399 (Tracking ID: 1969526)

System panic with the following stack trace:


When a hung I/O comes back, the iodone routine tries to find the disk on which
the disk I/O was issued. If the disk is being moved to another disk group at the
same time, a panic occurs on the attempt to access the NULL disk.

Modified the code so that a disk is not moved from one disk group to another if
there are any pending I/Os previously issued on the disk.

* 2353420 (Tracking ID: 2334534)

In a CVM (Cluster Volume Manager) environment, a node (SLAVE) joining the
cluster gets stuck, leading to an unending join hang unless the join operation
is stopped on the joining node (SLAVE) using the command '/opt/VRTS/bin/vxclustadm
stopnode'. While the CVM join is hung in user land (also called the
vxconfigd-level join), on the CVM MASTER node, vxconfigd (the Volume Manager
configuration daemon) does not respond to any VxVM command that communicates
with the vxconfigd process.

When vxconfigd level CVM join is hung in user-land, "vxdctl -c mode" on joining
node (SLAVE) displays an output such as:
     bash-3.00#  vxdctl -c mode
     mode: enabled: cluster active - SLAVE
     master: mtvat1000-c1d
     state: joining
     reconfig: vxconfigd in join

As part of a CVM node join, every node in the cluster first updates the current
CVM membership information in the kernel (membership information that can be
viewed using the command '/opt/VRTS/bin/vxclustadm nidmap') and then sends a
signal to vxconfigd in user land to use that membership when exchanging
configuration records. Since each node receives the signal (SIGIO) from the
kernel independently, the joining node's (SLAVE) vxconfigd can be ahead of the
MASTER in its execution. Thus any request coming from the joining node (SLAVE)
is denied by the MASTER with the error "VE_CLUSTER_NOJOINERS", i.e. the join
operation is not currently allowed (error number 234), because the MASTER's
vxconfigd has not yet received the updated membership from the kernel. While
responding to the joining node (SLAVE) with "VE_CLUSTER_NOJOINERS", if there is
any change in the current membership (a change in CVM node ID) as part of the
node join, the MASTER node wrongly updates the internal vxconfigd data
structure that is used to send responses to joining (SLAVE) nodes. Because of
this wrong update, when the joining node later retries its request, the
response from the master is sent to a node that does not exist in the cluster,
and no response is sent to the joining node. The joining node (SLAVE) never
gets the response from the MASTER for its request, so the CVM node join never
completes and the cluster hangs.

vxconfigd code is modified to handle the above scenario effectively:
vxconfigd on the MASTER node processes a connection request coming from the
joining node (SLAVE) only after the MASTER node gets the updated CVM
membership information from the kernel.

* 2357819 (Tracking ID: 2357798)

VVR leaks memory due to an unfreed vol_ru_update structure. The leak is very
small, but it can accumulate to a large value if VVR runs for many days.

VVR allocates an update structure for each write. If replication is up to date,
the next incoming write also creates a multi-update and adds it to the VVR
replication queue. While creating the multi-update, VVR wrongly marked the
original update with a flag indicating that the update is on the replication
queue, even though it was never added (and did not need to be added) to the
queue. The update-free routine checks this flag and, if it is set, does not
free the update, assuming it is still on the replication queue and will be
freed when removed from the queue. Since the update was never on the queue, it
is never freed and the memory leaks. The leak occurs only for the first write
after each time the rlink becomes up to date, which is why it takes many days
to leak a significant amount of memory.

Marking this flag was causing the memory leak; the marking is not required
because the update is not added to the replication queue. The fix is to remove
the marking and checking of the flag.

* 2367680 (Tracking ID: 2291226)

Data corruption can be observed on a CDS (Cross-platform Data Sharing) disk
whose capacity is more than 1 TB. The following pattern would be found in the
data region of the disk:

<DISK-IDENTIFICATION> cyl <number-of-cylinders> alt 2 hd <number-of-tracks> sec 

The CDS disk maintains a SUN VTOC in the zeroth block of the disk. This VTOC
holds the disk geometry information, such as the number of cylinders, tracks,
and sectors per track. These values are limited to a maximum of 65535 by the
design of SUN's VTOC, which limits the disk capacity to 1 TB. As per SUN's
requirements, a few backup VTOC labels have to be maintained on the last track
of the disk.

VxVM 5.0 MP3 RP3 allows a CDS disk to be set up on a disk with capacity greater
than 1 TB. The data region of the CDS disk then spans more than 1 TB, utilizing
all the accessible cylinders of the disk. As mentioned above, the VTOC labels
are written at the zeroth block and on the last track, treating the disk
capacity as 1 TB. The backup labels therefore fall into the data region of the
CDS disk, causing the data corruption.

Writing of the backup labels is suppressed to prevent the data corruption.

* 2367794 (Tracking ID: 2361295)

In a CVM (Cluster Volume Manager) environment, CVM reconfiguration gets stuck
at the vxconfigd (VxVM configuration daemon) level join in user land. When the
vxconfigd-level CVM join is hung in user land, "vxdctl -c mode" on the joining
node (SLAVE) displays output such as:
                bash-3.00#  vxdctl -c mode
                mode: enabled: cluster active - SLAVE
                master: mtvat1000-c1d 
                state: joining
                reconfig: vxconfigd in join

The vxconfigd hang occurs when a node (SLAVE) is trying to join the cluster and
vxconfigd on the MASTER node is either dead or restarted at approximately the
same time. As part of the user-level vxconfigd join in CVM reconfiguration,
before sending a connection request to the MASTER's vxconfigd, the SLAVE
vxconfigd checks whether vxconfigd on the MASTER node is available to accept
the request; if not, the SLAVE does not send the connection request. Once
vxconfigd on the MASTER node comes back enabled or becomes active, the SLAVE
vxconfigd does not resend the connection request to the MASTER vxconfigd,
which leads to the vxconfigd-level join hang.

vxconfigd code is modified to resend the connection request from the joining
node (SLAVE) to the MASTER's vxconfigd when vxconfigd on the MASTER node comes
back enabled.

* 2369790 (Tracking ID: 2369786)

On a VVR Secondary cluster, if the SRL disk goes bad, vxconfigd may hang in the
transaction code path.

In VVR shared disk group environments, any error is handled cluster-wide. On
the VVR Secondary, if the SRL disk goes bad due to a temporary or actual disk
failure, cluster-wide error handling starts. Error handling requires
serialization; in some cases serialization was not done, which caused error
handling to enter a dead loop, hence the hang.

Making sure the I/O is always serialized during error handling on the VVR
Secondary resolves this issue.

* 2369860 (Tracking ID: 1951062)

Inconsistent path information leads to disabling of the DMP node, which may
result in disabling the file system residing on the DMP node.


The following command outputs show the inconsistency. The DMP node
emc_clariion1_121 is shown as "DISABLED" even though two of its secondary
paths are in the "active enabled" state.

(1) The following command displays the number of sub-paths of given DMP node
    and their respective states. This command indicates that the DMP node has two
    active paths and two failed paths.

#vxdmpadm getsubpaths dmpnodename=emc_clariion1_121s2
c2t5006016746E003AEd42s2 ENABLED(A) SECONDARY    c2         EMC_CLARiiON 
emc_clariion1    -
c2t5006016F46E003AEd42s2 DISABLED   PRIMARY      c2         EMC_CLARiiON 
emc_clariion1    -
c3t5006016146E403AEd42s2 ENABLED(A) SECONDARY    c3         EMC_CLARiiON 
emc_clariion1    -
c3t5006016946E403AEd42s2 DISABLED   PRIMARY      c3         EMC_CLARiiON 
emc_clariion1    -

(2) The following command displays the state of the given DMP node. This
command indicates that the DMP node is in the "DISABLED" state, as all paths
are considered "failed".
# vxdmpadm getlungroup dmpnodename=emc_clariion1_121
emc_clariion1_121    DISABLED      EMC_CLARiiON  4      0     4     

The inconsistency shown in the above example is an after-effect of a past
event in which all the paths of a DMP node failed at once. The DMP driver is
multi-threaded. In error conditions, DMP changes the state of a path from
"active enabled" to "failed disabled". A DMP thread, while changing the state
of the last path of the DMP node, attempts to revive the remaining paths.
Before proceeding to revive the remaining paths, the DMP thread relinquishes
previously held locks, which is mandatory. After reviving the paths, the same
DMP thread reacquires the locks to update the active and failed counts. While
the locks are released, any other DMP thread can mark the same path as failed
and update the active and failed path counts. The earlier thread, after
completing the path revival, acquires the locks and updates the state of the
same path again. Because of this race condition, the failed and active counts
can be updated twice, resulting in the inconsistency.

When DMP reacquires the lock after path restoration, before updating the
unusable and healthy path counts, it now ensures that no other thread has
already marked the same path as unusable.

* 2374225 (Tracking ID: 2355706)

In a CVM environment, when the cache object is full for the space-optimized
snapshots, an I/O hang may occur for the volume when the LDP (local detach
policy) is set.

As part of LDP processing, the master node broadcasts the CVM_MSG_CHECK_REPAIR
request. The slave node did not respond correctly to this request when the
volume was detached. As the master node did not get the response, the hang
could occur.

When a slave node gets CVM_MSG_CHECK_REPAIR, it now responds to the master
even if the volume is detached.

* 2374724 (Tracking ID: 2374459)

After an I/O failure, I/O hangs while trying to access data stored in a cache
object.

When a cache object experiences an I/O failure, error recovery is initiated.
As part of the error recovery, the cache object waits for all outstanding
I/Os to complete. If an I/O is waiting in the cache object's replay queue, but
the I/O it is waiting for has initiated the error processing, the I/O in the
replay queue can hang indefinitely.

When an I/O arrives on a cache object, before completing the error handling
for that I/O, the code now checks that no one is waiting for that I/O in the
cache object's replay queue.

* 2380442 (Tracking ID: 2385680)

The vol_rv_async_childdone() panic occurred because of a corrupted pripendingq.

The pripendingq is always corrupted in this panic: the head entry is freed but
not removed from the queue. In the mdship_srv_done code, for the error
condition, the update is removed from the pripendingq only if the next or prev
pointers of the updateq are non-NULL. As a result, the head pointer is not
removed in the abort scenario, and the entry is freed without being deleted
from the queue.

The prev and next checks are removed in all places. The abort case is also
handled carefully for the following conditions:

1) Abort of the logendq due to a slave node panic, i.e. the update entry
exists but the update is not removed from the pripendingq.

2) vol_kmsg_eagain type failures, i.e. the update exists but has been removed
from the pripendingq.

3) Abort very early in mdship_sio_start(), i.e. the update is allocated but is
not on the pripendingq.

* 2383159 (Tracking ID: 2383158)

A panic occurs in vol_rv_mdship_srv_done() because an SIO is freed while
holding an invalid node pointer.

vol_rv_mdship_srv_done() panics when referencing wrsio->wrsrv_node because
wrsrv_node holds an invalid pointer. It is also observed that the wrsio has
been freed or allocated to a different SIO. Looking closely,
vol_rv_check_wrswaitq() is called at every done of an SIO; it looks into the
wait queue and releases every SIO that has the
RV_WRSHIP_SRV_SIO_FLAG_LOGEND_DONE flag set. In vol_rv_mdship_srv_done(), this
flag is set and more operations are then performed on the wrsrv. During this
window, another SIO that completes with DONE calls vol_rv_check_wrswaitq(),
deleting both its own SIO and any other SIO with the
RV_WRSHIP_SRV_SIO_FLAG_LOGEND_DONE flag set. This deletes an SIO that is still
in flight, causing the panic.

The flag is now set just before the call to vol_rv_check_wrswaitq(), at the
end of the SIO done routine, so that other SIOs cannot race and delete the
currently running one.

* 2385695 (Tracking ID: 2385694)

Issuing I/Os to both the master and the slave and subsequently rebooting the
slave node leads to all the I/Os hanging on the master.

On a slave panic, the master node hangs while sending the I/Os. Core analysis
revealed that the I/Os are stuck in volilock, waiting for volilock_release()
on one of the SIOs (staged I/Os within VxVM). The SIO in il_cb_private has
been freed, so the ilock is never released and the I/Os pending behind it stay
queued/hung. The observation is that the SIO is freed/done'd without invoking
volilock_release().

In one place in the vol_rv_preprocess() code, the rvsio is freed without
checking for a possible rvsio->rvsio_ilock allocated for it. The fix is to
check whether the rvsio_ilock is present and release the lock before deleting
the SIO.

* 2390433 (Tracking ID: 2390431)

In a Disaster Recovery environment, when the DCM (Data Change Map) is active,
the system panics during an SRL (Storage Replicator Log)/DCM flush due to a
missing parent on one of the DCMs in an RVG (Replicated Volume Group).

The DCM flush happens on every log update, and its frequency depends on the
I/O load. If the I/O load is high, the DCM flush happens very often, and if
there are more volumes in the RVG, the frequency is higher still. Every DCM
flush triggers a DCM flush on all the volumes in the RVG. If there are 50
volumes in an RVG, each DCM flush creates 50 children controlled by one parent
SIO. Once all 50 children are done, the parent SIO releases itself for the
next flush. When the DCM flush of each child completes, the child detaches
itself from the parent by setting its parent field to NULL. It can happen that
the 49th child is done but, before it detaches from the parent, the 50th child
completes and releases the parent SIO for the next DCM flush. Before the 49th
child detaches, the new DCM flush starts on the same 50th child. After the
next flush has started, the 49th child of the previous flush detaches itself
from the parent and, since it is a static SIO, it indirectly resets the parent
field of the new flush. In addition, in a few scenarios the lock is not
obtained before modifying the SIO state field.

The child now detaches from the parent before the children count is reduced.
This ensures that the new flush does not race with the previous flush. The
state field is protected with the required lock in all scenarios.

* 2397636 (Tracking ID: 2165394)

If a cloned copy of a disk group and a destroyed disk group exist on the
system, an import operation imports the destroyed disk group instead of the
cloned one. For example, consider a system with disk group dg containing disk
disk01. Disk disk01 is cloned to disk02. When disk group dg containing disk01
is destroyed and disk group dg is then imported, VxVM should import dg with
the cloned disk, i.e. disk02. However, it imports the disk group dg with disk01.

After a disk group is destroyed, if a cloned copy of the same disk group
exists on the system, a subsequent disk group import operation wrongly
identifies the disks to import, so the destroyed disk group gets imported.

The disk group import code is modified to identify the correct disk group when
a cloned copy of the destroyed disk group exists.

* 2398349 (Tracking ID: 1977253)

Case 1: In a Cluster Volume Manager environment, the slave does not show
the "clone_disk" flag in "vxdisk list" output after the disk group is imported
with clone disks, but the master shows the flag.

Case 2: A VxVM disk group import with clone disks can happen in two scenarios:

a) with the explicit option "useclonedev=on";
b) if only clone disks exist, the "vxdg import" command imports the disk group
using the clone disks, without any user option.

In scenario (b), the "vxdg import" command does not set the "clone_disk" flag.

Case 1: Updating the UDIDs (unique disk IDs) on the disk and setting
the "clone_disk" flag are not done atomically. The master updates the UDID and
sends the disk information to the slaves; at that point the slaves are not
aware of the clone disks. The master updates the disk with the "clone_disk"
flag only later, so the slave does not show the clone disk flag.

Case 2: The "clone_disk" flag is updated only if the user gives
the "useclonedev=on" option. If the import implicitly decides to use clone
disks (i.e. only clone disks exist), the flag is NOT set.

Updating the UDID on the disk and setting the "clone_disk" flag are now done
in the same routine, ensuring atomicity.

Case 1: The master updates the disk with the "clone_disk" flag before sending
the disk information to the slave.

Case 2: The "clone_disk" flag is set based on the UDID change, so even when
"vxdg import" implicitly imports using clone disks, it sets the "clone_disk"
flag.

# rpm -Uhv VRTSvxvm-common-
# rpm -Uhv VRTSvxvm-platform-

# rpm -e  <rpm-name>