README VERSION		: 1
README Creation Date	: 2011-09-23
PATCH-ID 		: PHCO_42245 
PATCH NAME		: VRTSvxvm 5.1SP1RP1
BASE PACKAGE NAME	: VRTSvxvm
BASE PACKAGE VERSION	: 5.1.100.000 / 5.1.100.001
OBSOLETE PATCHES	: NONE
SUPERSEDED PATCHES	: NONE
REQUIRED PATCHES	: PHKL_42246
INCOMPATIBLE PATCHES	: NONE
SUPPORTED PADV		: hpux1131
(P-Platform , A-Architecture , D-Distribution , V-Version)
PATCH CATEGORY		:  CORE ,  CORRUPTION ,  HANG ,  MEMORYLEAK ,  PERFORMANCE
REBOOT REQUIRED		: NO


 Patch Id::PHCO_42245

 * Incident no::2163809	 Tracking ID ::2151894

 Symptom::Internal testing utility volassert prints a message:

Volume TCv1-548914: recover_offset=0, expected 1024

 Description::The behavior of recover_offset was changed in another incident so that it
is reset to zero after a volume is started. This works for normal volumes but not for
raid5 volumes.

 Resolution::recover_offset is now set to the end of the volume after grow/init operations.

 * Incident no::2198041	 Tracking ID ::2196918

 Symptom::When creating a space-optimized snapshot by specifying the cache-object size
either as a percentage of the volume size or as an absolute size, the snapshot
creation can fail with an error similar to the following:
"VxVM vxassist ERROR V-5-1-10127 creating volume snap-dvol2-CV01:
        Volume or log length violates disk group alignment"

 Description::VxVM expects all virtual storage objects to have size aligned to a
value which is set diskgroup-wide. One can get this value with:
# vxdg list testdg|grep alignment
alignment: 8192 (bytes)

When the cache size is specified as a percentage, the computed value might not be
aligned to the diskgroup alignment. If it is not aligned, the creation of the
cache-volume can fail with the error message shown above.

 Resolution::After computing the cache-size from specified percentage value, it
is aligned up to the diskgroup alignment value before trying to create the
cache-volume.
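
The rounding described in this resolution can be sketched as follows. This is
only an illustration of "align up to a multiple", not the vxassist code; the
function names and the sample sizes are hypothetical, and the only value taken
from the text is the 8192-byte alignment.

```python
def align_up(size, alignment):
    """Round size up to the next multiple of alignment (same unit for both)."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return ((size + alignment - 1) // alignment) * alignment


def cache_size_from_percent(volume_size, percent, alignment):
    """Compute a cache size as a percentage of the volume size, then align it up."""
    raw = volume_size * percent // 100
    return align_up(raw, alignment)


# Hypothetical numbers: a 1,000,000-byte volume, a 10% cache, 8192-byte alignment.
# 10% is 100,000 bytes, which is not a multiple of 8192, so it is rounded up.
print(cache_size_from_percent(1_000_000, 10, 8192))  # prints 106496
```

A size that is already aligned is returned unchanged, so the rounding never
shrinks the requested cache.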

 * Incident no::2204146	 Tracking ID ::2200670

 Symptom::Some disks are left detached and not recovered by vxattachd.

 Description::If the shared disk group is not imported, or the node is not part of the
cluster, when storage connectivity to the failed node is restored, the vxattachd
daemon is not notified about the restored connectivity and does not
trigger a reattach. Even if the disk group is later imported or the node later
joins the CVM cluster, the disks are not automatically reattached.

 Resolution::i) Missing events for a deported diskgroup: The fix listens for the
import event of the diskgroup and triggers brute-force recovery for that
specific diskgroup.
ii) Parallel recovery of volumes from the same disk: vxrecover automatically
serializes the recovery of objects that are from the same disk to avoid
back-and-forth head movements. An option is also provided in vxattachd and
vxrecover to control the number of parallel recoveries that can happen for
objects from the same disk.
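
Item ii) can be pictured as a scheduling rule: at most a fixed number of
objects from any one disk are recovered at the same time. The sketch below is
an assumption-laden illustration, not vxrecover's algorithm; the function name,
the per_disk_limit parameter, and the volume and disk names are hypothetical.

```python
from collections import defaultdict


def plan_recovery(objects, per_disk_limit=1):
    """Group objects into waves that can be recovered in parallel.

    objects: list of (object_name, disk_name) pairs.
    Each wave holds at most per_disk_limit objects from any single disk.
    """
    if per_disk_limit < 1:
        raise ValueError("per_disk_limit must be at least 1")
    by_disk = defaultdict(list)
    for name, disk in objects:
        by_disk[disk].append(name)

    waves = []
    start = 0
    while True:
        wave = []
        for names in by_disk.values():
            wave.extend(names[start:start + per_disk_limit])
        if not wave:
            break
        waves.append(wave)
        start += per_disk_limit
    return waves


# vol1 and vol2 share disk d1, so with a limit of 1 they land in different waves.
print(plan_recovery([("vol1", "d1"), ("vol2", "d1"), ("vol3", "d2")]))
# prints [['vol1', 'vol3'], ['vol2']]
```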

 * Incident no::2214184	 Tracking ID ::2202710

 Symptom::Transactions on Rlink are not allowed during SRL to DCM flush.

 Description::The present implementation does not allow an rlink transaction to go
through while an SRL to DCM flush is in progress. When the SRL overflows, VVR starts
reading from the SRL and marks the dirty regions in the corresponding DCMs of the
data volumes; this is called an SRL to DCM flush. The time to complete the flush
depends on the SRL size and can range from minutes to many hours. If the user
initiates any transaction on the rlink during this time, it hangs until the SRL
flush completes.

 Resolution::The code now allows rlink transactions during an SRL flush. The fix stops
the SRL flush so that the transaction can go ahead, and restarts the flush after
the transaction completes.

 * Incident no::2232829	 Tracking ID ::2232789

 Symptom::With NetApp metro cluster disk arrays, takeover operations (toggling of LUN
ownership within NetApp filer) can lead to IO failures on VxVM volumes.

Example of an IO error message at VxVM
VxVM vxio V-5-0-2 Subdisk disk_36-03 block 24928: Uncorrectable write error

 Description::During the takeover operation, the array fails the PGR and IO SCSI commands on
secondary paths with the following transient error codes - 0x02/0x04/0x0a
(NOT READY/LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION) or
0x02/0x04/0x01 (NOT READY/LOGICAL UNIT IS IN PROCESS OF BECOMING READY) -  
that are not handled properly within VxVM.

 Resolution::The APM now retries SCSI commands that fail with these transient errors for
the duration of the NetApp filer reconfiguration time (60 seconds) before failing
the I/Os on VxVM volumes.
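
A minimal sketch of this kind of bounded retry, assuming a hypothetical send()
callable that returns None on success or a (sense key, ASC, ASCQ) tuple on
failure. The two sense triples and the 60-second window come from the text
above; the function name, the one-second delay, and the injectable clock are
assumptions made for illustration and do not describe the APM's internals.

```python
import time

# Sense triples (key, ASC, ASCQ) that the description lists as transient.
TRANSIENT_SENSE = {
    (0x02, 0x04, 0x0A),  # NOT READY, asymmetric access state transition
    (0x02, 0x04, 0x01),  # NOT READY, in process of becoming ready
}


def issue_with_retry(send, window_secs=60.0, delay_secs=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Retry send() while it fails with a transient sense code.

    Returns None on success. Returns the sense triple if the error is not
    transient, or if it is still transient once the retry window has elapsed.
    """
    deadline = clock() + window_secs
    while True:
        sense = send()
        if sense is None or sense not in TRANSIENT_SENSE:
            return sense
        if clock() >= deadline:
            return sense
        sleep(delay_secs)
```

Passing the clock and sleep functions as parameters keeps the loop testable
without actually waiting 60 seconds.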

 * Incident no::2234292	 Tracking ID ::2152830

 Symptom::Storage admins sometimes create multiple copies/clones of the same device.
Diskgroup import fails with a non-descriptive error message when multiple
copies (clones) of the same device exist and the original device(s) are either
offline or not available.

# vxdg import mydg
VxVM vxdg ERROR V-5-1-10978 Disk group mydg: import failed: 
No valid disk found containing disk group

 Description::If the original devices are offline or unavailable, vxdg import picks
up cloned disks for import. DG import fails by design unless the clones
are tagged and the tag is specified during DG import. The import
failure is expected, but the error message is non-descriptive and
does not tell the user what corrective action to take.

 Resolution::The fix gives a descriptive error message when duplicate clones
exist during import. Details of the duplicate clones are also reported in
the syslog.

Example:

[At CLI level]
# vxdg import testdg             
VxVM vxdg ERROR V-5-1-10978 Disk group testdg: import failed:
DG import duplcate clone detected

[In syslog]
vxvm:vxconfigd: warning V-5-1-0 Disk Group import failed: Duplicate clone disks are
detected, please follow the vxdg (1M) man page to import disk group with
duplicate clone disks. Duplicate clone disks are: c2t20210002AC00065Bd0s2 :
c2t50060E800563D204d1s2  c2t50060E800563D204d0s2 : c2t50060E800563D204d1s2

 * Incident no::2241149	 Tracking ID ::2240056

 Symptom::'vxdg move/split/join' may fail during high I/O load.

 Description::During heavy I/O load, the 'dg move' transaction may fail because of an
open/close assertion, and it is then retried. Because the retry limit is set to 30,
'dg move' fails if the retries reach the limit.

 Resolution::The default transaction retry limit is changed to unlimited, and a new
option is introduced for 'vxdg move/split/join' to set the transaction retry limit
as follows:

vxdg [-f] [-o verify|override] [-o expand] [-o transretry=retrylimit] move 
src_diskgroup dst_diskgroup objects ...

vxdg [-f] [-o verify|override] [-o expand] [-o transretry=retrylimit] split 
src_diskgroup dst_diskgroup objects ...

vxdg [-f] [-o verify|override] [-o transretry=retrylimit] join src_diskgroup 
dst_diskgroup

 * Incident no::2247645	 Tracking ID ::2243044

 Symptom::Initialization of VxVM cdsdisk layout on a disk with size greater than or
equal to 1 TB fails on HP-UX/ia-64 platform.

 Description::During the initialization of VxVM cdsdisk layout on a disk with size greater 
than or equal to 1 TB on HP-UX/ia-64 platform, we were checking if the disk has 
an existing HP-UX LVM layout.  During this check, the i/o was performed 
incorrectly using the DMP (Dynamic Multi-Pathing Driver) paths associated with 
the existing GPT partitions on the disk.

 Resolution::This issue has been resolved by performing i/o to the whole device path during 
the initialization of VxVM cdsdisk layout on a disk with size greater than or 
equal to 1 TB.

 * Incident no::2253269	 Tracking ID ::2263317

 Symptom::vxdg(1M) man page does not clearly describe diskgroup import and destroy 
operations for the case in which original diskgroup is destroyed and cloned 
disks are present.

 Description::Diskgroup import with dgid is considered a recovery operation. Therefore,
while importing with dgid, even though the original diskgroup is destroyed, both
the original and the cloned disks are considered available disks. Hence, the
original diskgroup is imported in such a scenario.
The existing vxdg(1M) man page does not clearly describe this scenario.

 Resolution::Modified the vxdg(1M) man page to clearly describe the scenario.

 * Incident no::2256728	 Tracking ID ::2248730

 Symptom::Command hangs if "vxdg import" is called from a script with STDERR
redirected.

 Description::If a script calls "vxdg import" with STDERR redirected, the
script does not finish until DG import and recovery are finished. The pipe between
the script and vxrecover is not closed properly, which keeps the calling script
waiting for vxrecover to complete.

 Resolution::Closed STDERR in vxrecover and redirected the output to
/dev/console.

 * Incident no::2316309	 Tracking ID ::2316297

 Symptom::The following error message is printed on the console every time the system
boots:
 VxVM vxdisk ERROR V-5-1-534 Device [DEVICE NAME]: Device is in use

 Description::During system boot, while Volume Manager imports diskgroups, the vxattachd
daemon tries to online the disk. Because the disk may already be online, the
attempt to re-online it gives the following error message:
 VxVM vxdisk ERROR V-5-1-534 Device [DEVICE NAME]: Device is in use

 Resolution::The solution is to check if the disk is already in "online" state. If so, the
re-online is skipped.

 * Incident no::2323999	 Tracking ID ::2323925

 Symptom::If the rootdisk is under VxVM control and /etc/vx/reconfig.d/state.d/install-db
file exists, the following messages are observed on the console:

UX:vxfs fsck: ERROR: V-3-25742: /dev/vx/dsk/rootdg/homevol:sanity check 
failed: cannot open /dev/vx/dsk/rootdg/homevol: No such device or address
UX:vxfs fsck: ERROR: V-3-25742: /dev/vx/dsk/rootdg/optvol:sanity check failed: 
cannot open /dev/vx/dsk/rootdg/optvol: No such device or address

 Description::The vxvm-startup script checks for the
/etc/vx/reconfig.d/state.d/install-db file. If the install-db file exists on the
system, VxVM assumes that Volume Manager is not configured and does not
start the volume configuration daemon "vxconfigd". When the install-db file is
present on a VxVM rootable system, this causes the failure.

 Resolution::If install-db file exists on the system and the system is VxVM rootable, the
following warning message is displayed on the console:
"This is a VxVM rootable system.
 Volume configuration daemon could not be started due to the presence of
 /etc/vx/reconfig.d/state.d/install-db file.
 Remove the install-db file to proceed"

 * Incident no::2328219	 Tracking ID ::2253552

 Symptom::vxconfigd leaks memory while reading the default tunables related to
smartmove (a VxVM feature).

 Description::In Vxconfigd, memory allocated for default tunables related to
smartmove feature is not freed causing a memory leak.

 Resolution::The memory is released after its scope is over.

 * Incident no::2328286	 Tracking ID ::2244880

 Symptom::Initialization of VxVM cdsdisk layout on a disk with size greater than or equal 
to 1 TB fails.

 Description::During the initialization of VxVM cdsdisk layout on a disk with size greater
than or equal to 1 TB, the alternative (backup) GPT label is written in the last 33
sectors of the disk.  SCSI pass-thru mode was used to write the alternative GPT
label, but SCSI pass-thru mode could not handle device offsets equal to or greater
than 1 TB.

 Resolution::This issue has been resolved by using POSIX system calls to write the
alternative GPT label during initialization of VxVM cdsdisk layout.

 * Incident no::2337091	 Tracking ID ::2255182

 Symptom::If EMC CLARiiON arrays are configured with a different failovermode for each
host controller (e.g. one HBA has failovermode set to 1 while the other has 2),
then VxVM's vxconfigd daemon dumps core.

 Description::DDL (VxVM's Device Discovery Layer) determines the array type depending on the
failovermode setting. DDL expects the same array type to be returned across all
the paths going to that array. This fundamental assumption of DDL is broken
by different failovermode settings, which leads to the vxconfigd core dump.

 Resolution::Validation code is added in DDL to detect such configurations, emit
appropriate warning messages so that the user can take corrective action, and skip
the later set of paths that report a different array type.
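
One way to picture this validation is a pass over the paths of an array in
which the first array type seen is treated as the expected one and any later
path reporting a different type is skipped with a warning. The sketch below is
hypothetical: the function name, the path names, and the array type labels are
placeholders, and DDL's real data structures are not described in this README.

```python
def select_consistent_paths(paths):
    """Keep paths whose array type matches the first path seen; skip the rest.

    paths: list of (path_name, array_type) pairs for one array.
    Returns (kept_path_names, warnings).
    """
    kept, warnings = [], []
    expected = None
    for name, array_type in paths:
        if expected is None:
            expected = array_type  # the first path defines the expected type
        if array_type == expected:
            kept.append(name)
        else:
            warnings.append(
                "path %s reports array type %s, expected %s; skipping"
                % (name, array_type, expected))
    return kept, warnings


# Placeholder path names and type labels, for illustration only.
kept, warnings = select_consistent_paths(
    [("path_a", "TYPE_1"), ("path_b", "TYPE_2"), ("path_c", "TYPE_1")])
print(kept)      # prints ['path_a', 'path_c']
print(warnings)
```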

 * Incident no::2349653	 Tracking ID ::2349352

 Symptom::Data corruption is observed on DMP device with single path during Storage 
reconfiguration (LUN addition/removal).

 Description::Data corruption can occur in the following configuration, when new LUNs are 
provisioned or removed under VxVM, while applications are on-line.
 
1. The DMP device naming scheme is EBN (enclosure based naming) and 
persistence=no
2. The DMP device is configured with single path or the devices are controlled 
by Third Party Multipathing Driver (Ex: MPXIO, MPIO etc.,)
 
Because persistent naming is turned off, the names of the VxVM devices (DA
records) can change when LUNs are removed or added and the following commands
are then run.
 
(a) vxdctl enable
(b) vxdisk scandisks
 
Execution of the above commands discovers all the devices and rebuilds the device
attribute list with new DMP device names. The VxVM device records are then
updated with these new attributes. Due to a bug in the code, the VxVM device
records are mapped to wrong DMP devices.
 
Example:
 
The following are the devices before adding new LUNs:
 
sun6130_0_16 auto            -            -            nolabel
sun6130_0_17 auto            -            -            nolabel
sun6130_0_18 auto:cdsdisk    disk_0       prod_SC32    online nohotuse
sun6130_0_19 auto:cdsdisk    disk_1       prod_SC32    online nohotuse
 
The following are the devices after adding new LUNs:
 
sun6130_0_16 auto            -            -            nolabel
sun6130_0_17 auto            -            -            nolabel
sun6130_0_18 auto            -            -            nolabel
sun6130_0_19 auto            -            -            nolabel
sun6130_0_20 auto:cdsdisk    disk_0       prod_SC32    online nohotuse
sun6130_0_21 auto:cdsdisk    disk_1       prod_SC32    online nohotuse
 
The name of the VxVM device sun6130_0_18 is changed to sun6130_0_20.

 Resolution::The code that updates the VxVM device records is rectified.

 * Incident no::2353327	 Tracking ID ::2179259

 Symptom::When using disks of size > 2TB, if the disk encounters a media error at an
offset > 2TB while the disk still responds to SCSI inquiry, data corruption can
occur in case of a write operation.

 Description::The I/O retry logic in DMP assumes that the I/O offset is within the 2TB
limit. Hence, when using disks of size > 2TB, if the disk encounters a media error
at an offset > 2TB while the disk still responds to SCSI inquiry, the I/O is issued
at a wrong offset within the 2TB range, causing data corruption in case of write I/Os.

 Resolution::The I/O retry mechanism is changed to work for offsets > 2TB as well,
so that no offset truncation happens that could lead to data corruption.
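
The effect of an offset truncation can be shown with plain arithmetic. The
sketch assumes 512-byte sectors and a block number kept in a 32-bit field,
which together give a 2 TiB limit; the README does not state the actual field
width used by DMP, so both values are assumptions for illustration.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def truncated_block(block):
    """Keep only the low 32 bits of a block number, as a 32-bit field would."""
    return block & 0xFFFFFFFF


offset_bytes = 3 * 2**40                  # an I/O 3 TiB into the disk
block = offset_bytes // SECTOR_SIZE       # block number of that offset
wrong_offset = truncated_block(block) * SECTOR_SIZE

# The truncated block number points 1 TiB into the disk instead of 3 TiB,
# so a retried write would land on unrelated data inside the 2 TiB range.
print(block, truncated_block(block), wrong_offset // 2**40)
# prints 6442450944 2147483648 1
```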

 * Incident no::2353328	 Tracking ID ::2194685

 Symptom::vxconfigd dumps core in a scenario where array side ports are
disabled/enabled in a loop for some iterations.

(gdb) where
#0  0x081ca70b in ddl_delete_node ()
#1  0x081cae67 in ddl_check_migration_of_devices ()
#2  0x081d0512 in ddl_reconfigure_all ()
#3  0x0819b6d5 in ddl_find_devices_in_system ()
#4  0x0813c570 in find_devices_in_system ()
#5  0x0813c7da in mode_set ()
#6  0x0807f0ca in setup_mode ()
#7  0x0807fa5d in startup ()
#8  0x08080da6 in main ()

 Description::Disabling the array side ports removes the secondary paths. The
primary paths then reuse the devno of the removed secondary paths, which is not
correctly handled in the current migration code. As a result, the DMP database gets
corrupted and subsequent discoveries lead to a vxconfigd core dump.

 Resolution::The issue is due to incorrect setting of a DMP flag.
The flag setting has been fixed to prevent the DMP database from being corrupted in
the mentioned scenario.

 * Incident no::2353403	 Tracking ID ::2337694

 Symptom::"vxdisk -o thin list" displays size as 0 for thin luns of capacity greater than 
2 TB.

 Description::SCSI READ CAPACITY ioctl is invoked to get the disk capacity.  SCSI READ
CAPACITY returns data in extended data format if the disk capacity is 2 TB or
greater.  This extended data was parsed incorrectly while calculating the disk
capacity.

 Resolution::This issue has been resolved by properly parsing the extended data returned by
SCSI READ CAPACITY ioctl for disks of size 2 TB or greater.
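
For reference, the two response layouts defined by the SCSI block command set
can be parsed as sketched below. This follows the standard layouts (READ
CAPACITY (10) returns a 32-bit last LBA and block length, and reports
0xFFFFFFFF when the LBA does not fit; READ CAPACITY (16) returns a 64-bit last
LBA followed by a 32-bit block length, all big-endian). It is not the VxVM
code, and the function names are hypothetical.

```python
import struct


def parse_read_capacity_10(data):
    """Parse an 8-byte READ CAPACITY (10) response: (last LBA, block length)."""
    return struct.unpack(">II", data[:8])


def parse_read_capacity_16(data):
    """Parse the first 12 bytes of a READ CAPACITY (16) response."""
    return struct.unpack(">QI", data[:12])


def capacity_bytes(rc10, rc16=None):
    """Return the capacity in bytes, using the extended data when needed."""
    last_lba, block_len = parse_read_capacity_10(rc10)
    if last_lba == 0xFFFFFFFF:
        # The 32-bit field overflowed, so the 64-bit response is required.
        if rc16 is None:
            raise ValueError("extended READ CAPACITY (16) data is required")
        last_lba, block_len = parse_read_capacity_16(rc16)
    return (last_lba + 1) * block_len


# Synthetic responses for a 4 TiB disk with 512-byte blocks.
rc10 = struct.pack(">II", 0xFFFFFFFF, 512)
rc16 = struct.pack(">QI", 2**33 - 1, 512) + bytes(20)
print(capacity_bytes(rc10, rc16) // 2**40)  # prints 4
```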

 * Incident no::2353404	 Tracking ID ::2334757

 Symptom::Vxconfigd consumes a lot of memory when the DMP tunable
dmp_probe_idle_lun is set on.  The "pmap" command on the vxconfigd process shows a
continuously growing heap.

 Description::The DMP path restoration daemon probes idle LUNs (idle LUNs are VxVM disks on
which no I/O requests are scheduled) and generates notify events to vxconfigd.
Vxconfigd in turn sends notification of these events to its clients.
If vxconfigd cannot deliver these events for any reason (for example, because a
client is busy processing an earlier event), it keeps these events itself.
Because of this slow consumption of events by its clients, the memory
consumption of vxconfigd grows.

 Resolution::dmp_probe_idle_lun is set to off by default.

 * Incident no::2353421	 Tracking ID ::2334534

 Symptom::In a CVM (Cluster Volume Manager) environment, a node (SLAVE) join to the
cluster gets stuck, leading to an unending join hang unless the join operation is
stopped on the joining node (SLAVE) using the command '/opt/VRTS/bin/vxclustadm
stopnode'. While the CVM join is hung in user-land (also called vxconfigd level
join), vxconfigd (Volume Manager Configuration daemon) on the CVM MASTER node
does not respond to any VxVM command that communicates with the vxconfigd process.

When vxconfigd level CVM join is hung in user-land, "vxdctl -c mode" on joining
node (SLAVE) displays an output such as:
 
     bash-3.00#  vxdctl -c mode
     mode: enabled: cluster active - SLAVE
     master: mtvat1000-c1d
     state: joining
     reconfig: vxconfigd in join

 Description::As part of a CVM node join to the cluster, every node in the cluster first
updates the current CVM membership information in the kernel (this information can
be viewed with the command '/opt/VRTS/bin/vxclustadm nidmap') and then
sends a signal to vxconfigd in user land to use that membership when exchanging
configuration records with the other nodes. Since each node receives the signal
(SIGIO) from the kernel independently, the joining node's (SLAVE) vxconfigd can be
ahead of the MASTER in its execution. Any request coming from the joining node
(SLAVE) is then denied by the MASTER with the error "VE_CLUSTER_NOJOINERS", i.e. join
operation is not currently allowed (error number: 234), because the MASTER's
vxconfigd has not yet got the updated membership from the kernel. While responding
to the joining node (SLAVE) with the error "VE_CLUSTER_NOJOINERS", if there is any
change in the current membership (change in CVM node ID) as part of the node join,
the MASTER node wrongly updates the internal data structure of vxconfigd that is
used to send responses to joining (SLAVE) nodes. Due to this wrong update,
when the joining node later retries its request, the
response from the MASTER is sent to a wrong node, which does not exist in the
cluster, and no response is sent to the joining node. The joining node (SLAVE) never
gets the response from the MASTER for its request, and hence the CVM node join does
not complete, leading to a cluster hang.

 Resolution::vxconfigd code is modified to handle the above-mentioned scenario.
vxconfigd on the MASTER node now processes connection requests coming from a joining
node (SLAVE) only after the MASTER node gets the updated CVM membership
information from the kernel.

 * Incident no::2353425	 Tracking ID ::2320917

 Symptom::vxconfigd, the VxVM configuration daemon, dumps core and loses disk group
configuration while invoking the following VxVM reconfiguration steps:

1)	Volumes which were created on thin reclaimable disks are deleted.
2)	Before the space of the deleted volumes is reclaimed, the disks (whose
volumes are deleted) are removed from the DG with the 'vxdg rmdisk' command using
the '-k' option.
3)	The disks are removed using the 'vxedit rm' command.
4)	New disks are added to the disk group using the 'vxdg adddisk' command.

The stack trace of the core dump is :
[
 0006f40c rec_lock3 + 330
 0006ea64 rec_lock2 + c
 0006ec48 rec_lock2 + 1f0
 0006e27c rec_lock + 28c
 00068d78 client_trans_start + 6e8
 00134d00 req_vol_trans + 1f8
 00127018 request_loop + adc
 000f4a7c main  + fb0
 0003fd40 _start + 108
]

 Description::When a volume is deleted from a disk group that uses thin reclaim LUNs,
its subdisks are not removed immediately; instead, they are marked with a special
flag. The reclamation happens at a scheduled time every day. The "vxdefault" command
can be invoked to list and modify the settings.

After the disk is removed from the disk group using the 'vxdg -k rmdisk' and 'vxedit
rm' commands, the subdisk records are still in the in-core database and they
point to a disk media record which has been freed. When the next command is
run to add another new disk to the disk group, vxconfigd dumps core when
locking the disk media record which has already been freed.

The subsequent disk group deport and import commands erase all disk group 
configuration as it detects an invalid association between the subdisks and the 
removed disk.

 Resolution::1)	The following message will be printed when 'vxdg rmdisk' is used to 
remove disk that has reclaim pending subdisks:

VxVM vxdg ERROR V-5-1-0 Disk <diskname> is used by one or more subdisks which
are pending to be reclaimed.
        Use "vxdisk reclaim <diskname>" to reclaim space used by these subdisks,
        and retry "vxdg rmdisk" command.
        Note: reclamation is irreversible.

2)	Add a check when using 'vxedit rm' to remove disk. If the disk is in 
removed state and has reclaim pending subdisks, following error message will be 
printed:

VxVM vxedit ERROR V-5-1-10127 deleting <diskname>:
        Record is associated

 * Incident no::2353427	 Tracking ID ::2337353

 Symptom::The "vxdmpadm include" command includes all the excluded devices along with
the device given in the command.

Example:

# vxdmpadm exclude vxvm dmpnodename=emcpower25s2
# vxdmpadm exclude vxvm dmpnodename=emcpower24s2

# more /etc/vx/vxvm.exclude
exclude_all 0
paths
emcpower24c /dev/rdsk/emcpower24c emcpower25s2
emcpower10c /dev/rdsk/emcpower10c emcpower24s2
#
controllers
#
product
#
pathgroups
#

# vxdmpadm include vxvm dmpnodename=emcpower24s2

# more /etc/vx/vxvm.exclude
exclude_all 0
paths
#
controllers
#
product
#
pathgroups
#

 Description::When a dmpnode is excluded, an entry is made in /etc/vx/vxvm.exclude file. This 
entry has to be removed when the dmpnode is included later. Due to a bug in 
comparison of dmpnode device names, all the excluded devices are included.

 Resolution::The bug in the code which compares the dmpnode device names is rectified.
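
The intended behavior can be sketched as a filter over the exclude file that
drops only the entry whose dmpnode name matches exactly. The file layout used
here is inferred from the example output above (a "paths" section whose entries
have three fields, the last being the dmpnode name, closed by a "#" line); the
function name is hypothetical and the README does not say how the faulty
comparison was written.

```python
def include_dmpnode(lines, dmpnodename):
    """Return the exclude-file lines with the entry for dmpnodename removed.

    Only entries in the 'paths' section whose third field equals dmpnodename
    exactly are dropped; every other line is kept as is.
    """
    out = []
    in_paths = False
    for line in lines:
        stripped = line.strip()
        fields = stripped.split()
        if stripped == "paths":
            in_paths = True
        elif stripped == "#":
            in_paths = False
        elif in_paths and len(fields) == 3 and fields[2] == dmpnodename:
            continue  # this is the entry being included again
        out.append(line)
    return out


exclude_file = [
    "exclude_all 0",
    "paths",
    "emcpower24c /dev/rdsk/emcpower24c emcpower25s2",
    "emcpower10c /dev/rdsk/emcpower10c emcpower24s2",
    "#",
    "controllers",
    "#",
]
for line in include_dmpnode(exclude_file, "emcpower24s2"):
    print(line)
```

With the sample input, the entry for emcpower25s2 stays in the file and only
the entry for emcpower24s2 is removed.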

 * Incident no::2353464	 Tracking ID ::2322752

 Symptom::Duplicate device names are observed for NR (Not Ready) devices, when vxconfigd 
is restarted (vxconfigd -k).

# vxdisk list 

emc0_0052    auto            -            -            error
emc0_0052    auto:cdsdisk    -            -            error
emc0_0053    auto            -            -            error
emc0_0053    auto:cdsdisk    -            -            error

 Description::During vxconfigd restart, disk access records are rebuilt in the vxconfigd
database. As part of this process, IOs are issued on all the devices to read the
disk private regions. The failure of these IOs on NR devices resulted in
the creation of duplicate disk access records.

 Resolution::vxconfigd code is modified not to create duplicate disk access records.

 * Incident no::2353922	 Tracking ID ::2300195

 Symptom::Uninitialization of VxVM cdsdisk of size greater than 1 TB fails on HP-UX/ia-64 
platform.

 Description::During the uninitialization of VxVM cdsdisk of size greater than 1 TB on
HP-UX/ia-64 platform, DMP (dynamic-multi-pathing) paths corresponding to the GPT
partitions on the VxVM cdsdisk were created incorrectly.  This resulted in i/o
failure while destroying the VxVM cdsdisk.

 Resolution::This issue has been solved by performing the i/o to the whole device while 
uninitializing the VxVM cdsdisk.

 * Incident no::2360419	 Tracking ID ::2237089

 Symptom::vxrecover failed to recover the data volumes with associated cache volume.

 Description::vxrecover doesn't wait till the recovery of the cache volumes is complete
before triggering the recovery of the data volumes that are created on top of the
cache volume. Due to this the recovery might fail for the data volumes.

 Resolution::Code changes are done to serialize the recovery for different volume types.

 * Incident no::2360719	 Tracking ID ::2359814

 Symptom::1. vxconfigbackup(1M) command fails with the following error:
ERROR V-5-2-3720 dgid mismatch

2. "-f" option for the vxconfigbackup(1M) is not documented in the man page.

 Description::1. In some cases, a *.dginfo file has two lines starting with
"dgid:", which causes vxconfigbackup to fail.
The preceding awk command in the script then returns two lines instead of one for
the $bkdgid variable and the comparison fails, resulting in a "dgid mismatch" error
even when the dgids are the same.
This happens if the temp dginfo file was not removed during the last run of
vxconfigbackup (for example, because the script was interrupted), since the temp
dginfo file is updated in append mode:

vxconfigbackup.sh:

   echo "TIMESTAMP" >> $DGINFO_F_TEMP 2>/dev/null

Therefore, two or more sets of dginfo may be added to the dginfo file, which
causes the config backup to fail with dgid mismatch.

2. "-f" option to force a backup is not documented in the man page of
vxconfigbackup(1M).

 Resolution::1. The temp dginfo file is now written in overwrite mode instead of append mode.

2. Updated the vxconfigbackup(1M) man page with the "-f" option.
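
The difference between the two write modes in item 1 can be reproduced with a
few lines. This is an illustration only: the helper names, the file name, and
the dgid value are hypothetical, and the real script is a shell script whose
contents are not shown beyond the line quoted above.

```python
import os
import tempfile


def write_dginfo(path, dgid, append):
    """Write a 'dgid:' line, either appending to or overwriting the file."""
    with open(path, "a" if append else "w") as f:
        f.write("dgid: %s\n" % dgid)


def read_dgids(path):
    """Return the values of all lines that start with 'dgid:'."""
    with open(path) as f:
        return [line.split(":", 1)[1].strip()
                for line in f if line.startswith("dgid:")]


with tempfile.TemporaryDirectory() as tmp:
    temp_file = os.path.join(tmp, "example.dginfo")
    dgid = "1234567890.10.host"  # placeholder dgid value

    # A leftover temp file from an interrupted run, then an appending run:
    write_dginfo(temp_file, dgid, append=False)
    write_dginfo(temp_file, dgid, append=True)
    print(read_dgids(temp_file))  # two values, so a single-value compare fails

    # Overwriting instead of appending leaves exactly one dgid line:
    write_dginfo(temp_file, dgid, append=False)
    print(read_dgids(temp_file))
```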

 * Incident no::2377317	 Tracking ID ::2408771

 Symptom::VxVM does not show all the discovered devices. The number of devices shown
by VxVM is less than the number shown by the OS.

 Description::For every lunpath device discovered, VxVM creates a data structure and
stores it in a hash table. The hash value is computed from the unique minor number
of the lunpath. If the minor number exceeds 831231, an integer overflow occurs and
the data structure for this path is stored at the wrong location. When the hash list
is later traversed, accesses are limited based on the total number of discovered
paths, and because the devices with such large minor numbers are hashed wrongly,
DA records are not created for them.

 Resolution::The integer overflow problem has been resolved by appropriately typecasting
the minor number so that the correct hash value is computed.

 * Incident no::2379034	 Tracking ID ::2379029

 Symptom::Changing of enclosure name was not working for all devices in enclosure. All these
devices were present in /etc/vx/darecs.

# cat /etc/vx/darecs
ibm_ds8x000_02eb        auto    online 
format=cdsdisk,privoffset=256,pubslice=2,privslice=2
ibm_ds8x000_02ec        auto    online 
format=cdsdisk,privoffset=256,pubslice=2,privslice=2
# vxdmpadm setattr enclosure ibm_ds8x000 name=new_ibm_ds8x000
# vxdisk -o alldgs list
DEVICE       TYPE            DISK         GROUP        STATUS
ibm_ds8x000_02eb auto:cdsdisk    ibm_ds8x000_02eb  mydg         online
ibm_ds8x000_02ec auto:cdsdisk    ibm_ds8x000_02ec  mydg         online
new_ibm_ds8x000_02eb auto            -            -            error
new_ibm_ds8x000_02ec auto            -            -            error

 Description::/etc/vx/darecs stores only foreign devices and nopriv or simple devices;
auto devices should NOT be written into this file. A DA record is flushed to
/etc/vx/darecs at the end of a transaction if the R_NOSTORE flag is NOT set on the
DA record. There was a bug in VM: when a disk that does not exist in da_list (e.g.
after using vxdisk rm) is initialized, the R_NOSTORE flag is NOT set for the newly
created DA record. Hence duplicate entries for these devices were created, which
resulted in these DAs going into error state.

 Resolution::Source has been modified to add R_NOSTORE flag for auto type DA record created by
auto_init() or auto_define().

# vxdmpadm setattr enclosure ibm_ds8x000 name=new_ibm_ds8x000
# vxdisk -o alldgs list
new_ibm_ds8x000_02eb auto:cdsdisk    ibm_ds8x000_02eb  mydg         online
new_ibm_ds8x000_02ec auto:cdsdisk    ibm_ds8x000_02ec  mydg         online

 * Incident no::2382705	 Tracking ID ::1675599

 Symptom::Vxconfigd leaks memory while excluding and including a Third party Driver
controlled LUN in a loop. As part of this vxconfigd loses its license information
and following error is seen in system log:
        "License has expired or is not available for operation"

 Description::In vxconfigd code, memory allocated for various data structures related to
device discovery layer is not freed which led to the memory leak.

 Resolution::The memory is released after its scope is over.

 * Incident no::2382710	 Tracking ID ::2139179

 Symptom::DG import can fail with SSB (Serial Split Brain) though the SSB does not exist.

 Description::While importing a DG, a DM record is associated with a DA record if their
SSB ids match. On a system with stale cloned disks, the
system attempts to associate the DM with a cloned DA, where an SSB id
mismatch is observed, resulting in import failure with SSB mismatch.

 Resolution::The selection of DA to associate with DM is rectified to resolve the issue.

 * Incident no::2382717	 Tracking ID ::2197254

 Symptom::vxassist, the VxVM volume creation utility, does not function as expected
when creating a volume with "logtype=none".

 Description::While creating volumes on thinrclm disks, a Data Change Object (DCO) version 20
log is attached to every volume by default. If the user does not want this default
behavior, the "logtype=none" option can be specified as a parameter to the vxassist
command. But with VxVM on HP-UX 11.31, this option does not work and a DCO version
20 log is created anyway.  The reason is that when the
"logtype=none" option is specified, the utility sets a flag to prevent
creation of the log, but VxVM was not checking whether the flag is set before
creating the DCO log.

 Resolution::The code now checks the flag corresponding to "logtype=none" before
creating the DCO version 20 log by default.

 * Incident no::2383705	 Tracking ID ::2204752

 Symptom::The following message is observed after the diskgroup creation:
"VxVM ERROR V-5-3-12240: GPT entries checksum mismatch"

 Description::This message is observed with a disk that was initialized as cds_efi and was
later initialized as hpdisk. The harmless "checksum mismatch" message is
displayed even when the diskgroup initialization is successful.

 Resolution::The harmless message "GPT entries checksum mismatch" has been removed.

 * Incident no::2384473	 Tracking ID ::2064490

 Symptom::vxcdsconvert utility fails if disk capacity is greater than or equal to 1 TB.

 Description::VxVM cdsdisk uses GPT layout if the disk capacity is greater than or equal
to 1 TB and uses VTOC layout if the disk capacity is less than 1 TB.  The
vxcdsconvert utility was not able to convert to the GPT layout when the disk
capacity is greater than or equal to 1 TB.

 Resolution::This issue has been resolved by converting to the proper cdsdisk layout
depending on the disk capacity.

 * Incident no::2384844	 Tracking ID ::2356744

 Symptom::When "vxvm-recover" is executed manually, duplicate instances of the
Veritas Volume Manager (VxVM) daemons (vxattachd, vxcached, vxrelocd, vxvvrsecdgd
and vxconfigbackupd) are invoked.
When the user then kills any of the daemons manually, the other instances of the
daemons are left on the system.

 Description::The Veritas Volume Manager (VxVM) daemons (vxattachd, vxcached, vxrelocd,
vxvvrsecdgd and vxconfigbackupd) do not have:

  1. A check for a duplicate instance.
  and
  2. A mechanism to clean up stale processes.

Because of this, when the user executes the startup script (vxvm-recover), all
daemons are invoked again, and if the user kills any of the daemons manually, the
other instances of the daemons are left on the system.

 Resolution::The VxVM daemons are modified to do the "duplicate instance check" and "stale
process cleanup" appropriately.
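
One common way to implement such a check is a pidfile, sketched below in
Python for a POSIX system. This is not the VxVM implementation: the helper
names and the pidfile approach are assumptions chosen for illustration, and
the sketch ignores the race between two daemons starting at the same instant.

```python
import os

def already_running(pidfile):
    """Return True if pidfile names a live process; remove the file if it is stale."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False            # no pidfile, or unreadable contents
    try:
        os.kill(pid, 0)         # signal 0 performs an existence check only
    except ProcessLookupError:
        os.remove(pidfile)      # stale entry left behind by a dead process
        return False
    except PermissionError:
        return True             # process exists but belongs to another user
    return True

def claim(pidfile):
    """Record this process as the single instance unless one is already running."""
    if already_running(pidfile):
        return False
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
    return True
```

A daemon would call claim() at startup and exit immediately if it returns
False, which covers both the duplicate-instance check and the cleanup of
stale state.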

 * Incident no::2386763	 Tracking ID ::2346470

 Symptom::Dynamic Multi-Pathing administration operations such as
"vxdmpadm exclude vxvm dmpnodename=<daname>" and
"vxdmpadm include vxvm dmpnodename=<daname>" trigger memory leaks in the heap
segment of the VxVM configuration daemon (vxconfigd).

 Description::vxconfigd allocates chunks of memory to store VxVM specific information 
of the disk being included during "vxdmpadm include vxvm dmpnodename=<daname>" 
operation. The allocated memory is not freed while excluding the same disk from 
VxVM control. Also when excluding a disk from VxVM control, another chunk of 
memory is temporarily allocated by vxconfigd to store more details of the device 
being excluded. However this memory is not freed at the end of exclude 
operation.

 Resolution::Memory allocated during include operation of a disk is freed during 
corresponding exclude operation of the disk. Also temporary memory allocated 
during exclude operation of a disk is freed at the end of exclude operation.

 * Incident no::2389095	 Tracking ID ::2387993

 Symptom::In presence of NR (Not-Ready) devices, vxconfigd (VxVM configuration 
        daemon) goes into disabled mode once restarted.  

	# vxconfigd -k -x syslog
	# vxdctl mode
	mode: disabled

	If vxconfigd is restarted in debug mode at level 9, the following message
        can be seen:

	# vxconfigd -k -x 9 -x syslog

	VxVM vxconfigd DEBUG  V-5-1-8856 DA_RECOVER() failed, thread 87: Kernel 
        and on-disk configurations don't match

 Description::When vxconfigd is restarted, all the VxVM devices are recovered. As part
        of recovery the capacity of the device is read, which can fail with EIO.
        This error is not handled properly. As a result, vxconfigd goes into
        the DISABLED state.

 Resolution::The EIO error code from the read-capacity ioctl is now handled explicitly.

 * Incident no::2397663	 Tracking ID ::2165394

 Symptom::If a cloned copy of a diskgroup and the destroyed diskgroup both exist on the
system, an import operation imports the destroyed diskgroup instead of the cloned
one. For example, consider a system with diskgroup dg containing disk disk01. Disk
disk01 is cloned to disk02. When diskgroup dg containing disk01 is destroyed and
diskgroup dg is imported, VxVM should import dg with the cloned disk, i.e. disk02.
However, it imports diskgroup dg with disk01.

 Description::After destroying a diskgroup, if a cloned copy of the same diskgroup exists on
the system, the subsequent diskgroup import operation wrongly identifies the
disks to be imported and hence the destroyed diskgroup gets imported.

 Resolution::The diskgroup import code is modified to identify the correct diskgroup when a 
cloned copy of the destroyed diskgroup exists.

 * Incident no::2405446	 Tracking ID ::2253970

 Symptom::Enhancement to customize private region I/O size based on maximum transfer size 
of underlying disk.

 Description::Different types of array controllers support data transfer
sizes starting from 256K and beyond. The VxVM tunable volmax_specialio controls
vxconfigd's configuration I/O as well as Atomic Copy I/O size. When
volmax_specialio is tuned to a value greater than 1MB to leverage the maximum
transfer sizes of the underlying disks, the import operation fails for disks which
cannot accept more than 256K I/O size. If the tunable is set to 256K, the
large transfer size of other disks is not leveraged.

 Resolution::All the above scenarios mentioned in Description are handled in this 
enhancement to leverage large disk transfer sizes as well as support Array 
controllers with 256K transfer sizes.

 * Incident no::2408209	 Tracking ID ::2291226

 Symptom::Data corruption can be observed on a CDS (Cross-platform Data Sharing) disk, 
whose capacity is more than 1 TB. The following pattern would be found in the 
data region of the disk.

<DISK-IDENTIFICATION> cyl <number-of-cylinders> alt 2 hd <number-of-tracks> sec 
<number-of-sectors-per-track>

 Description::The CDS disk maintains a SUN VTOC in the zeroth block of the disk. This VTOC
maintains the disk geometry information such as the number of cylinders, tracks and
sectors per track. These values are limited to a maximum of 65535 by the design of
SUN's VTOC, which limits the disk capacity to 1TB. As per SUN's requirement,
a few backup VTOC labels have to be maintained on the last track of the disk.

VxVM 5.0 MP3 RP3 allows a CDS disk to be set up on a disk with capacity more than
1TB. The data region of the CDS disk would span more than 1TB, utilizing all the
accessible cylinders of the disk. As mentioned above, the VTOC labels would be
written at the zeroth block and on the last track, computed as if the disk
capacity were 1TB. The backup labels would therefore fall into the data region of
the CDS disk, causing the data corruption.

 Resolution::Suppress writing the backup labels to prevent the data corruption.

 * Incident no::2409212	 Tracking ID ::2316550

 Symptom::While doing a cold/ignite install to 11.31 + VxVM 5.1, the following warning
messages are seen on a setup with an ALUA array:

"VxVM vxconfigd WARNING V-5-1-0 ddl_add_disk_instr: Turning off NMP Alua mode
failed for dmpnode 0xffffffff with ret = 13 "

 Description::The above warning messages are displayed by the vxconfigd started at early boot
if it fails to turn off the NMP ALUA mode for a given dmp device. These messages
are not harmful, as the vxconfigd started later in enabled mode will turn off the
NMP ALUA mode for all the dmp devices.

 Resolution::vxconfigd is changed so that it does not print these warning messages in
boot mode.

 * Incident no::2411052	 Tracking ID ::2268408

 Symptom::1) On suppressing the underlying path of a powerpath controlled device, the
disk goes in error state.
2) The "vxdmpadm exclude vxvm dmpnodename=<emcpower#>" command does not suppress
TPD devices.

 Description::During discovery, H/W path corresponding to the basename is not generated for 
powerpath controlled devices because basename does not contain the slice 
portion. Device name with s2 slice is expected while generating H/W name.

 Resolution::Whole disk name i.e., device name with s2 slice is used to generate H/W path.

 * Incident no::2411053	 Tracking ID ::2410845

 Symptom::If a DG(Disk Group) is imported with reservation key, then during DG deport
lots of 'reservation conflict' messages will be seen.
                
    [DATE TIME] [HOSTNAME] multipathd: VxVM26000: add path (uevent)
    [DATE TIME] [HOSTNAME] multipathd: VxVM26000: failed to store path info
    [DATE TIME] [HOSTNAME] multipathd: uevent trigger error
    [DATE TIME] [HOSTNAME] multipathd: VxVM26001: add path (uevent)
    [DATE TIME] [HOSTNAME] multipathd: VxVM26001: failed to store path info
    [DATE TIME] [HOSTNAME] multipathd: uevent trigger error
    ..
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:2: reservation conflict
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:2: reservation conflict
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:1: reservation conflict
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:2: reservation conflict
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:1: reservation conflict
    [DATE TIME] [HOSTNAME] kernel: sd 2:0:0:2: reservation conflict

 Description::When removing a PGR (Persistent Group Reservation) key during DG deport, the
key must be preempted, but the preempt operation fails with a reservation
conflict error because the key passed for preemption is not correct.

 Resolution::Code changes are made to set the correct key value for the preemption 
operation.

 * Incident no::2413908	 Tracking ID ::2413904

 Symptom::Performing Dynamic LUN reconfiguration operations (adding and removing LUNs),
can cause corruption in DMP database. This in turn may lead to vxconfigd core
dump OR system panic.

 Description::When a LUN is removed from VxVM using 'vxdisk rm' and a new LUN is added at
the same time, the newly added LUN may reuse the devno of the removed LUN. This
condition was not handled and could corrupt the DMP database.

 Resolution::Fixed the DMP code to handle the mentioned issue.

 * Incident no::2415566	 Tracking ID ::2369177

 Symptom::When using > 2TB disks, if the device responds to SCSI inquiry but fails to
service I/O, data corruption can occur because the write I/O is directed at an
incorrect offset.

 Description::When a failed I/O is retried, DMP assumes the offset to be a 32-bit
value, so I/O offsets > 2TB can get truncated, leading to the retry I/O being
issued at the wrong offset.

 Resolution::Change the offset value to a 64 bit quantity to avoid truncation during I/O
retries from DMP.
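
The effect of the truncation can be reproduced with a short sketch (Python,
illustrative only). It assumes offsets are counted in 512-byte sectors, which
is consistent with the 2TB limit of a 32-bit value mentioned above; the
function names are hypothetical.

```python
SECTOR = 512                     # bytes per sector (assumed)
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def retry_offset_32(offset_sectors):
    """Offset as seen after being stored in a 32-bit field."""
    return offset_sectors & MASK32

def retry_offset_64(offset_sectors):
    """Offset as seen after being stored in a 64-bit field."""
    return offset_sectors & MASK64

three_tb = 3 * (1 << 40) // SECTOR                       # sector offset 3 TB into the disk
print(retry_offset_32(three_tb) * SECTOR // (1 << 40))   # 1: the retry lands at 1 TB
print(retry_offset_64(three_tb) == three_tb)             # True: the offset is preserved
```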

 * Incident no::2415577	 Tracking ID ::2193429

 Symptom::Enclosure attributes such as iopolicy and recoveryoption do not persist across
reboots when the DMP driver is already configured before vold startup (e.g. in
the case of root support) with a different array type than the one stored in
array.info.

 Description::When the DMP driver is already configured before vold comes up (as happens in
root support), the enclosure attributes do not take effect if the enclosure name
in the kernel has changed from the previous boot cycle. This is because
da_attr_list is NULL when vold comes up. vold then gets events from the DMP kernel
for data structures already present in the kernel. On receiving this information,
it tries to write da_attr_list into array.info, but since da_attr_list is NULL,
array.info gets overwritten with no data. Hence vold later cannot correlate the
enclosure attributes present in dmppolicy.info with the enclosures present in
array.info, so the persistent attributes are not applied.

 Resolution::Do not overwrite array.info if da_attr_list is NULL.

 * Incident no::2417205	 Tracking ID ::2407699

 Symptom::The vxassist command dumps core if the file "/etc/default/vxassist" contains the
line "wantmirror=<ctlr|target|...>"

 Description::vxassist, the Veritas Volume Manager client utility can accept attributes from
the system defaults file (/etc/default/vxassist), the user specified alternate
defaults file and the command line. vxassist automatically merges all the
attributes by pre-defined priority. However, a NULL pointer check is missing
while merging the "wantmirror" attribute, which leads to the core dump.

 Resolution::Within vxassist, while merging attributes, add a check for NULL pointer.
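
As a rough analogue of a guarded merge (Python, illustrative only), the
sketch below skips any source or value that is missing instead of
dereferencing it. The function name and the priority order shown in the
example (system defaults, then alternate defaults, then command line) are
assumptions; the text above only says the priority is pre-defined.

```python
def merge_attrs(*sources):
    """Merge attribute dicts; later sources override earlier ones (hypothetical helper)."""
    merged = {}
    for src in sources:
        if src is None:            # a defaults file that was not supplied
            continue
        for key, value in src.items():
            if value is None:      # attribute named but carrying no value
                continue
            merged[key] = value
    return merged

system_defaults = {"wantmirror": "ctlr", "layout": "nomirror"}
alternate_defaults = None          # no alternate defaults file given
command_line = {"layout": "mirror", "wantmirror": None}
print(merge_attrs(system_defaults, alternate_defaults, command_line))
```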

 * Incident no::2421491	 Tracking ID ::2396293

 Symptom::On VxVM rooted systems, during machine bootup, vxconfigd dumps core with the
following assert and the machine does not boot up.
Assertion failed: (0), file auto_sys.c, line 1024
05/30 01:51:25:  VxVM vxconfigd ERROR V-5-1-0 IOT trap - core dumped

 Description::DMP deletes and regenerates device numbers dynamically on every
boot. When static vxconfigd is started in boot mode, new DSFs for DMP nodes are
not created because the root file system is read-only. However, DMP configures
devices in userland and kernel.
As a result, the device numbers of the DSFs do not match those in the DMP kernel,
because stale DSFs from the previous boot are still present.
This leads vxconfigd to send I/Os to the wrong device numbers, resulting in
disks being claimed with the wrong format.

 Resolution::Issue is fixed by getting the device numbers from vxconfigd and not
doing stat on DMP DSF's.

 * Incident no::2428179	 Tracking ID ::2425722

 Symptom::VxVM's subdisk operation - vxsd mv <source_subdisk> <destination_subdisk> - 
fails on subdisk sizes greater than or equal to 2TB. 

Eg: 

#vxsd -g nbuapp mv disk_1-03 disk_2-03 

VxVM vxsd ERROR V-5-1-740 New subdisks have different size than subdisk disk_1-03, use -o force

 Description::VxVM code uses a 32-bit unsigned integer variable to store the size of
subdisks, which can only accommodate values less than 2TB. Thus, for larger
subdisk sizes the integer overflows, resulting in the subdisk move operation
failure.

 Resolution::The code has been modified to accommodate larger subdisk sizes.

 * Incident no::2440351	 Tracking ID ::2440349

 Symptom::The grow operation on a DCO volume may grow it into any 'site' not
honoring the allocation requirements strictly.

 Description::When a DCO volume is grown, it may not strictly honor an allocation
specification that restricts it to a particular site, even though the
specification is given explicitly.

 Resolution::The Data Change Object of Volume Manager is modified such that it
will honor the alloc specification strictly if provided explicitly.

 * Incident no::2442850	 Tracking ID ::2317703

 Symptom::When the vxesd daemon is invoked by device attach & removal operations in a loop,
it leaves open file descriptors with vxconfigd daemon

 Description::The issue is caused by multiple vxesd daemon threads trying to establish
contact with the vxconfigd daemon at the same time and ending up losing track of
the file descriptor through which the communication channel was established.

 Resolution::The fix for this issue is to maintain a single file descriptor that has a thread
safe reference counter thereby not having multiple communication channels
established between vxesd and vxconfigd by various threads of vxesd.
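
The idea of a single descriptor guarded by a thread-safe reference counter
can be sketched as follows (Python, illustrative only). The class name
SharedChannel and its connect/close callables are hypothetical and stand in
for whatever vxesd actually uses to reach vxconfigd.

```python
import threading

class SharedChannel:
    """One connection shared by many threads, closed when the last user releases it."""

    def __init__(self, connect, close):
        self._connect = connect    # callable that opens the channel and returns a handle
        self._close = close        # callable that closes a handle
        self._lock = threading.Lock()
        self._handle = None
        self._refs = 0

    def acquire(self):
        with self._lock:
            if self._refs == 0:    # first user opens the channel
                self._handle = self._connect()
            self._refs += 1
            return self._handle

    def release(self):
        with self._lock:
            self._refs -= 1
            if self._refs == 0:    # last user closes it, so nothing is left open
                self._close(self._handle)
                self._handle = None
```

Because the open, the count, and the close all happen under one lock, two
threads arriving together cannot each open a descriptor and then lose one of
them.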

 * Incident no::2477291	 Tracking ID ::2428631

 Symptom::Shared DG import or Node Join fails with Hitachi Tagmastore storage

 Description::CVM uses a different fence key for every DG. The key format is
'NPGRSSSS', where N is the node id (A,B,C..) and 'SSSS' is the sequence number.
Some arrays (e.g. Hitachi Tagmastore) restrict the total number of unique keys
that can be registered, which causes issues for configurations where the number
of DGs, or more precisely the product of #DGs and #nodes in the cluster, is large.

 Resolution::Having a unique key for each DG is not essential. Hence a tunable is added to
control this behavior. 

# vxdefault list
KEYWORD                        CURRENT-VALUE   DEFAULT-VALUE
...
same_key_for_alldgs            off             off
...

Default value of the tunable is 'off' to preserve the current behavior. If a
configuration hits the storage array limit on total number of unique keys, the
tunable value could be changed to 'on'. 

# vxdefault set same_key_for_alldgs on
# vxdefault list
KEYWORD                        CURRENT-VALUE   DEFAULT-VALUE
...
same_key_for_alldgs            on              off
...

This would make CVM generate same key for all subsequent DG imports/creates.
Already imported DGs need to be deported and re-imported for them to take into
consideration the changed value of the tunable.
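
The arithmetic behind the key limit can be illustrated with a small sketch
(Python, illustrative only). The helper names are hypothetical, and both the
rendering of 'SSSS' as four decimal digits and the use of sequence 0 when the
tunable is on are assumptions; only the 'NPGRSSSS' shape comes from the text
above.

```python
def fence_key(node_index, seq):
    """Build a key of the form 'NPGRSSSS' (hypothetical helper)."""
    node = chr(ord("A") + node_index)      # node id: A, B, C, ...
    return f"{node}PGR{seq:04d}"           # SSSS as 4 decimal digits (assumed)

def unique_keys(num_nodes, num_dgs, same_key_for_alldgs=False):
    """Return the set of keys registered by all nodes for all DGs."""
    keys = set()
    for node in range(num_nodes):
        for dg in range(num_dgs):
            seq = 0 if same_key_for_alldgs else dg   # shared sequence is an assumption
            keys.add(fence_key(node, seq))
    return keys

print(len(unique_keys(4, 100)))                            # 400 = #nodes x #DGs
print(len(unique_keys(4, 100, same_key_for_alldgs=True)))  # 4 = #nodes
```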

 * Incident no::2480006	 Tracking ID ::2400654

 Symptom::"vxdmpadm listenclosure" command hangs because of duplicate enclosure entries in
/etc/vx/array.info file.

Example: 

Enclosure "emc_clariion0" has two entries.

#cat /etc/vx/array.info
DD4VM1S
emc_clariion0
0
EMC_CLARiiON
DISKS
disk
0
Disk
DD3VM2S
emc_clariion0
0
EMC_CLARiiON

 Description::When the "vxdmpadm listenclosure" command is run, vxconfigd reads its in-core
enclosure list, which is populated from the /etc/vx/array.info file. Since the
enclosure "emc_clariion0" (as mentioned in the example) is also the last entry
within the file, the command expects vxconfigd to return the enclosure
information at the last index of the enclosure list. However, because of the
duplicate enclosure entries, vxconfigd returns different enclosure information,
thereby leading to the hang.

 Resolution::The code changes are made in vxconfigd to detect duplicate entries in
/etc/vx/array.info file and return the appropriate enclosure information as
requested by the vxdmpadm command.

 * Incident no::2483476	 Tracking ID ::2491091

 Symptom::The vxdisksetup(1M) command fails on disks which have stale EFI information,
with the following error:

VxVM vxdisksetup ERROR V-5-2-4686 Disk <disk name> is currently an EFI formatted
                                  disk. Use -f option to force EFI removal.

 Description::The vxdisksetup(1M) command checks for EFI headers on the disk. If they are
found, the command exits with the suggestion to use the "-f" option. The check
for the EFI headers does not account for stale EFI headers, so the
vxdisksetup(1M) command exits with the error.

 Resolution::The code is modified to check for valid EFI headers.

 * Incident no::2485230	 Tracking ID ::2481938

 Symptom::The vxdisk command displays incorrect pubpath of an EFI partitioned disk on HP
1131 platform.
For example,
# vxdisk list cXtXdXs2
Device:    diskX_p2
devicetag: cXtXdXs2
..
pubpaths:  block=/dev/vx/dmp/cXtXdX char=/dev/vx/rdmp/cXtXdX
...

 Description::The VxVM configuration daemon (vxconfigd) maintains all the disks' information.
On the HP platform, an EFI disk has the EFI_SDI_FLAG flag set. When vxconfigd sets
up an EFI partitioned disk, it first checks EFI_SDI_FLAG and changes
the disk access name according to the flag and the disk's original value. However,
in some cases vxconfigd still uses the original disk access name instead of the
new disk access name, which leads to the incorrect pubpath value.

 Resolution::vxconfigd is modified to use the new disk access name.

 * Incident no::2485278	 Tracking ID ::2386120

 Symptom::Error messages printed in the syslog in the event of master takeover
failure are, in some situations, not enough to find the root cause of the
failure.

 Description::During master takeover, if the new master encounters errors,
the master takeover operation fails. The code has messages that log the
reasons for the failure, but these log messages are not available on customer
setups. They are generally enabled only in internal development/testing
scenarios.

 Resolution::Some of the relevant messages have been modified such that they will
now be available on the customer setups as well, logging crucial information
for root cause analysis of the issue.

 * Incident no::2485288	 Tracking ID ::2431470

 Symptom::vxpfto sets PFTO(Powerfail Timeout) value on a wrong VxVM device.


 Description::vxpfto invokes 'vxdisk set' command to set the PFTO value. 
vxdisk accepts both DA(Disk Access) and DM(Disk Media) names for device 
specification. DA and DM names can have conflicts such that even within the 
same disk group, the same name can refer to different devices - one as a DA 
name and another as a DM name. The vxpfto command uses DM names when invoking the
vxdisk command, but vxdisk chooses a matching DA name before a DM name. This
causes the wrong device to be acted upon.


 Resolution::Fixed the argument check procedure in 'vxdisk set' to follow the
common rule of VxVM: if a disk group is specified with the '-g' option, then
only a DM name is supported; otherwise it can be a DA name.
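
The name-resolution rule can be sketched as below (Python, illustrative
only). The function, its dictionaries, and the device names in the example
are hypothetical; the sketch also simplifies the no-'-g' case to a DA name
lookup only.

```python
def resolve_disk(name, da_names, dm_names_by_dg, diskgroup=None):
    """Pick the device a name refers to, following the '-g' rule (hypothetical helper)."""
    if diskgroup is not None:
        # With a disk group given, only DM names in that group are considered.
        return dm_names_by_dg.get(diskgroup, {}).get(name)
    # Without a disk group, the name is looked up as a DA name.
    return da_names.get(name)

# The same name "disk01" is a DA name for one device and a DM name for another.
da_names = {"disk01": "device-A"}
dm_names_by_dg = {"testdg": {"disk01": "device-B"}}
print(resolve_disk("disk01", da_names, dm_names_by_dg, diskgroup="testdg"))  # device-B
print(resolve_disk("disk01", da_names, dm_names_by_dg))                      # device-A
```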

 * Incident no::2492016	 Tracking ID ::2232411

 Symptom::Subsequent resize operations of raid5 or layered volumes may fail with "VxVM
vxassist ERROR V-5-1-16092 Volume TCv7-13263: There are other recovery activities.
Cannot grow volume"

 Description::If a user tries to grow or shrink a raid5 volume or a layered volume more than
once using vxassist command, the command may fail with the above mentioned error
message.

 Resolution::1. Skip setting recover offset for RAID volumes. 
2. For layered volumes, 
   topvol: skip setting recover offset. 
   subvols: handled separately later (code exists).