* * * READ ME * * *
             * * * Veritas Volume Manager 5.0 MP2 RP3 * * *
                         * * * P-patch 2 * * *
                         Patch Date: 2012-07-17


This document provides the following information:

   * PATCH NAME
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas Volume Manager 5.0 MP2 RP3 P-patch 2


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas Volume Manager 5.0 MP2
   * Veritas Storage Foundation for Oracle RAC 5.0 MP2
   * Veritas Storage Foundation Cluster File System 5.0 MP2
   * Veritas Volume Replicator 5.0 MP2
   * Veritas Storage Foundation 5.0 MP2
   * Veritas Storage Foundation High Availability 5.0 MP2
   * Veritas Storage Foundation for Oracle 5.0 MP2


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
HP-UX 11i v2 (11.23)


INCIDENTS FIXED BY THE PATCH
----------------------------
This patch fixes the following Symantec incidents:

Patch ID: PHCO_43057, PHKL_43058

* 2627009 (Tracking ID: 2413763)

SYMPTOM:
vxconfigd, the VxVM configuration daemon, dumps core with the following stack:

ddl_fill_dmp_info
ddl_init_dmp_tree
ddl_fetch_dmp_tree
ddl_find_devices_in_system
find_devices_in_system
mode_set
setup_mode
startup
main
__libc_start_main
_start

DESCRIPTION:
The Dynamic Multi-Pathing (DMP) node buffer declared in the Device Discovery
Layer (DDL) was not initialized. Because the node buffer is local to the
function, it must be explicitly initialized before another buffer is copied
into it.

RESOLUTION:
The node buffer is now initialized with memset(), which prevents the core dump.
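
The pattern behind this fix can be sketched in C as follows. This is a minimal
illustration under stated assumptions, not the VxVM source: the function name
fill_node_name(), the buffer size NODE_BUF_LEN, and the use of strncpy() are
all hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size; the real DDL buffer size is not stated in this README. */
#define NODE_BUF_LEN 32

/*
 * Copy a device node name through a local scratch buffer into dst and
 * return the length of the stored string. All names are illustrative.
 */
size_t fill_node_name(const char *src, char *dst, size_t dst_len)
{
    char node[NODE_BUF_LEN];

    assert(src != NULL && dst != NULL && dst_len >= NODE_BUF_LEN);

    /* Zero the local buffer first: stack memory holds arbitrary bytes. */
    memset(node, 0, sizeof(node));

    /* Copy at most NODE_BUF_LEN - 1 bytes so the final zero byte survives. */
    strncpy(node, src, sizeof(node) - 1);

    /* The whole buffer can now be copied out without exposing garbage. */
    memcpy(dst, node, sizeof(node));
    return strlen(dst);
}
```

Without the memset() call, a source string of NODE_BUF_LEN - 1 or more bytes
would leave the local buffer without a terminating zero byte, and later reads
could run past its end.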

* 2634096 (Tracking ID: 1206369)

SYMPTOM:
The recoveryoption attribute of an enclosure, when set to nothrottle, does not
persist across a reboot.

DESCRIPTION:
When the 'vxdmpadm setattr' command was used to set the recovery option to
nothrottle, the persistent information was not updated correctly. After a
reboot, the nothrottle recovery option was therefore not applied, and the value
set by the user reverted to the default.

RESOLUTION:
The vxdmp code now updates the nothrottle recovery option in the persistent
information and reads it back at boot time.

* 2662215 (Tracking ID: 2067319)

SYMPTOM:
In a multi-node CVM (Cluster Volume Manager) environment, the vxconfigd process
(VxVM configuration daemon) on the master node may hang during a cluster
reconfiguration.

vxconfigd can be found in a tight loop with the following stack:
  msgtail() + 0x204
  msg() + 0x5c
  send_slaves() + 0xd94
  master_send_abort() + 0x90
  send_slaves() + 0xe4
  master_get_results() + 0x58
  commit() + 0x1acc
  req_vol_commit() + 0x968
  request_loop() + 0xec8
  main() + 0x14e0
  __start() + 0x68

Many messages like the following can be seen in the syslog:
  VxVM vxconfigd WARNING  V-5-1-10377 send_slave: got slave_join: retry later

DESCRIPTION:
In a CVM environment, the master synchronizes the transaction details of any
configuration change among all the joined (passive) slaves. If a CVM
reconfiguration happens while a transaction is in progress, because a new node
joins, the master aborts the transaction. In this situation, a race between a
passive slave and the master causes vxconfigd to hang on the master node.

RESOLUTION:
The transaction abort code is modified to handle the CVM reconfiguration properly.

* 2662216 (Tracking ID: 530741)

SYMPTOM:
In a CVM (Cluster Volume Manager) environment where a private DG (Disk Group)
is imported, the vxconfigd process may dump core when 'vxdg -g <DG> flush' is
executed on the private DG just after the following sequence of commands:

i) A volume stop in the DG fails with error "Error in cluster processing" 

  # vxvol -g <Private DG> stop <Volume>
  VxVM vxvol ERROR V-5-1-10128  Error in cluster processing

ii) Subsequent volume stop succeeds.

  # vxvol -g <Private DG> -f stop <Volume>
  #

The core shows the following stack:

  dbf_fmt_tbl+0x4c0()
  voldbf_fmt_tbl+0x3c()
  voldbsup_format_record+0xa0()
  format_write+0x2d8()
  ddb_update+0x15c()
  dg_update+0x11c()
  req_dg_flush_common+0x354()
  req_dg_flush_name+0x7c()
  request_loop+0xae4()
  main+0xcb4()
  _start+0x108()

Once the DG is deported, the next import fails with the following error:

  # vxdg import <Private DG>
  VxVM vxdg ERROR V-5-1-10978 Disk group <Private DG>: import failed: 
  Disk group has no valid configuration copies

The following entry appears in the messages file:

  vxvm:vxconfigd: [ID 702911 daemon.error]      Disk <Disk>, copy 1: Block 1: Duplicate record in configuration

DESCRIPTION:
On any configuration change, VxVM tries to keep the same configuration data in
user land (vxconfigd), in kernel land (vxio), and in the on-disk database. If a
configuration change transaction encounters an error, VxVM cleans up any
inconsistencies between user land, kernel land, and the on-disk database.

In a CVM environment, if a cluster reconfiguration occurs following a
configuration change transaction on a private DG, the reconfiguration may abort
the transaction. However, when the reconfiguration aborts the transaction in
the kernel, the databases are not cleaned up appropriately.

Another configuration change in the same private DG then results in a duplicate
record in the VxVM configuration database, which leads to a core dump or a disk
group import failure.

This is a rare timing issue, but it may be seen during a normal cluster stop
operation.

RESOLUTION:
The code now performs the appropriate cleanup and sets the appropriate error
code when the transaction is aborted.

* 2803256 (Tracking ID: 2647975)

SYMPTOM:
A Serial Split Brain (SSB) condition caused the Cluster Volume Manager (CVM)
master takeover to fail. The following vxconfigd debug output was seen when the
issue occurred:

VxVM vxconfigd NOTICE V-5-1-7899 CVM_VOLD_CHANGE command received
V-5-1-0 Preempting CM NID 1
VxVM vxconfigd NOTICE V-5-1-9576 Split Brain. da id is 0.5, while dm id is 0.4 for dm cvmdgA-01
VxVM vxconfigd WARNING V-5-1-8060 master: could not delete shared disk groups
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgA: Disabled by errors 
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgB: Disabled by errors 
...
VxVM vxconfigd ERROR V-5-1-11467 kernel_fail_join() :           Reconfiguration interrupted: Reason is transition to role failed (12, 1)
VxVM vxconfigd NOTICE V-5-1-7901 CVM_VOLD_STOP command received

DESCRIPTION:
On Veritas Volume Manager (VxVM) versions 5.0 and 5.1, when the new CVM master
detects a Serial Split Brain (SSB) condition, the default CVM behavior causes
the new CVM master to leave the cluster, which results in cluster-wide
downtime.

RESOLUTION:
When SSB is detected in a disk group during CVM master takeover, CVM now
disables only that disk group and keeps the other disk groups imported. With
the fix applied, the new CVM master does not leave the cluster.

* 2803260 (Tracking ID: 2040150)

SYMPTOM:
I/O error messages for the dmpnode, with reservation conflict errors, are
observed in the event logs, and the disk is marked as failing. The issue occurs
when the total number of PGR keys reaches 32 or more.

DESCRIPTION:
When 32 or more keys are registered, DMP cannot read all of them, because only
the 7th byte of the response buffer is used to get the total number of bytes in
which the keys are stored. With N keys, DMP reads only the first N % 32 keys
(where % is the mathematical modulo).

RESOLUTION:
Per the SCSI-3 standard, the 4th to 7th bytes of the PGR_READ_KEYS response
buffer give the total number of bytes in which the keys are stored. The buffer
length (buflen) is now calculated from the 4th to 7th bytes of the response
buffer.
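
The arithmetic behind the N % 32 limit can be sketched in C as follows. This is
an illustration only, not the DMP source; the function names are hypothetical.
It assumes the standard READ KEYS parameter data layout (a 4-byte big-endian
length field in bytes 4 to 7, counting from 0) and 8-byte reservation keys.
Byte 7 alone can express at most 255 bytes, so with 8-byte keys the count wraps
at 256 / 8 = 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGR_KEY_SIZE 8u   /* each SCSI-3 reservation key is 8 bytes */

/* Length of the key list read from bytes 4..7 (big-endian), as in the fix. */
uint32_t pgr_keys_len(const uint8_t *resp)
{
    assert(resp != NULL);
    return ((uint32_t)resp[4] << 24) | ((uint32_t)resp[5] << 16) |
           ((uint32_t)resp[6] << 8)  |  (uint32_t)resp[7];
}

/* Length of the key list read from byte 7 only, as in the defect. */
uint32_t pgr_keys_len_byte7_only(const uint8_t *resp)
{
    assert(resp != NULL);
    return resp[7];
}

/* Number of whole keys contained in a key list of keys_len bytes. */
uint32_t pgr_key_count(uint32_t keys_len)
{
    return keys_len / PGR_KEY_SIZE;
}
```

For example, with 40 registered keys the length field holds 320 (0x00000140).
Reading all four bytes yields 40 keys, while reading byte 7 alone yields 0x40 =
64 bytes, that is 8 keys, which equals 40 % 32.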

* 2803331 (Tracking ID: 2792748)

SYMPTOM:
In an HP-UX CVM environment, the slave join fails with the following error
message in the syslog:

VxVM vxconfigd ERROR V-5-1-5784 cluster_establish:kernel interrupted vold on overlapping reconfig.

DESCRIPTION:
During the join, the slave node performs a disk group import. As part of the
import, the file descriptor for "Port u" is closed because the return value of
open() is assigned incorrectly. A subsequent write to the same port therefore
returns EBADF.

RESOLUTION:
The code is corrected by adding parentheses, which prevents the file descriptor
from being closed in error. The fix also allows the pfto attribute to be set
to 0.
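
This README does not show the affected code. As an assumption about what a
bracket-related mistake around the return value of open() can look like, the
following C sketch contrasts the two forms. The function names are hypothetical
and the code is not taken from VxVM. In C, "<" binds more tightly than "=", so
the statement "fd = open(path, O_RDONLY) < 0" stores the result of the
comparison in fd rather than the descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns the value that "fd = open(path, O_RDONLY) < 0" would store in fd.
 * The comparison is written out explicitly so the example compiles cleanly.
 */
int fd_without_parens(const char *path)
{
    int real = open(path, O_RDONLY);
    int fd = (real < 0);      /* same as: fd = (open(...) < 0) */

    if (real >= 0)
        close(real);          /* example only: do not leak the descriptor */
    return fd;                /* 0 on success, 1 on failure */
}

/* Parenthesized form: fd holds the descriptor, or -1 if open() failed. */
int fd_with_parens(const char *path)
{
    int fd;

    assert(path != NULL);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    return fd;
}
```

In the first form fd never holds the real descriptor, so a later close(fd) or
write(fd, ...) acts on descriptor 0 or 1 instead. A mix-up of this kind would
be consistent with the wrong descriptor being closed and a later write
returning EBADF, as described above.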


INSTALLING THE PATCH
--------------------
To install the patch, enter the following command:

        $ swinstall -x autoreboot=true <patch id>

After installing the patches, run swverify to make sure that they are
installed correctly:

        $ swverify <patch id>


REMOVING THE PATCH
------------------
To remove the patch, enter the following command:

        # swremove -x autoreboot=true <patch id>


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
---------------------------
Incidents fixed in MP2RP3P1:
==========================
unixvm-cvs:
Incident parent                 Abstract
-------- ---------------------------------------------------------------------
2049321 (2441937) vxconfigrestore precommit fails with awk errors
2138782 (1822681) memory leak in vxio/voldrl_cleansio_start
2215565 (2209866) Inconsistent behavior with handling of siteconsistent flag
2236561 (990338)  FMR Refreshing a snapshot should keep the same name for the snap object
2248993 (1715204) vxsnap operations leads to orphan snap obj in case of any failure occurs during operation, orphan snap object can't be removed.
2274273 (850816)  Parallel 'vxsnap reattach' operations can cause data corruption. Also, lead to orphan snap objects.
2277306 (2183984) System panic in dmp_update_stats() routine
2280633 (2280624) Need to set site consistent only on mirrored-volumes.
2323929 (2323925) If rootdisk is encapsulated and if install-db is present, clear warning should be displayed on system boot.
2325112 (1545835) vxconfigd core dump during system boot after VxVM4.1RP4 applied.
2353415 (2349352) During LUN provisioning in single path IO mode environment a data corruption is observed
2353422 (2334534) In CVM environment, vxconfigd level join is hung when Master returns error "VE_NO_JOINERS" to a joining node and cluster nidmap is changed in new reconfiguration
2360720 (2359814) vxconfigbackup doesn't handle errors well
2368917 (1818780) Oakmont Linux consume many memories during dmp test.
2419942 (844758)  vxbrk_rootmir fails with second swapvol
2497156 (2165141) vxvm resets b_clock_ticks to zero if I/O hints are passed by VxFS
2497157 (2026773) DMP: vxconfigd hang after array side port disable followed by vxdisk scandisks
2497160 (1268784) Memory Leaks in VxVM plugin of VxMS
2497164 (1381772) Slow performance of snapshot backups after applying ET1269854.
2497166 (1215169) Customer concerned about high memory consumption by vxesd on 5.0MP2RP2 on HP 11.23
2497228 (2481938) vxconfigbackup throwing an error when DG contains a sectioned disk
2530898 (1137504) vxesd -k does not kill existing instance but starts multiple ones

vmprov:
2574355 Create a vmprovider patch for 5.0/11.23 MP2RP3P1

Incidents fixed in MP2RP3:
==========================
Incident parent                 Abstract
-------- ---------------------------------------------------------------------
1045826 (929437)  HP-UX 11.23 - vxvm 5.0 MP1 - errors from apm_keyget: invalid APM crc
1514814 (1362609) vxconfigd ERROR V-5-1-12826 osuuid invalid guid in console.log VxVM 5.0
1677976 (1976961) vxconfigd is hung in dmp_close_path
1937718 (1933970) Restrict the max_specialio tunable value to the permissible limit.
2000303 (1589715) vxconfigd dumps core, after vxdmpadm getportids ctlr=<ctlr_name>  on a disabled ctlr
2027839 (2027831) vxdg free not reporting free space correctly on CVM master. vxprint not printing DEVICE column for SDs
2049394           /etc/vx/diag.d/vxcvmdiag cvminfo core dumps on HPUX 5.0MP1:UNOF_MP1RP6HF2 (Samsung Cards current VM patch level)
2078241 (2078209) resize of vxvm volume resulting in vxconfigd hang -- Need to increase the default value for volpagemod_max_memsz
2091195 (1224659) Customer appears to be hitting e1224659 when running vxconfigbackup on 5.0MP2RP2 on HP 11.23
2111423 (1488399) DMP: Detection of I/O being sent on SCTL device on HP
2111424 (1485075) vmtest/tc/scripts/admin/voldg/cds/set.tc hits DMP ted assert dmp_select_path:2a
2111441 (1954062) vxrecover results in os crash
2111639 (1114178) vxconfigd dumped a core in req_dg_get_info_common()
2111672 (1528932) vxconfigd asserts in config_db_disable()
2112470 (339187)  CVM activation tag in vxprint -m output breaks vxprint
2112477 (1138518) vx commands (such as vxdg, vxdctl, vxdisk) hanging

Incidents fixed in MP2RP2:
==========================
1927987 (1927982) vxpfto returns error arbitrarily even though the command sets the values correctly.
1944259 (1701865) join failed due to interrupted reconfig on joiner and same reconfig completed on master
1451436 (1593032) vxfenconfig ERROR V-11-2-1064 DMP Idle Lun Monitoring DeRegistration FAILED
1957355 (1744224) FMR3: multiple vxplex attach cmds running in parallel on a volume lead to clearing DCO map and subsequently lead to corruption
1937832 (1755466) vol_find_ilock: searching of ilock is inefficient
1938088 (1755830) kmsg: sender: the logic for resend of messages needs to be optimized
1938117 (1755810) kmsg: sender thread is woken up unnecessarily during flowcontrol
1938114 (1755788) for a broadcast message, sender thread may end up sending the same message multiple times (not resend)
1938036 (1755519) kmsg layer: receiver side flowcontrol is not supported
1938076 (1755628) kmsg layer: with heavy messaging in the cluster the receiver thread slows down processing
1275028 (927444)  Makefile.kernel file needs dcoinc.h header file
1849485 (1677416) Node is not joining back into the cluster
1969589 (1969526) panic in voldiodone when a hung priv region I/O comes back
1944180 (1819777) panic caused because of voldisk getting deleted in kernel when I/Os are active, due to duplicate da rid.
1946107 (1435470) Cluster nodes panicked in voldco_or_pvmbuf_to_pvmbuf code after installing 5.0MP3
1957358 (1729558) multiple vxplex attach cmds running in parallel on a volume lead to clearing DCO map and subsequently lead to corruption in FMR2
1874059 (1435681) vxesd looping, using ~100% of one CPU.
1902781 (1228526) Running vxdg flush on a slave node in a cvm cluster disables the disk group
1924619 (1532363) vxdisk 'updateudid' is corrupting diskid. Import of diskgroup fails.
1587888 (1587885) vxdiskunsetup fails with error "awk: Input line cannot be longer than 3,000 bytes"
1876291 (913890)  EMC ASL (libvxemc.so) with PowerPath co-existence is unable to skip LUNZ disks (CLARiiON) and PP co-existence broken on HP
1946112 (1471581) vxconfigd may hang when checking for ecopy functionality on array (ASL)
1946117 (1742702) vxvmconvert fails, probably due to wrong disk capacity calculation
1946110 (147037)  vxconfigd cores at start up
1946109 (1463547) Persistent vxconfigd core dump on dynamic LUN reconfiguration.
1921587 (1907796) Corrupted Blocks in Oracle after Dynamic LUN expansion and vxconfigd core dump
1946105 (1421078) Manpage for vxdg(1M) needs to cover last shared dg disk detach scenario better
1946116 (1059720) Switching to EBN does not immediately show EFI disks correctly
1885021 (839077)  vxresize fails on filesystems greater than 2TB
1946114 (1676061) System panic'd after 2 out of 4 paths to disk were removed.
1946118 (1192166) vxdg -n [newdg] deport [origdg] causes a sort of memory leak
1946104 (1203661) vxclustadm man page needs mcsg instead of hpsg, redundant ifdefs
1501595 (1650955) 'vxdctl enable' caused node panic after one path unmasked/unpresented on array


Incidents Fixed in MP2RP1
==================
e1164654 (795042)  vxvmconvert tools needs to be modified to eliminate the use of private LVM headers.
e1274122 (828910)  Double free of memory in voldg_clean_cpulist()
e1274138 (1087073) disk.convert script prints VGs converted list when one or more failed
e1274155 (1015605) Poison nibble: nibble of 0xc in the dmp minor will cause panic
e1274241 (1001370) Lun reuse issue not fixed by the DMP backport hotfix
e1274243 (1064826) "Could not do stat on path /devsdw"
e1274255 (972406)  vxconfigd hang
e1360836 (1321272) vxcommands hanging after re-connect the FC-site link
e1361304 (1260745) Node is not joining the cluster after reconnect the FC & heartbeat link.
e1394216 (1393764) vxconfigd hung on node which tries to become master on site2 when FC and heartbeat links are disabled at the same time.
e1409142 (1361260) Slow I/O performance with VxFS filesystems on mirror-concat VxVM volumes with DCO and DRL.
e1455184 (1414336) Disk devices do not appear in vxdisk list, but in vxprint
e1470963 (1599295) /vmtest/tc/scripts/support/vxdisksetup/setup.tc is FAILED with vxdisksetup on IA
e1501518 (1395616) vxdmp for the PGR_PREEMPT command unnecessarily retries on all paths and multiple times incase of RESV_CONFLICT
e1501593 (1426480) VOLCVM_CLEAR_PR ioctl does not propagate the error returned by DMP to the caller
e1504534 (795129)  DG loses - CVM/VxVM with MC/SG
e1504777 (1131566) If a config or klog copy hits an error, vxconfigd should validate and possibly detach the disk
e1507982 (1507935) 5.0MP3RP1 Campus Cluster: vxconfigd core dumps when settag set to long sitename
e1509485 (1220091) Reduce slave disk re-onlines from SLAVE_DISK_OP_NOTIFY request from master.
e1514000 (1458481) volfmr_copymaps_instant panic on node of cluster during shared and private dg creations
e1522342 (1269468) Enclosure removed/presented back multiple times, whilst vxconfigd is restarted, core dumps
e1528686 (1227106) HxRT SFOR 5.0mp3 PowerPath/DMX : PGR key issues - Uncertain PGR key number and vold_pgr_unregister failed errors
e1542863 (1541662) System panicked in DRL code when running flashsnap
e1543470 (1159227) Getting core file related to vxesd in SFORAHA stack using combo installer.
e1555898 (1068626) Full resync occurred in remaining nodes after SFORAC panic rebooted
e1557153 (1729344) vxdg deport hung
e1592476 (1543908) While running vxevac command, Oracle process thread stuck into ogetblk() which leads to i/o hang.
e1631998 (1397234) Vm command hung consistently during DMP testing on PA machine.
e1632029 (1392872) Nodes has panicked on which failover has happened when master TOCed and disable FC and both sites are writing to same volume.
e1632058 (1393756) Vxcommands hung on master & slave after FC-site link disconnected
e1632081 (1321296) vxassist core dump
e1636487 (1289510) vxconfigd dumps core during vmcert run and later vm hung
e1650957 (1228140) After setting path attribute to active, path state is not updated.
e1670680 (1787772) deporting a dg hangs after re-connect the FC site Link
e1719779 (1468885) The vxbrk_rootmir script does not complete and is hanging after invoking vxprivutil
e1363314 (1260746) Node not joining back with 2min delay in disconnecting FC & heartbeat link
e1586930 (1878759) panic on 11.23 IVM Guest machine

Incidents Fixed in MP2
==============================
1003433 (600447)  vxprivutil dumpconfig <disk>  is showing last_platform as 0x0#bad
1274123 (832350)  vxdctl's initdmp section of man page require correction
1274142 (524055)  vxvm:vxassist : ERROR:Cannot update volume vol-1
1274148 (900090)  HP -- MSA arrays do not have an ASL and are not recognized as a JBOD
1274177 (1090155) During vxevac, the vxsd command can absorb all nfile resources in kernel
1274185 (1067501) 'vxdisksetup -iB' incorrectly calculates publen
1274194 (1079281) vxconfigrestore hits awk limitation in cbr_res_main()
1274272 (1189432) (DS6000) vxdmpadm disable/enable ctlr will hang all VX commands
1274276 (1211302) EVA6k/HP-UX : Slave node panics after disable primary paths.
1299382 (1053529) Unable to import a shared disk group on the DR site
1299403 (1213239) CVM: Recovery for subvolumes(of a layered volume) does not happen due to missing -f option
1360759 (1260756) vxconfigd core dumps after fix for vxcommands hanging is applied
1360849 (1260757) Master node is getting crashed after reattach the site
1362201 (853822)  master stuck in 'master selection' during random shutdown -r (updown) on 16 node cluster
1362204 (865400)  panic in vol_kmsg_handle_send_err+00035C during shutdown -r and rejoin of 1 node in 16 node cluster
1376234 (1265794) vxvol set doesn't allow changing campus cluster options while the volume is open
1381724 (1321475) Join Failure Panic Loop on axe76 cluster
1415504 (990475)  FMR2 : Oring of DRL Recovery Map with FMR Detach Map when the volume is opened in RWBK mode
1416347 (1416080) System panic in vol_change_disk() routine due to NULL dereference.
1416349 (1386980) Panic in vol_putdisk(). Looks like another version of same problem as in e1288427
1422656 (1004746) New TC's for Fmr2.
1427498 (913656)  abort from CBO gets stuck due to deadlock causing safetytimer expiry
1427499 (1133089) CVM: In consecutive master takeovers, slave state is not reset appropriately
1427500 (1145348) Reconfiguration deadlock in master node and passive slaves during DRL Rebuild response
1427503 (1156613) Safety timer expiry due to CVM reconfig taking too long for SG's comfort.
1427504 (1171932) CVM: master takeover reconfig continues on master but gets interrupted on slaves with a JOIN causing deadlock
1427505 (1168279) volsio stuck in defer Q causing reconfiguration to hang
1427506 (1210957) Vxio hung IO's and uncorrectable write errors
1450039 (1443679) FMR3: I/Os initiating DCO updates for clearing DRL async clear region may not wait for its completion.
1450048 (1246785) Panic in dmp_get_iocount() due to invalid cpu table address.
1450098 (1207898) Upon stop all nodes in the RAC cluster, some stale fencing keys are left from Master node with HDS USP V array
1450934 (1450932) Enhance DMP's delay queue processing logic to avoid infinite retries
1453894 (1453694) System panic in scsi_strategy_real() when an extra paths are added to an existing LUN on the fly.
1459367 (1033534) Enhancements for online and offline disk operations
1465688 (1461717) 'vxsnap make' command result in vxconfigd and IO sleep too long time