Volume Manager on Solaris, patch detail

vm-sol_x64-5.1SP1RP2P2 (obsolete). Latest patches: sfha-sol10_x64-5.1SP1RP4, sfha-sol10_x64-5.1SP1PR3RP4
Basic information
Release type:	P-patch
Release date:	2011-11-03
OS update support:	None
Technote:	None
Documentation:	None
Popularity:	1095 viewed downloaded
Download size:	37.16 MB
Checksum:	151063201
Applies to one or more of the following products:
VirtualStore 5.1SP1 On Solaris 10 X64
VirtualStore 5.1SP1PR3 On Solaris 10 X64
Dynamic Multi-Pathing 5.1SP1 On Solaris 10 X64
Storage Foundation 5.1SP1 On Solaris 10 X64
Storage Foundation Cluster File System 5.1SP1 On Solaris 10 X64
Storage Foundation for Oracle RAC 5.1SP1 On Solaris 10 X64
Storage Foundation HA 5.1SP1 On Solaris 10 X64
Obsolete patches, incompatibilities, superseded patches, or other requirements:
This patch is obsolete. It is superseded by:	Release date
sfha-sol10_x64-5.1SP1PR3RP4	2013-08-21
sfha-sol10_x64-5.1SP1RP4	2013-08-21
vm-sol10_x64-5.1SP1RP3P2 (obsolete)	2013-03-15
sfha-sol_x64-5.1SP1PR3RP3 (obsolete)	2012-10-02
sfha-sol_x64-5.1SP1RP3 (obsolete)	2012-10-02
vm-sol_x64-5.1SP1RP2P3 (obsolete)	2012-06-13
This patch supersedes the following patches:	Release date
vm-sol_x64-5.1SP1RP2P1 (obsolete)	2011-10-19
vm-sol_x64-5.1SP1RP1P2 (obsolete)	2011-06-07
vm-sol_x64-5.1SP1RP1P1 (obsolete)	2011-03-02
vm-sol_x64-5.1SP1P2 (obsolete)	2010-12-07
This patch requires:	Release date
sfha-sol_x64-5.1SP1PR3RP2 (obsolete)	2011-09-28
sfha-sol_x64-5.1SP1RP2 (obsolete)	2011-09-28
Fixes the following incidents:
2440015, 2477272, 2497637, 2497796, 2507120, 2507124, 2508294, 2508418, 2511928, 2515137, 2525333, 2531983, 2531987, 2531993, 2552402, 2553391, 2562911, 2563291, 2574840, 2583307
Patch ID:
142630-14
Readme file
                          * * * READ ME * * *
             * * * Veritas Volume Manager 5.1 SP1 RP2 * * *
                         * * * P-patch 2 * * *
                         Patch Date: 2011-11-01


This document provides the following information:

   * PATCH NAME
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas Volume Manager 5.1 SP1 RP2 P-patch 2


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas Volume Manager 5.1 SP1
   * Veritas Storage Foundation for Oracle RAC 5.1 SP1
   * Veritas Storage Foundation Cluster File System 5.1 SP1
   * Veritas Storage Foundation 5.1 SP1
   * Veritas Storage Foundation High Availability 5.1 SP1
   * Veritas Dynamic Multi-Pathing 5.1 SP1
   * Symantec VirtualStore 5.1 SP1
   * Symantec VirtualStore 5.1 SP1 PR3


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
Solaris 10 X86


INCIDENTS FIXED BY THE PATCH
----------------------------
This patch fixes the following Symantec incidents:

Patch ID: 142629-14

* 2583307 (Tracking ID: 2185069)

SYMPTOM:
In a CVR setup, while application I/Os are in progress on all nodes of the
primary, bringing down a slave node causes a panic on the master node with the
following stack trace:

 #0 [ffff8800282a3680] machine_kexec at ffffffff8103695b
 #1 [ffff8800282a36e0] crash_kexec at ffffffff810b8f08
 #2 [ffff8800282a37b0] oops_end at ffffffff814cbbd0
 #3 [ffff8800282a37e0] no_context at ffffffff8104651b
 #4 [ffff8800282a3830] __bad_area_nosemaphore at ffffffff810467a5
 #5 [ffff8800282a3880] bad_area_nosemaphore at ffffffff81046873
 #6 [ffff8800282a3890] do_page_fault at ffffffff814cd658
 #7 [ffff8800282a38e0] page_fault at ffffffff814caf45
    [exception RIP: vol_rv_async_childdone+876]
    RIP: ffffffffa080b7ac  RSP: ffff8800282a3990  RFLAGS: 00010006
    RAX: ffff8801ee8a5200  RBX: ffff8801f6e17200  RCX: ffff8802324290c0
    RDX: ffff8801f7c8fac8  RSI: 0000000000000009  RDI: ffff8801f7c8fac8
    RBP: ffff8800282a3a00   R8: ffff8801f38d8000   R9: 0000000000000001
    R10: 000000000000003f  R11: 000000000000000c  R12: ffff8801f2580000
    R13: ffff88021bdfa7c0  R14: ffff8801f7c8fa00  R15: ffff8801ed46a200
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff8800282a3a08] volsiodone at ffffffffa0672c3e
 #9 [ffff8800282a3a88] vol_subdisksio_done at ffffffffa06764a7
#10 [ffff8800282a3ac8] volkcontext_process at ffffffffa0642a59
#11 [ffff8800282a3b18] voldiskiodone at ffffffffa062f1c1
#12 [ffff8800282a3bc8] voldiskiodone_intr at ffffffffa062f3a2
#13 [ffff8800282a3bf8] voldmp_iodone at ffffffffa05f7806
#14 [ffff8800282a3c08] bio_endio at ffffffff811a0d3d
#15 [ffff8800282a3c18] gendmpiodone at ffffffffa059594a
#16 [ffff8800282a3c68] dmpiodone at ffffffffa0596cf2
#17 [ffff8800282a3cb8] bio_endio at ffffffff811a0d3d
#18 [ffff8800282a3cc8] req_bio_endio at ffffffff8123f7fb
#19 [ffff8800282a3cf8] blk_update_request at ffffffff8124083f
#20 [ffff8800282a3d58] blk_update_bidi_request at ffffffff81240ba7
#21 [ffff8800282a3d88] blk_end_bidi_request at ffffffff81241c7f
#22 [ffff8800282a3db8] blk_end_request at ffffffff81241d20
#23 [ffff8800282a3dc8] scsi_io_completion at ffffffff8134a42f
#24 [ffff8800282a3e48] scsi_finish_command at ffffffff81341812
#25 [ffff8800282a3e88] scsi_softirq_done at ffffffff8134aa3d
#26 [ffff8800282a3eb8] blk_done_softirq at ffffffff81247275
#27 [ffff8800282a3ee8] __do_softirq at ffffffff81073bd7
#28 [ffff8800282a3f58] call_softirq at ffffffff810142cc
#29 [ffff8800282a3f70] do_softirq at ffffffff81015f35
#30 [ffff8800282a3f90] irq_exit at ffffffff810739d5
#31 [ffff8800282a3fa0] smp_call_function_single_interrupt at ffffffff8102eab5
#32 [ffff8800282a3fb0] call_function_single_interrupt at ffffffff81013e33
--- <IRQ stack> ---
#33 [ffff8801f3ca9af8] call_function_single_interrupt at ffffffff81013e33
    [exception RIP: page_waitqueue+125]
    RIP: ffffffff8110b16d  RSP: ffff8801f3ca9ba8  RFLAGS: 00000213
    RAX: 0000000000000b9d  RBX: ffff8801f3ca9ba8  RCX: 0000000000000034
    RDX: ffff880000027d80  RSI: 0000000000000000  RDI: 00000000000003df
    RBP: ffffffff81013e2e   R8: ea00000000000000   R9: 5000000000000000
    R10: 0000000000000000  R11: ffff8801ecd0f268  R12: ffffea0006c13d40
    R13: 0000000000001000  R14: ffffffff8119d881  R15: ffff8801f3ca9b18
    ORIG_RAX: ffffffffffffff04  CS: 0010  SS: 0018
#34 [ffff8801f3ca9bb0] unlock_page at ffffffff8110c16a
#35 [ffff8801f3ca9bd0] blkdev_write_end at ffffffff811a3cd0
#36 [ffff8801f3ca9c00] generic_file_buffered_write at ffffffff8110c944
#37 [ffff8801f3ca9cd0] __generic_file_aio_write at ffffffff8110e230
#38 [ffff8801f3ca9d90] blkdev_aio_write at ffffffff811a339c
#39 [ffff8801f3ca9dc0] do_sync_write at ffffffff8116c51a
#40 [ffff8801f3ca9ef0] vfs_write at ffffffff8116c818
#41 [ffff8801f3ca9f30] sys_write at ffffffff8116d251
#42 [ffff8801f3ca9f80] sysenter_dispatch at ffffffff8104ca7f

DESCRIPTION:
The panic occurs because access to an internal data structure is not
properly serialized, which corrupts that data structure.

RESOLUTION:
Access to the internal data structure is now properly serialized so that
its contents are not corrupted under any scenario.
Patch ID: 142629-13

* 2440015 (Tracking ID: 2428170)

SYMPTOM:
I/O hangs when reading or writing to a volume after a total storage 
failure in CVM environments with Active-Passive arrays.

DESCRIPTION:
In the event of a storage failure in active-passive environments, the CVM-DMP
failover protocol is initiated. This protocol is responsible for coordinating
the failover of primary paths to secondary paths on all nodes in the cluster.
In the event of a total storage failure, where both the primary paths and
secondary paths fail, in some situations the protocol fails to clean up some
internal structures, leaving the devices quiesced.

RESOLUTION:
After a total storage failure, all devices should be un-quiesced, allowing the
I/Os to fail. The CVM-DMP protocol has been changed to clean up devices even if
all paths to a device have been removed.

* 2477272 (Tracking ID: 2169726)

SYMPTOM:
After an import operation, the imported diskgroup contains a combination of
cloned and original disks. For example, after importing a diskgroup that has
four disks, two of the disks in the imported diskgroup are cloned disks and the
other two are original disks.

DESCRIPTION:
For a particular diskgroup, if some of the original disks are not available at
the time of the diskgroup import operation and the corresponding cloned disks
are present, then the diskgroup imported through the vxdg import operation
contains a combination of cloned and original disks.
Example:
A diskgroup named dg1 with the disks disk1 and disk2 exists on a machine.
Clones of these disks, named disk1_clone and disk2_clone, are also available.
If disk2 goes offline and the import for dg1 is performed, the resulting
diskgroup contains disks disk1 and disk2_clone.

RESOLUTION:
The diskgroup import operation now considers cloned disks only if no original
disk is available. If any of the original disks exist at the time of the import
operation, the import is attempted using original disks only.
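
The selection rule in this resolution can be sketched in a few lines of Python.
This is an illustration only, not the vxdg implementation: the Disk class, its
fields, and the select_import_disks function are hypothetical names, and the
disk names are taken from the example in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    name: str
    is_clone: bool
    online: bool


def select_import_disks(disks):
    """Return the disks an import would use under the fixed rule."""
    available = [d for d in disks if d.online]
    originals = [d for d in available if not d.is_clone]
    # Clones are considered only when no original disk is available.
    if originals:
        return originals
    return [d for d in available if d.is_clone]


dg1 = [
    Disk("disk1", is_clone=False, online=True),
    Disk("disk2", is_clone=False, online=False),  # disk2 has gone offline
    Disk("disk1_clone", is_clone=True, online=True),
    Disk("disk2_clone", is_clone=True, online=True),
]
print([d.name for d in select_import_disks(dg1)])  # ['disk1']
```

With disk2 offline, the sketch selects disk1 alone and never mixes in
disk2_clone, which is the combination reported in the symptom.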

* 2497637 (Tracking ID: 2489350)

SYMPTOM:
In a Storage Foundation environment running Symantec Oracle Disk Manager (ODM),
Veritas File System (VxFS), Cluster Volume Manager (CVM) and Veritas Volume
Replicator (VVR), kernel memory is leaked under certain conditions.

DESCRIPTION:
In CVR (CVM + VVR), under certain conditions (for example, when I/O throttling
is enabled or the kernel messaging subsystem is overloaded), the I/O resources
allocated earlier are freed and the I/Os are restarted afresh. While freeing
the I/O resources, the VVR primary node does not free the kernel memory
allocated for the FS-VM private information data structure, which leaks 32
bytes of kernel memory for each restarted I/O.

RESOLUTION:
Code changes are made in VVR to free the kernel memory allocated for FS-VM
private information data structure before the I/O is restarted afresh.
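
To give a sense of scale for the leak described above, the following Python
sketch multiplies the 32 bytes per restarted I/O by a few restart counts. The
restart counts are hypothetical figures chosen for illustration and are not
measurements from any system.

```python
LEAK_PER_RESTARTED_IO = 32  # bytes, as stated in the description above


def leaked_bytes(restarted_ios):
    """Kernel memory leaked before the fix for a number of restarted I/Os."""
    return restarted_ios * LEAK_PER_RESTARTED_IO


# Hypothetical restart counts, for illustration only.
for count in (1_000, 1_000_000, 100_000_000):
    mib = leaked_bytes(count) / 2**20
    print(f"{count:>11,} restarted I/Os -> {mib:10.2f} MiB leaked")
```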

* 2497796 (Tracking ID: 2235382)

SYMPTOM:
I/Os can hang in the DMP driver when a path failover is carried out while
I/Os are in progress.

DESCRIPTION:
While restoring a failed path to a non-A/A LUN, the DMP driver checks whether
any I/Os are pending on the same dmpnode. If any are present, DMP marks the
corresponding LUN with a special flag so that path failover/failback can be
triggered by the pending I/Os. There is a window here: if all the pending I/Os
return before the dmpnode is marked, any future I/Os on the dmpnode get stuck
in wait queues.

RESOLUTION:
The flag is now set on the LUN only while it has pending I/Os, so that failover
can be triggered by those pending I/Os.
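
One common way to close a check-then-mark window of this kind is to perform the
check and the marking under the same lock. The Python sketch below shows that
idea on a toy model. It is an assumption about the general technique, not the
DMP driver code, and the DmpNode class and its members are hypothetical names.

```python
import threading


class DmpNode:
    """Toy model of a dmpnode; names are hypothetical, not DMP internals."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pending_ios = 0
        self.failover_flag = False

    def io_start(self):
        with self._lock:
            self.pending_ios += 1

    def io_done(self):
        with self._lock:
            self.pending_ios -= 1

    def mark_for_failover(self):
        # Check and mark under one lock so that the pending I/Os cannot all
        # complete between the check and the marking.
        with self._lock:
            if self.pending_ios > 0:
                self.failover_flag = True
            return self.failover_flag


idle = DmpNode()
busy = DmpNode()
busy.io_start()
print(idle.mark_for_failover(), busy.mark_for_failover())  # False True
```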

* 2507120 (Tracking ID: 2438426)

SYMPTOM:
The following messages are displayed after vxconfigd is started.

pp_claim_device: Could not get device number for /dev/rdsk/emcpower0 
pp_claim_device: Could not get device number for /dev/rdsk/emcpower1

DESCRIPTION:
The Device Discovery Layer (DDL) incorrectly marks a path under a DMP device
with the EFI flag even though there is no corresponding Extensible Firmware
Interface (EFI) device in /dev/[r]dsk/. As a result, the Array Support Library
(ASL) issues a stat command on a non-existent EFI device and displays the above
messages.

RESOLUTION:
The EFI flag is no longer set on Dynamic Multi-Pathing (DMP) paths that
correspond to non-EFI devices.

* 2507124 (Tracking ID: 2484334)

SYMPTOM:
The system panics with the following stack while collecting DMP statistics.

dmp_stats_is_matching_group+0x314()
dmp_group_stats+0x3cc()
dmp_get_stats+0x194()
gendmpioctl()
dmpioctl+0x20()

DESCRIPTION:
Whenever new devices are added to the system, the stats table is adjusted to
accommodate the new devices in DMP. There is a race between the stats
collection thread and the thread that adjusts the stats table to accommodate
the new devices. The race can cause the stats collection thread to access
memory beyond the known size of the table, causing the system panic.

RESOLUTION:
The stats collection code in the DMP is rectified to restrict the access to the 
known size of the stats table.
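
The bound check described in this resolution can be pictured with the Python
sketch below. It is a simplified illustration under assumed names
(collect_stats, stats_table, known_size) and does not reflect the actual DMP
data structures.

```python
def collect_stats(stats_table, known_size, device_indexes):
    """Return stats only for indexes inside the known table size."""
    collected = {}
    for index in device_indexes:
        # Indexes of devices added after the table was sized are skipped
        # instead of being read from beyond the end of the table.
        if 0 <= index < known_size:
            collected[index] = stats_table[index]
    return collected


table = [10, 20, 30]
# Indexes 3 and 4 stand for newly added devices not yet in the table.
print(collect_stats(table, len(table), [0, 2, 3, 4]))  # {0: 10, 2: 30}
```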

* 2508294 (Tracking ID: 2419486)

SYMPTOM:
Data corruption is observed with a single path when the naming scheme is
changed from enclosure-based naming (EBN) to OS native naming (OSN).

DESCRIPTION:
Data corruption can occur in the following configuration when the naming
scheme is changed while applications are online:

1. The DMP device is configured with a single path, or the devices are
   controlled by a third-party multipathing driver (for example, MPxIO or MPIO).

2. The DMP device naming scheme is EBN (enclosure-based naming) and
   persistence=yes.

3. The naming scheme is changed to OSN using the following command:
   # vxddladm set namingscheme=osn

The name of a VxVM device (DA record) can change while the naming scheme is
changing. As a result, the device attribute list is updated with the new DMP
device names. Due to a bug in the code that updates the attribute list, the
VxVM device records are mapped to the wrong DMP devices.

Example:

The following are the device names with the EBN naming scheme.

MAS-usp0_0   auto:cdsdisk    hitachi_usp0_0  prod_SC32    online
MAS-usp0_1   auto:cdsdisk    hitachi_usp0_4  prod_SC32    online
MAS-usp0_2   auto:cdsdisk    hitachi_usp0_5  prod_SC32    online
MAS-usp0_3   auto:cdsdisk    hitachi_usp0_6  prod_SC32    online
MAS-usp0_4   auto:cdsdisk    hitachi_usp0_7  prod_SC32    online
MAS-usp0_5   auto:none       -            -            online invalid
MAS-usp0_6   auto:cdsdisk    hitachi_usp0_1  prod_SC32    online
MAS-usp0_7   auto:cdsdisk    hitachi_usp0_2  prod_SC32    online
MAS-usp0_8   auto:cdsdisk    hitachi_usp0_3  prod_SC32    online
MAS-usp0_9   auto:none       -            -            online invalid
disk_0       auto:cdsdisk    -            -            online
disk_1       auto:none       -            -            online invalid

bash-3.00# vxddladm set namingscheme=osn

The following is the output after executing the above command.
MAS-usp0_9 has changed to MAS-usp0_6 and the following devices
have changed accordingly.

bash-3.00# vxdisk list
DEVICE       TYPE            DISK         GROUP        STATUS
MAS-usp0_0   auto:cdsdisk    hitachi_usp0_0  prod_SC32    online
MAS-usp0_1   auto:cdsdisk    hitachi_usp0_4  prod_SC32    online
MAS-usp0_2   auto:cdsdisk    hitachi_usp0_5  prod_SC32    online
MAS-usp0_3   auto:cdsdisk    hitachi_usp0_6  prod_SC32    online
MAS-usp0_4   auto:cdsdisk    hitachi_usp0_7  prod_SC32    online
MAS-usp0_5   auto:none       -            -            online invalid
MAS-usp0_6   auto:none       -            -            online invalid
MAS-usp0_7   auto:cdsdisk    hitachi_usp0_1  prod_SC32    online
MAS-usp0_8   auto:cdsdisk    hitachi_usp0_2  prod_SC32    online
MAS-usp0_9   auto:cdsdisk    hitachi_usp0_3  prod_SC32    online
c4t20000014C3D27C09d0s2 auto:none       -            -            online invalid
c4t20000014C3D26475d0s2 auto:cdsdisk    -            -            online

RESOLUTION:
Code changes are made to update device attribute list correctly even if name of
the VxVM device is changed while the naming scheme is changing.

* 2508418 (Tracking ID: 2390431)

SYMPTOM:
In a Disaster Recovery environment, when the DCM (Data Change Map) is active
and an SRL (Storage Replicator Log)/DCM flush is in progress, the system panics
due to a missing parent on one of the DCMs in an RVG (Replicated Volume Group).

DESCRIPTION:
The DCM flush happens during every log update, and its frequency depends on the
I/O load. If the I/O load is high, the DCM flush happens very often, and the
more volumes there are in the RVG, the higher the frequency. Every DCM flush
triggers a DCM flush on all the volumes in the RVG. If there are 50 volumes in
an RVG, each DCM flush creates 50 children controlled by one parent SIO. Once
all 50 children are done, the parent SIO releases itself for the next flush.
When the DCM flush of a child completes, the child detaches itself from the
parent by setting its parent field to NULL. The following sequence can occur:
the 49th child is done, but before it detaches from the parent, the 50th child
completes and releases the parent SIO for the next DCM flush. Before the 49th
child detaches, a new DCM flush is started on the same 50th child. After the
new flush has started, the 49th child of the previous flush detaches itself
from the parent and, because it is a static SIO, it indirectly resets the
parent field of the new flush. Also, in a few scenarios the lock is not
obtained before modifying the SIO state field.

RESOLUTION:
The child now detaches from the parent before the children count is reduced,
which ensures that the new flush does not race with the previous flush. The
field is protected with the required lock in all scenarios.
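
The ordering described in this resolution (detach first, then reduce the
children count, both under the lock) can be sketched in Python as follows. The
ParentSio and ChildSio classes and the run_flush helper are hypothetical names
used for illustration; the sketch models only the ordering, not the VVR SIO
code.

```python
import threading


class ParentSio:
    def __init__(self, child_count):
        self.lock = threading.Lock()
        self.remaining = child_count
        self.released = False


class ChildSio:
    def __init__(self, parent):
        self.parent = parent


def child_done(child):
    parent = child.parent
    with parent.lock:
        # Detach from the parent first ...
        child.parent = None
        # ... and only then reduce the children count, so the parent cannot
        # be released for a new flush while this child still points at it.
        parent.remaining -= 1
        if parent.remaining == 0:
            parent.released = True


def run_flush(child_count=50):
    parent = ParentSio(child_count)
    children = [ChildSio(parent) for _ in range(child_count)]
    workers = [threading.Thread(target=child_done, args=(c,)) for c in children]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return parent, children


parent, children = run_flush()
print(parent.released, all(c.parent is None for c in children))  # True True
```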

* 2511928 (Tracking ID: 2420386)

SYMPTOM:
Corrupted data is seen near the end of a sub-disk, on thin-reclaimable 
disks with either CDS EFI or sliced disk formats.

DESCRIPTION:
In environments with thin-reclaim disks running with either CDS-EFI 
disks or sliced disks, misaligned reclaims can be initiated. In some situations, 
when reclaiming a sub-disk, the reclaim does not take into account the correct 
public region start offset, which in rare instances can potentially result in 
reclaiming data before the sub-disk which is being reclaimed.

RESOLUTION:
The public offset is taken into account when initiating all reclaim
operations.
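
The role of the public region start offset can be shown with a small
calculation. The Python sketch below assumes that a sub-disk offset is
expressed relative to the start of the public region; the function name and
all sector values are hypothetical.

```python
def reclaim_extent(public_offset, subdisk_offset, subdisk_length):
    """Return (start, end) of a reclaim in absolute disk sectors.

    subdisk_offset is assumed to be relative to the start of the public
    region, so the public region start offset is added to get a disk address.
    """
    start = public_offset + subdisk_offset
    return start, start + subdisk_length


PUBLIC_OFFSET = 65792  # hypothetical public region start, in sectors

fixed = reclaim_extent(PUBLIC_OFFSET, 204800, 102400)
buggy = reclaim_extent(0, 204800, 102400)  # offset ignored, as in the defect
print("with public offset   :", fixed)
print("without public offset:", buggy)
```

In this sketch the extent computed without the offset starts 65792 sectors too
early, so it covers space that lies before the sub-disk being reclaimed.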

* 2515137 (Tracking ID: 2513101)

SYMPTOM:
When VxVM is upgraded from 4.1MP4RP2 to 5.1SP1RP1, the data on CDS disk gets
corrupted.

DESCRIPTION:
When CDS disks are initialized with VxVM version 4.1MP4RP2, the number of
cylinders is calculated based on the raw disk geometry. If the calculated
number of cylinders exceeds the Solaris VTOC limit (65535), an unsigned integer
overflow causes a truncated number of cylinders to be written in the CDS label.
After VxVM is upgraded to 5.1SP1RP1, the CDS label gets wrongly written in the
public region, leading to the data corruption.

RESOLUTION:
Code changes are made to adjust the number of tracks and heads so that the
calculated number of cylinders is within the Solaris VTOC limit.
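
The overflow and one possible adjustment can be worked through in Python. This
is a sketch under stated assumptions: the cylinder count is assumed to be
stored in a 16-bit unsigned field (consistent with the 65535 limit), doubling
the head count is only one illustrative way to adjust the geometry, and the
disk size and starting geometry are hypothetical.

```python
VTOC_MAX_CYLINDERS = 65535  # Solaris VTOC limit quoted in the description


def cylinders(total_sectors, heads, sectors_per_track):
    return total_sectors // (heads * sectors_per_track)


def as_u16(value):
    """What survives when the value is stored in a 16-bit unsigned field."""
    return value & 0xFFFF


def fit_geometry(total_sectors, heads, sectors_per_track):
    """Double the head count until the cylinder count fits the limit."""
    while cylinders(total_sectors, heads, sectors_per_track) > VTOC_MAX_CYLINDERS:
        heads *= 2
    return heads, sectors_per_track


# Hypothetical disk: 500,000,000 sectors, 16 heads, 63 sectors per track.
total, heads, spt = 500_000_000, 16, 63
raw = cylinders(total, heads, spt)
print("raw cylinders       :", raw)
print("truncated to 16 bits:", as_u16(raw))
new_heads, new_spt = fit_geometry(total, heads, spt)
print("adjusted geometry   :", new_heads, new_spt,
      cylinders(total, new_heads, new_spt))
```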

* 2525333 (Tracking ID: 2148851)

SYMPTOM:
"vxdisk resize" operation fails on a disk with VxVM cdsdisk/simple/sliced layout
on Solaris/Linux platform with the following message:

      VxVM vxdisk ERROR V-5-1-8643 Device emc_clariion0_30: resize failed: New
      geometry makes partition unaligned

DESCRIPTION:
The new cylinder size selected during "vxdisk resize" operation is unaligned with
the partitions that existed prior to the "vxdisk resize" operation.

RESOLUTION:
The algorithm to select the new geometry has been redesigned such that the new
cylinder size is always aligned with the existing as well as new partitions.
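
The alignment requirement can be illustrated with a divisor search. The Python
sketch below is not the redesigned VxVM algorithm, which this readme does not
describe in detail; it only shows one way to pick a cylinder size that divides
every partition boundary. The function name and the sector values are
hypothetical.

```python
from functools import reduce
from math import gcd


def aligned_cylinder_size(partition_boundaries, preferred_size):
    """Largest cylinder size <= preferred_size that divides every boundary."""
    common = reduce(gcd, partition_boundaries)
    for size in range(min(preferred_size, common), 0, -1):
        if common % size == 0:
            return size
    raise ValueError("boundaries must contain a positive sector offset")


# Hypothetical partition boundaries, in sectors (existing and new).
boundaries = [2048, 67584, 133120]
size = aligned_cylinder_size(boundaries, preferred_size=1008)
print(size, all(b % size == 0 for b in boundaries))  # 512 True
```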

* 2531983 (Tracking ID: 2483053)

SYMPTOM:
The VVR Primary system consumes a very large amount of kernel heap memory and
appears to be hung.

DESCRIPTION:
There is a race between the REGION LOCK deletion thread, which runs as part of
the SLAVE leave reconfiguration, and the thread that processes the DATA_DONE
message coming from the log client to the logowner. Because of this race, the
flags that store the status information about the I/Os were not correctly
updated. This caused a large number of SIOs to be stuck in a queue, consuming
a large amount of kernel heap.

RESOLUTION:
The code changes are made to take the proper locks while updating 
the SIOs' fields.

* 2531987 (Tracking ID: 2510523)

SYMPTOM:
In CVM-VVR configuration, I/Os on "master" and "slave" nodes hang when "master"
role is switched to the other node using "vxclustadm setmaster" command.

DESCRIPTION:
Under heavy I/O load, the I/Os are sometimes throttled in VVR if the number of
outstanding I/Os on the SRL reaches a certain limit (2048 I/Os).
When the "master" role is switched to the other node by using the
"vxclustadm setmaster" command, the throttled I/Os on the original master are
never restarted. This causes the I/O hang.

RESOLUTION:
Code changes are made in VVR to make sure the throttled I/Os are restarted
before "master" switching is started.

* 2531993 (Tracking ID: 2524936)

SYMPTOM:
The disk group is disabled after rescanning disks with the "vxdctl enable"
command, with the following console output:

 <timestamp> pp_claim_device:         0 
 <timestamp> Could not get metanode from ODM database  
 <timestamp> pp_claim_device:         0 
 <timestamp> Could not get metanode from ODM database  

The following error messages are also seen in the vxconfigd debug log output:

<timestamp>  VxVM vxconfigd ERROR V-5-1-12223 Error in claiming /dev/<disk>: The process file table is full.
<timestamp>  VxVM vxconfigd ERROR V-5-1-12223 Error in claiming /dev/<disk>: The process file table is full.
...
<timestamp>  VxVM vxconfigd ERROR V-5-1-12223 Error in claiming /dev/<disk>: The process file table is full.

DESCRIPTION:
When the total physical memory in an AIX machine is a multiple of 40 GB (such
as 40 GB, 80 GB, or 120 GB), a limitation/bug in the setulimit function causes
an overflowed value to be set as the new limit/size of the data area, which
results in memory allocation failures in vxconfigd. Creation of the shared
memory segment also fails as a result. Error handling for this case is missing
in the vxconfigd code, which results in errors in claiming disks and in
offlining of configuration copies, which in turn disables the disk group.

RESOLUTION:
Code changes are made to handle the failure case on shared memory segment
creation.

* 2552402 (Tracking ID: 2432006)

SYMPTOM:
The system intermittently hangs during boot if the disk is encapsulated.
When this problem occurs, the OS boot process stops after printing this message:
"VxVM sysboot INFO V-5-2-3409 starting in boot mode..."

DESCRIPTION:
The boot process hangs due to a deadlock between two threads: a VxVM
transaction thread and a thread attempting a read on the root volume issued by
dhcpagent. The read I/O is deferred until the transaction is finished, but the
read count incremented earlier is not properly adjusted.

RESOLUTION:
The pending read count is now decremented if the read I/O is deferred.
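
The counting fix can be sketched on a toy model in Python. The sketch assumes
that the transaction waits for the pending read count to drop to zero, which
is an inference from the description rather than a documented detail; the
RootVolume class and its members are hypothetical names.

```python
class RootVolume:
    """Toy model of read accounting during a transaction (hypothetical)."""

    def __init__(self):
        self.in_transaction = False
        self.pending_reads = 0
        self.deferred = []

    def submit_read(self, io):
        self.pending_reads += 1
        if self.in_transaction:
            # The read is parked until the transaction ends, so it must not
            # be counted as in flight: undo the earlier increment.
            self.deferred.append(io)
            self.pending_reads -= 1
            return "deferred"
        return "issued"

    def transaction_can_finish(self):
        # Assumption: the transaction waits for in-flight reads to drain.
        return self.pending_reads == 0


vol = RootVolume()
vol.in_transaction = True
print(vol.submit_read("dhcpagent read"), vol.transaction_can_finish())
# deferred True
```

Without the decrement, the deferred read would keep the count above zero while
it waits for the transaction, and the transaction would wait for the count.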

* 2553391 (Tracking ID: 2536667)

SYMPTOM:
The system panics with the following stack trace:

[04DAD004]voldiodone+000C78 (F10000041116FA08)
[04D9AC88]volsp_iodone_common+000208 (F10000041116FA08, 0000000000000000, 0000000000000000)
[04B7A194]volsp_iodone+00001C (F10000041116FA08)
[000F3FDC]internal_iodone_offl+0000B0 (??, ??)
[000F3F04]iodone_offl+000068 ()
[000F20CC]i_softmod+0001F0 ()
[0017C570].finish_interrupt+000024 ()

DESCRIPTION:
The panic occurs when a stale DG pointer is accessed because the DG was deleted
before the I/O returned. It can happen in a cluster configuration where
commands generating private region I/Os and "vxdg deport/delete" commands are
executing simultaneously on two nodes of the cluster.

RESOLUTION:
Code changes are made to drain private region I/Os before deleting the DG.

* 2562911 (Tracking ID: 2375011)

SYMPTOM:
User is not able to change the "dmp_native_support" tunable to "on" or "off"
in the presence of the root ZFS pool.

DESCRIPTION:
DMP does not allow the dmp_native_support tunable to be changed if any of the
ZFS pools is in use. Therefore, in the presence of the root ZFS pool, DMP
reports the following error when the user tries to change the
"dmp_native_support" tunable to "on" or "off":

# vxdmpadm settune dmp_native_support=off
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more zpools
VxVM vxdmpadm ERROR V-5-1-15686 The following zpool(s) could not be migrated as
they are in use -
     rpool

RESOLUTION:
DMP code has been changed to skip the root ZFS pool in its internal checks for
active ZFS pools prior to changing the value of dmp_native_support tunable.

* 2563291 (Tracking ID: 2527289)

SYMPTOM:
In a Campus Cluster setup, a storage fault may lead to DETACH of all the
configured sites. This also results in I/O failure on all the nodes in the
Campus Cluster.

DESCRIPTION:
Site detaches are done on site-consistent dgs when any volume in the dg loses
all the mirrors of a site. During the processing of the DETACH of the last
mirror in a site, we identify that it is the last mirror and DETACH the site,
which in turn detaches all the objects of that site.

In Campus Cluster setup we attach a dco volume for any data volume created on a
site-consistent dg. The general configuration is to have one DCO mirror on each
site. Loss of a single mirror of the dco volume on any node will result in the
detach of that site. 

In a 2 site configuration this particular scenario would result in both the dco
mirrors being lost simultaneously. While the site detach for the first mirror is
being processed we also signal for DETACH of the second mirror which ends up
DETACHING the second site too. 

This is not hit in other tests as we already have a check to make sure that we
do not DETACH the last mirror of a Volume. This check is being subverted in this
particular case due to the type of storage failure.

RESOLUTION:
Before triggering the site detach we need to have an explicit check to see if we
are trying to DETACH the last ACTIVE site.
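
The explicit check in this resolution can be sketched in Python for the
two-site scenario from the description. The function names and site names are
hypothetical, and the sketch models only the guard, not the VxVM site detach
processing.

```python
def may_detach_site(site, active_sites):
    """Allow a site detach only if another ACTIVE site would remain."""
    return site in active_sites and len(active_sites) > 1


def process_mirror_failures(failed_sites, active_sites):
    """Apply detach requests in order and return the sites still ACTIVE."""
    active = set(active_sites)
    for site in failed_sites:
        if may_detach_site(site, active):
            active.discard(site)
    return active


# Both DCO mirrors are lost at nearly the same time in a 2-site setup.
print(process_mirror_failures(["siteA", "siteB"], {"siteA", "siteB"}))
# {'siteB'}
```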

* 2574840 (Tracking ID: 2344186)

SYMPTOM:
In a master-slave configuration with FMR3/DCO volumes, a rebooted cluster node
fails to rejoin the cluster, with the following error messages on the console:

[..]
Jul XX 18:44:09 vienna vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-11092 cleanup_client: (Volume recovery in progress) 230
Jul XX 18:44:09 vienna vxvm:vxconfigd: [ID 702911 daemon.error] V-5-1-11467 kernel_fail_join() :                Reconfiguration interrupted: Reason is retry to add a node failed (13, 0)
[..]

DESCRIPTION:
VxVM volumes with FMR3/DCO have a built-in DRL mechanism to track the disk
blocks of in-flight I/Os in order to recover the data much more quickly in case
of a node crash. A joining node therefore waits for the variable responsible
for recovery to be unset before it joins the cluster. However, due to a bug in
the FMR3/DCO code, this variable remained set forever, leading to the node join
failure.

RESOLUTION:
Modified the FMR3/DCO code to appropriately set and unset this recovery 
variable.


INSTALLING THE PATCH
--------------------
o Before the upgrade:
  (a) Stop I/Os to all the VxVM volumes.
  (b) Unmount any filesystems with VxVM volumes.
  (c) Stop applications using any VxVM volumes.

For Solaris 9 and 10 releases, refer to the man pages for instructions on using
the 'patchadd' and 'patchrm' scripts provided with Solaris. Any other special or
non-generic installation instructions are described below as special
instructions. The following example installs a patch to a standalone machine:

        example# patchadd 146884-xx


REMOVING THE PATCH
------------------
The following example removes a patch from a standalone system:

        example# patchrm 146884-xx


SPECIAL INSTRUCTIONS
--------------------
You need to use the shutdown command to reboot the system after patch
installation or de-installation:

    shutdown -g0 -y -i6


A Solaris 10 issue prevents this patch from installing completely.
Before installing this VM patch, install the Solaris patch
119254-70 (or a later revision). This Solaris patch fixes packaging,
installation and patch utilities. [Sun Bug ID 6337009]

Download Solaris 10 patch 119254-70 (or later) from Sun at
http://sunsolve.sun.com


OTHERS
------
NONE