README VERSION : 1.1
README CREATION DATE : 2019-02-25
PATCH-ID : 7.3.1.2300
PATCH NAME : VRTSvxvm 7.3.1.2300
BASE PACKAGE NAME : VRTSvxvm
BASE PACKAGE VERSION : 7.3.1.000
SUPERSEDED PATCHES : 7.3.1.100
REQUIRED PATCHES : NONE
INCOMPATIBLE PATCHES : NONE
SUPPORTED PADV : sol11_sparc (P-PLATFORM, A-ARCHITECTURE, D-DISTRIBUTION, V-VERSION)
PATCH CATEGORY : CORE, CORRUPTION, HANG, MEMORYLEAK, PANIC, PERFORMANCE
PATCH CRITICALITY : CRITICAL
HAS KERNEL COMPONENT : YES
ID : NONE
REBOOT REQUIRED : YES
REQUIRE APPLICATION DOWNTIME : YES

PATCH INSTALLATION INSTRUCTIONS:
--------------------------------
The installation of this P-Patch will cause downtime.

Run the Installer script to automatically install the patch:
-----------------------------------------------------------
To install the patch, perform the following steps on at least one node in the cluster:
1. Copy the patch vm-sol11_sparc-Patch-7.3.1.2300.tar.gz to /tmp
2. Untar vm-sol11_sparc-Patch-7.3.1.2300.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/vm-sol11_sparc-Patch-7.3.1.2300.tar.gz
    # tar xf /tmp/vm-sol11_sparc-Patch-7.3.1.2300.tar
3. Install the hotfix. (Note that the installation of this P-Patch will cause downtime.)
    # pwd
    /tmp/hf
    # ./installVRTSvxvm731P2300 [ ...]

This patch can also be installed in conjunction with the InfoScale 7.3.1 maintenance release using the Install Bundles feature of the installer:
1. Download this patch and extract it to a directory of your choosing.
2. Change to the directory hosting the Veritas InfoScale 7.3.1 base product software and invoke the installer script with the -patch_path option, where -patch_path points to the patch directory you created in step 1.
    # ./installer -patch_path [ ...]

Install the patch manually:
---------------------------
o Before applying the patch:
  (a) Stop applications that access VxVM volumes.
  (b) Stop I/Os to all the VxVM volumes.
  (c) Unmount any file systems residing on VxVM volumes.
  (d) In case of multiple boot environments, boot using the BE (Boot Environment) on which you wish to install the patch.

For Solaris 11, refer to the man pages for specific instructions on using the 'pkg' command to install the patch provided. Any other special or non-generic installation instructions are described below as special instructions.

The following example installs the updated VRTSvxvm patch on a standalone machine:
    Example# pkg install --accept -g /patch_location/VRTSvxvm.p5p VRTSvxvm

After 'pkg install', follow the mandatory configuration steps mentioned in the special instructions. Please follow the special instructions mentioned below after installing the patch.

PATCH UNINSTALLATION INSTRUCTIONS:
----------------------------------
For Solaris 11.1 or later, if DMP native support is enabled, DMP controls the ZFS root pool. Turn off native support before removing the patch.

*** If DMP native support is enabled:
a. Disable DMP native support by running the following command:
    # vxdmpadm settune dmp_native_support=off
b. Reboot the system:
    # reboot

NOTE: If you do not disable native support prior to removing the VxVM patch, the system cannot be restarted after you remove DMP.

Please ensure you have access to the base 7.3.1 Veritas software prior to removing the updated VRTSvxvm package.

NOTE: Uninstalling the patch will remove the entire package.
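If you are unsure whether DMP native support is currently enabled, you can check it before starting the removal. A minimal check, assuming the standard vxdmpadm tunable interface (output format may vary by release):

    # vxdmpadm gettune dmp_native_support

If the current value is reported as "on", disable native support and reboot as described above before removing the patch.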
The following example removes the patch from a standalone system.

The VRTSvxvm package cannot be removed unless you also remove the VRTSaslapm package. Therefore, the pkg uninstall command will fail as follows:
    # pkg uninstall VRTSvxvm
    Creating Plan (Solver setup): -
    pkg uninstall: Unable to remove 'VRTSvxvm@7.3.1.2300' due to the following packages that depend on it:
        VRTSaslapm@7.3.1.0

You will also need to uninstall the VRTSaslapm package:
    # pkg uninstall VRTSvxvm VRTSaslapm

NOTE: You will need access to the base software of the VRTSvxvm package (original source media) to reinstall the uninstalled packages.

SPECIAL INSTRUCTIONS:
---------------------
NONE

SUMMARY OF FIXED ISSUES:
-----------------------------------------
PATCH ID:7.3.1.2300
3933888 (3868533) IO hang happens because of a deadlock situation.
3935967 (3935965) Updating the VxVM (Veritas Volume Manager) package on a SunOS alternate BE (Boot Environment) is not well supported.
3958860 (3953681) Data corruption issue is seen when more than one plex of a volume is detached.
3959451 (3913949) The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen.
3959452 (3931678) Memory allocation and locking optimizations during CVM (Cluster Volume Manager) IO shipping.
3959453 (3932241) VxVM (Veritas Volume Manager) creates some required files under the /tmp and /var/tmp directories. These directories can be modified by non-root users, which may affect Veritas Volume Manager functioning.
3959455 (3932496) In an FSS environment, volume creation might fail on SSD devices if vxconfigd was earlier restarted.
3959458 (3936535) Poor performance due to frequent cache drops.
3959460 (3942890) IO hang as DRL flush gets into an infinite loop.
3959461 (3946350) kmalloc-1024 and kmalloc-2048 memory consumption keeps increasing when the VVR IO size is more than 256K.
3959463 (3954787) Data corruption may occur in a GCO along with FSS environment on the RHEL 7.5 operating system.
3959471 (3932356) vxconfigd dumps core while importing a DG.
3959473 (3945115) The VxVM (Veritas Volume Manager) vxassist relayout command fails for volumes with a RAID layout.
3959475 (3950384) In a scenario where volume encryption at rest is enabled, data corruption may occur if the file system size exceeds 1TB.
3959476 (3950675) vxdg import appears to hang forever.
3959477 (3953845) IO hang can be experienced in a memory pressure situation because of "vxencryptd".
3959478 (3956027) System panicked while removing disks from a disk group because of a race condition between the IO stats and disk removal code.
3959479 (3956727) In Solaris DDL discovery, when a SCSI ioctl fails, direct disk IO on the device can lead to high memory consumption and a vxconfigd hang.
3959480 (3957227) Disk group import succeeded, but with an error message. This may cause confusion.
3961353 (3950199) System may panic during DMP (Dynamic Multipathing) path restoration.
3961356 (3953481) A stale entry of the old disk is left under /dev/[r]dsk even after replacing it.
3961358 (3955101) Panic observed in a GCO environment (cluster to cluster replication) during replication.
3961359 (3955725) Utility to clear the "failio" flag on a disk after storage connectivity is back.
3961468 (3926067) vxassist relayout/vxassist convert commands may fail in a Campus Cluster environment.
3961469 (3948140) System panic can occur if the size of RTPG (Report Target Port Groups) data returned by the underlying array is greater than 255.
3961480 (3957549) Server panicked when tracing an event because of a missing NULL pointer check.
3964315 (3952042) vxdmp iostat memory allocation might cause memory fragmentation and pagecache drops.
3964360 (3964359) The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen.
3967893 (3966872) Deporting and renaming a clone DG changes the name of the clone DG along with the source DG.
3967895 (3877431) System panic after file system expansion.
3967898 (3930914) Master node panic occurs while sending a response message to the slave node.
3968854 (3964779) Changes to support Solaris 11.4 with Volume Manager.
3969591 (3964337) Partition size gets set to the default after running vxdisk scandisks.
3969997 (3964359) The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen.
3970119 (3943952) Rolling upgrade to InfoScale 7.4 and above is broken.
3970370 (3970368) On Solaris 11.4, DMP+DR error messages are observed while running the dmpdr -o refresh utility.

PATCH ID:7.3.1.100
3932464 (3926976) Frequent loss of VxVM functionality because vxconfigd is unable to validate the license.
3933874 (3852146) A shared DiskGroup (DG) fails to import when the "-c" and "-o noreonline" options are specified together.
3933875 (3872585) System panics with a storage key exception.
3933877 (3914789) System may panic when reclaiming on the secondary in a VVR environment.
3933878 (3918408) Data corruption when volume grow is attempted on thin reclaimable disks whose space was just freed.
3933880 (3864063) Application I/O hangs because of a race between the Master Pause SIO (Staging I/O) and the Error Handler SIO.
3933882 (3865721) vxconfigd may hang while pausing replication in a CVR (Clustered Veritas Volume Replicator) environment.
3933883 (3867236) Application IO hang happens because of a race between the Master Pause SIO (Staging IO) and the RVWRITE1 SIO.
3933890 (3879324) The VxVM DR tool fails to handle a busy device problem while LUNs are removed from the OS.
3933893 (3890602) The OS cfgadm command hangs after reboot when hundreds of devices are under DMP's (Dynamic Multi-Pathing) control.
3933894 (3893150) The VxDMP vxdmpadm native ls command sometimes doesn't report imported disks' pool names.
3933897 (3907618) vxdisk resize leads to data corruption on the file system.
3933899 (3910675) Disks directly attached to the system cannot be exported in an FSS environment.
3933900 (3915523) A local disk from another node belonging to a private DG (disk group) is exported to the node when a private DG is imported on the current node.
3933901 (3915953) Enabling dmp_native_support takes a long time to complete.
3933903 (3918356) zpools are imported automatically when DMP (Dynamic Multipathing) native support is set to on, which may lead to zpool corruption.
3933904 (3921668) The vxrecover command with the -m option fails when executed on slave nodes.
3933907 (3873123) If a disk with a CDS EFI label is used as a remote disk on a cluster node, restarting the vxconfigd daemon on that particular node causes vxconfigd to go into the disabled state.
3933910 (3910228) Registration of GAB (Global Atomic Broadcast) port u fails on slave nodes after multiple new devices are added to the system.
3933913 (3905030) System hangs when installing/uninstalling VxVM (Veritas Volume Manager).
3936428 (3932714) OS panicked while performing IO on a dmpnode.
3937541 (3911930) Provide a way to clear PGR_FLAG_NOTSUPPORTED on a device instead of using the exclude/include commands.
3937545 (3932246) The vxrelayout operation fails to complete.
3937549 (3934910) DRL maps leak during snapshot creation/removal cycles with DG reimport.
3937550 (3935232) Replication and IO hang during master takeover because of a race between log owner change and master switch.
3937808 (3931936) VxVM (Veritas Volume Manager) commands hang on the master node after restarting a slave node.
3937811 (3935974) When a client process shuts down abruptly or resets the connection during communication with the vxrsyncd daemon, it may terminate the vxrsyncd daemon.
3938392 (3909630) OS panic happens while registering DMP (Dynamic Multi-Pathing) statistics information.
3944743 (3945411) The system was not able to boot after enabling DMP native support for ZFS boot devices.

SUMMARY OF KNOWN ISSUES:
-----------------------------------------
NONE

KNOWN ISSUES:
--------------
NONE

FIXED INCIDENTS:
----------------
PATCH ID:7.3.1.2300

* INCIDENT NO:3933888 TRACKING ID:3868533

SYMPTOM:
IO hang happens when starting replication. The VxIO daemon hangs with a stack like the following:
 vx_cfs_getemap at ffffffffa035e159 [vxfs]
 vx_get_freeexts_ioctl at ffffffffa0361972 [vxfs]
 vxportalunlockedkioctl at ffffffffa06ed5ab [vxportal]
 vxportalkioctl at ffffffffa06ed66d [vxportal]
 vol_ru_start at ffffffffa0b72366 [vxio]
 voliod_iohandle at ffffffffa09f0d8d [vxio]
 voliod_loop at ffffffffa09f0fe9 [vxio]

DESCRIPTION:
While performing DCM replay with the Smart Move feature enabled, the VxIO kernel needs to issue an IOCTL to the VxFS kernel to get the file system free region. To complete this IOCTL, the VxFS kernel needs to clone the map by issuing IO to the VxIO kernel. If an RLINK disconnection happens at that time, the RV is serialized to complete the disconnection. As the RV is serialized, all IOs, including the clone map IO from VxFS, are queued to rv_restartq, hence the deadlock.

RESOLUTION:
Code changes have been made to handle the deadlock situation.

* INCIDENT NO:3935967 TRACKING ID:3935965

SYMPTOM:
After updating the VxVM package on an alternate BE, some VxVM binaries are not updated.

DESCRIPTION:
VxVM does not support updating the package on an alternate BE very well. In the post-install script, SunOS version-specific binaries should be copied relative to the installation root directory defined by the PKG_INSTALL_ROOT environment variable, but this variable was defined as the fixed value "/", hence these binaries were not copied.

RESOLUTION:
Code changes have been made to the post-install script to handle installation on an alternate BE.

* INCIDENT NO:3958860 TRACKING ID:3953681

SYMPTOM:
Data corruption issue is seen when more than one plex of a volume is detached.

DESCRIPTION:
When a plex of a volume gets detached, the DETACH map gets enabled in the DCO (Data Change Object). Incoming IOs are tracked in the DRL (Dirty Region Log) and then asynchronously copied to the DETACH map for tracking. If one more plex gets detached, it might happen that some of the new incoming regions are missed in the DETACH map of the previously detached plex. This leads to corruption when the disk comes back and the plex resync happens using the corrupted DETACH map.

RESOLUTION:
Code changes are done to correctly track the IOs in the DETACH map of the previously detached plex and avoid corruption.

* INCIDENT NO:3959451 TRACKING ID:3913949

SYMPTOM:
The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen, with messages like the following in syslog:
V-5-1-9576 Split Brain. da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF000043042DD4A5
V-5-1-9576 Split Brain.
da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF00004003FE9356

DESCRIPTION:
When a disk is detached, the SSB ID of both the remaining DA and DM records should be incremented. In this case, only the SSB ID of the DA record is incremented; the SSB ID of the DM record is NOT updated. One probable reason is that the disks get detached before the DM records are updated.

RESOLUTION:
A workaround option is provided to bypass the SSB checks while importing the DG: the user can import the DG with the 'vxdg -o overridessb import <diskgroup>' command if a false split brain happens. Before using '-o overridessb', confirm that all DA records of the DG are available in the ENABLED state and differ from the DM records by 1 in their SSB IDs. An example invocation is sketched after this group of incidents below.

* INCIDENT NO:3959452 TRACKING ID:3931678

SYMPTOM:
There was a performance issue while shipping IO to remote disks due to non-cached memory allocation and redundant locking.

DESCRIPTION:
There was redundant locking while checking whether flow control is enabled by GAB during IO shipping. In addition, memory allocation during IO shipping was not optimal.

RESOLUTION:
Changes are done in the VxVM code to optimize the memory allocation and reduce the redundant locking to improve performance.

* INCIDENT NO:3959453 TRACKING ID:3932241

SYMPTOM:
VxVM (Veritas Volume Manager) creates some required files under the /tmp and /var/tmp directories.

DESCRIPTION:
VxVM (Veritas Volume Manager) creates some required files under the /tmp and /var/tmp directories. Non-root users have access to these folders and may accidentally modify, move, or delete those files. Such actions may interfere with the normal functioning of Veritas Volume Manager.

RESOLUTION:
This hotfix addresses the issue by moving the required Veritas Volume Manager files to a secure location.

* INCIDENT NO:3959455 TRACKING ID:3932496

SYMPTOM:
In an FSS environment, volume creation might fail on SSD devices if vxconfigd was earlier restarted on the master node.

DESCRIPTION:
In an FSS environment, if a shared disk group is created using SSD devices and vxconfigd is restarted, volume creation might fail. The problem occurs because the mediatype attribute of the disk was NOT propagated from the kernel to vxconfigd while restarting the vxconfigd daemon.

RESOLUTION:
Changes are done in the VxVM code to correctly propagate the mediatype for remote devices during vxconfigd startup.

* INCIDENT NO:3959458 TRACKING ID:3936535

SYMPTOM:
IO performance is poor due to frequent cache drops on a system with snapshots configured.

DESCRIPTION:
On a system with VxVM snapshots configured, DCO map updates happen alongside the ongoing IO and can allocate many chunks of page memory, which triggers kswapd to swap the cache memory out, so cache drops are seen.

RESOLUTION:
Code changes are done to allocate larger memory blocks for DCO map updates without triggering memory swap-out.

* INCIDENT NO:3959460 TRACKING ID:3942890

SYMPTOM:
When a Data Change Object (DCO) is configured, an IO hang may happen under heavy IO load combined with a slow Storage Replicator Log (SRL) flush.

DESCRIPTION:
Application IO needs to wait for the DRL flush to complete before proceeding. Due to a defect in the DRL code, the DRL flush could not proceed when a large amount of IO exceeded the available DRL chunks, hence the IO hang.

RESOLUTION:
Code changes have been made to fix the issue.
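For the workaround noted in incident 3959451 above, a minimal sketch of the import sequence; the disk group name "mydg" is illustrative:

    # vxdisk -o alldgs list              (confirm that all DA records of the DG are in the ENABLED state)
    # vxdg -o overridessb import mydg    (bypass the SSB check for the false split brain)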
* INCIDENT NO:3959461 TRACKING ID:3946350

SYMPTOM:
kmalloc-1024 and kmalloc-2048 memory consumption keeps increasing when the Veritas Volume Replicator (VVR) IO size is more than 256K.

DESCRIPTION:
In VVR, if the I/O size is more than 256K, the IO is broken into child IOs. Due to a code defect, the allocated space does not get freed when the split IOs are completed.

RESOLUTION:
The code is modified to free the VxVM-allocated memory after the split IOs are completed.

* INCIDENT NO:3959463 TRACKING ID:3954787

SYMPTOM:
In a RHEL 7.5 FSS environment with GCO configured, having NVMe devices and an InfiniBand network, data corruption might occur when sending IO from the master to the slave node.

DESCRIPTION:
In the RHEL 7.5 release, Linux stopped allowing IO on an underlying NVMe device when there are gaps between the BIO vectors. In the case of VVR, the SRL header of 3 blocks is added to the BIO. When the BIO is sent through LLT to the other node, the LLT limitation of 32 fragments can lead to misalignment of the BIO vectors. When this unaligned BIO is sent to the underlying NVMe device, the last 3 blocks of the BIO are skipped and not written to the disk on the slave node. This leads to incomplete data being written on the slave node, which leads to data corruption.

RESOLUTION:
Code changes have been done to handle this case and send the BIO aligned to the underlying NVMe device.

* INCIDENT NO:3959471 TRACKING ID:3932356

SYMPTOM:
In a two-node cluster, vxconfigd dumps core while importing the DG:
 dapriv_da_alloc ()
 in setup_remote_disks ()
 in volasym_remote_getrecs ()
 req_dg_import ()
 vold_process_request ()
 start_thread () from /lib64/libpthread.so.0
 from /lib64/libc.so.6

DESCRIPTION:
vxconfigd dumps core due to an address alignment issue.

RESOLUTION:
The alignment issue is fixed.

* INCIDENT NO:3959473 TRACKING ID:3945115

SYMPTOM:
The VxVM vxassist relayout command fails for volumes with a RAID layout with the following message:
VxVM vxassist ERROR V-5-1-2344 Cannot update volume
VxVM vxassist ERROR V-5-1-4037 Relayout operation aborted. (7)

DESCRIPTION:
During a relayout operation, the target volume inherits the attributes from the original volume. One of those attributes is the read policy. If the layout of the original volume is RAID, the RAID read policy is set. The RAID read policy expects the target volume to have the appropriate log required for that policy. Since the target volume has a different layout, the log is not present and the relayout operation fails.

RESOLUTION:
Code changes have been made to set the read policy of the target volume to SELECT, rather than inheriting it from the original volume, when the original volume has a RAID layout.

* INCIDENT NO:3959475 TRACKING ID:3950384

SYMPTOM:
In a scenario where volume data encryption at rest is enabled, data corruption may occur if the file system size exceeds 1TB and the data is located in a file extent which has an extent size bigger than 256KB.

DESCRIPTION:
In a scenario where data encryption at rest is enabled, data corruption may occur when both of the following conditions are satisfied:
- The file system size is over 1TB.
- The data is located in a file extent which has an extent size bigger than 256KB.
This issue occurs due to a bug which causes an integer overflow for the offset.

RESOLUTION:
As part of this fix, appropriate code changes have been made to improve the data encryption behavior so that the data corruption does not occur.

* INCIDENT NO:3959476 TRACKING ID:3950675

SYMPTOM:
The following command is not progressing and appears stuck:
# vxdg import <diskgroup>

DESCRIPTION:
The DG import command is sometimes found to be non-progressing on a backup system. Analysis of the situation has shown that devices belonging to the DG report "devid mismatch" more than once, because graceful DR steps were not followed. Erroneous processing of this situation resulted in IOs not being allowed on such devices, leading to the DG import hang.

RESOLUTION:
The code processing the "devid mismatch" is rectified.

* INCIDENT NO:3959477 TRACKING ID:3953845

SYMPTOM:
An IO hang can be experienced in a memory pressure situation because of "vxencryptd".

DESCRIPTION:
For large IOs, VxVM (Veritas Volume Manager) tries to acquire contiguous pages in memory for some of its internal data structures. Under heavy memory pressure, contiguous pages may not be available. In such a case it waits until the required pages are available for allocation and does not process the request further. This causes an IO-hang-like situation where IO cannot progress or progresses very slowly.

RESOLUTION:
Code changes are done to avoid the IO hang situation.

* INCIDENT NO:3959478 TRACKING ID:3956027

SYMPTOM:
The system panicked while removing disks from a disk group, with a stack like the following:
 [0000F4C4]___memmove64+0000C4 ()
 [077ED5FC]vol_get_one_io_stat+00029C ()
 [077ED8FC]vol_get_io_stats+00009C ()
 [077F1658]volinfo_ioctl+0002B8 ()
 [07809954]volsioctl_real+0004B4 ()
 [079014CC]volsioctl+00004C ()
 [07900C40]vols_ioctl+000120 ()
 [00605730]rdevioctl+0000B0 ()
 [008012F4]spec_ioctl+000074 ()
 [0068FE7C]vnop_ioctl+00005C ()
 [0069A5EC]vno_ioctl+00016C ()
 [006E2090]common_ioctl+0000F0 ()
 [00003938]mfspurr_sc_flih01+000174 ()

DESCRIPTION:
The IO stats function was trying to access a freed disk that was being removed from the disk group, resulting in a panic due to illegal memory access.

RESOLUTION:
Code changes have been made to resolve this race condition.

* INCIDENT NO:3959479 TRACKING ID:3956727

SYMPTOM:
In Solaris DDL discovery, when a SCSI ioctl fails, direct disk IO on the device can lead to high memory consumption and a vxconfigd hang.

DESCRIPTION:
In Solaris DDL discovery, when SCSI ioctls on the disk for private region IO fail, a direct disk read/write to the disk is attempted. Due to a compiler issue, this direct read/write gets invalid arguments, which leads to high memory consumption and a vxconfigd hang.

RESOLUTION:
Changes are done in the VxVM code to ensure correct arguments are passed to the disk read/write.

* INCIDENT NO:3959480 TRACKING ID:3957227

SYMPTOM:
Disk group import succeeded, but with the following error message:
vxvm:vxconfigd: [ID ** daemon.error] V-5-1-0 dg_import_name_to_dgid: Found dgid = **

DESCRIPTION:
During disk group import, two configuration copies may be found. Volume Manager uses the latest configuration copy and prints a message to indicate this scenario. Due to a wrong log level, this message got printed in the error category.

RESOLUTION:
Code changes have been made to suppress this harmless message.
* INCIDENT NO:3961353 TRACKING ID:3950199

SYMPTOM:
The system may panic with the following stack during DMP (Dynamic Multipathing) path restoration:
 #0 [ffff880c65ea73e0] machine_kexec at ffffffff8103fd6b
 #1 [ffff880c65ea7440] crash_kexec at ffffffff810d1f02
 #2 [ffff880c65ea7510] oops_end at ffffffff8154f070
 #3 [ffff880c65ea7540] no_context at ffffffff8105186b
 #4 [ffff880c65ea7590] __bad_area_nosemaphore at ffffffff81051af5
 #5 [ffff880c65ea75e0] bad_area at ffffffff81051c1e
 #6 [ffff880c65ea7610] __do_page_fault at ffffffff81052443
 #7 [ffff880c65ea7730] do_page_fault at ffffffff81550ffe
 #8 [ffff880c65ea7760] page_fault at ffffffff8154e2f5
    [exception RIP: _spin_lock_irqsave+31]
    RIP: ffffffff8154dccf  RSP: ffff880c65ea7818  RFLAGS: 00210046
    RAX: 0000000000010000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000200246  RSI: 0000000000000040  RDI: 00000000000000e8
    RBP: ffff880c65ea7818  R8: 0000000000000000   R9: ffff8824214ddd00
    R10: 0000000000000002  R11: 0000000000000000  R12: ffff88302d2ce400
    R13: 0000000000000000  R14: ffff880c65ea79b0  R15: ffff880c65ea79b7
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff880c65ea7820] dmp_open_path at ffffffffa07be2c5 [vxdmp]
 #10 [ffff880c65ea7980] dmp_restore_node at ffffffffa07f315e [vxdmp]
 #11 [ffff880c65ea7b00] dmp_revive_paths at ffffffffa07ccee3 [vxdmp]
 #12 [ffff880c65ea7b40] gendmpopen at ffffffffa07cbc85 [vxdmp]
 #13 [ffff880c65ea7c10] dmpopen at ffffffffa07cc51d [vxdmp]
 #14 [ffff880c65ea7c20] dmp_open at ffffffffa07f057b [vxdmp]
 #15 [ffff880c65ea7c50] __blkdev_get at ffffffff811d7f7e
 #16 [ffff880c65ea7cb0] blkdev_get at ffffffff811d82a0
 #17 [ffff880c65ea7cc0] blkdev_open at ffffffff811d8321
 #18 [ffff880c65ea7cf0] __dentry_open at ffffffff81196f22
 #19 [ffff880c65ea7d50] nameidata_to_filp at ffffffff81197294
 #20 [ffff880c65ea7d70] do_filp_open at ffffffff811ad180
 #21 [ffff880c65ea7ee0] do_sys_open at ffffffff81196cc7
 #22 [ffff880c65ea7f30] compat_sys_open at ffffffff811eee9a
 #23 [ffff880c65ea7f40] symev_compat_open at ffffffffa0c9b08f

DESCRIPTION:
The system panic can be encountered due to a race condition. A path picked by the DMP restore daemon for processing may be deleted before the restoration process is complete. When the restore daemon then tries to access the path properties, the system panics because the path properties have already been freed.

RESOLUTION:
Code changes are done to handle the race condition.

* INCIDENT NO:3961356 TRACKING ID:3953481

SYMPTOM:
A stale entry of a replaced disk was left behind under /dev/[r]dsk to represent the replaced disk.

DESCRIPTION:
Whenever a disk is removed from the DMP view, the driver property information of the disk has to be removed from the kernel; if not, a stale entry is left under /dev/[r]dsk. When a new disk later replaces it with the same minor number, the stale information is left instead of the property being refreshed.

RESOLUTION:
Code is modified to remove the stale device property when a disk is removed.

* INCIDENT NO:3961358 TRACKING ID:3955101

SYMPTOM:
The server might panic in a GCO environment with the following stack:
 nmcom_server_main_tcp()
 ttwu_do_wakeup()
 ttwu_do_activate.constprop.90()
 try_to_wake_up()
 update_curr()
 update_curr()
 account_entity_dequeue()
 __schedule()
 nmcom_server_proc_tcp()
 kthread()
 kthread_create_on_node()
 ret_from_fork()
 kthread_create_on_node()

DESCRIPTION:
Recent changes in the code handle dynamic port changes, i.e. deletion and addition of ports can now happen dynamically.
It might happen that while a port is being accessed, it is deleted in the background by another thread. This leads to a panic, since the port being accessed has already been deleted.

RESOLUTION:
Code changes have been done to take care of this situation and to check whether the port is available before accessing it.

* INCIDENT NO:3961359 TRACKING ID:3955725

SYMPTOM:
A utility is needed to clear the "failio" flag on a disk after storage connectivity is back.

DESCRIPTION:
If I/Os to the disks time out due to hardware failures such as a weak Storage Area Network (SAN) cable link or a Host Bus Adapter (HBA) failure, VxVM assumes that the disk is bad or slow and sets the "failio" flag on the disk. Because of this flag, all subsequent I/Os fail with a "No such device" error. After connectivity is back, the "failio" flag needs to be cleared using "vxdisk failio=off". A new utility, "vxcheckfailio", clears the "failio" flag for all disks whose paths are all enabled. An example is sketched after this group of incidents below.

RESOLUTION:
Code changes are done to add the "vxcheckfailio" utility, which clears the "failio" flag on the disks.

* INCIDENT NO:3961468 TRACKING ID:3926067

SYMPTOM:
In a Campus Cluster environment, the vxassist relayout command may fail with the following errors:
VxVM vxassist ERROR V-5-1-13124 Site offline or detached
VxVM vxassist ERROR V-5-1-4037 Relayout operation aborted. (20)
The vxassist convert command might also fail with the following error:
VxVM vxassist ERROR V-5-1-10128 No complete plex on the site.

DESCRIPTION:
For vxassist "relayout" and "convert" operations in a Campus Cluster environment, VxVM (Veritas Volume Manager) needs to sort the plexes of the volume according to sites. When the number of plexes of a volume is greater than 100, the sorting of plexes fails due to a bug in the code. Because of this sorting failure, the vxassist relayout/convert operations fail.

RESOLUTION:
Code changes are done to properly sort the plexes according to site.

* INCIDENT NO:3961469 TRACKING ID:3948140

SYMPTOM:
The system may panic if the RTPG data returned by the array is greater than 255 bytes, with the following stack:
 dmp_alua_get_owner_state()
 dmp_alua_get_path_state()
 dmp_get_path_state()
 dmp_check_path_state()
 dmp_restore_callback()
 dmp_process_scsireq()
 dmp_daemons_loop()

DESCRIPTION:
The buffer given to the RTPG SCSI command is currently 255 bytes. However, the size of the data returned by the underlying array for RTPG can be greater than 255 bytes. As a result, incomplete data is retrieved (only the first 255 bytes), and when the RTPG data is read, invalid memory access occurs, resulting in an error while claiming the devices. This invalid memory access may lead to a system panic.

RESOLUTION:
The RTPG buffer size has been increased to 1024 bytes to handle this.

* INCIDENT NO:3961480 TRACKING ID:3957549

SYMPTOM:
The server panicked while resyncing a mirror volume, with the following stack:
 voliot_object_event+0x2e0
 vol_oes_sio_start+0x80
 voliod_iohandle+0x30
 voliod_loop+0x248
 thread_start+4

DESCRIPTION:
If an IO error happens during a mirror resync, a trace event needs to be logged for the IO error. As the IO is from a mirror resync, the KIO is expected to be NULL, but the NULL pointer check for the KIO is missing while logging the trace event, hence the panic.

RESOLUTION:
Code changes have been made to fix the issue.
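For incident 3961359 above, once storage connectivity is restored the flag can be cleared per disk or for all disks at once. A minimal sketch; the disk access name "emc0_1234" is illustrative and the exact per-disk syntax should be confirmed against the vxdisk(1M) man page for your release:

    # vxdisk set emc0_1234 failio=off    (clear the flag on a single disk; syntax assumed)
    # vxcheckfailio                      (clear the flag on all disks whose paths are all enabled)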
* INCIDENT NO:3964315 TRACKING ID:3952042

SYMPTOM:
dmpevents.log is flooded with messages like the following:
Tue Jul 11 09:28:36.620: Lost 12 DMP I/O statistics records
Tue Jul 11 10:05:44.257: Lost 13 DMP I/O statistics records
Tue Jul 11 10:10:05.088: Lost 6 DMP I/O statistics records
Tue Jul 11 11:28:24.714: Lost 6 DMP I/O statistics records
Tue Jul 11 11:46:35.568: Lost 1 DMP I/O statistics records
Tue Jul 11 12:04:10.267: Lost 13 DMP I/O statistics records
Tue Jul 11 12:04:16.298: Lost 5 DMP I/O statistics records
Tue Jul 11 12:44:05.656: Lost 31 DMP I/O statistics records
Tue Jul 11 12:44:38.855: Lost 2 DMP I/O statistics records

DESCRIPTION:
When DMP (Dynamic Multi-Pathing) expands the iostat table, it allocates a new, larger table, replaces the old table with the new one, and frees the old one. This increases the possibility of memory fragmentation.

RESOLUTION:
The code is modified to increase the initial size of the iostat table.

* INCIDENT NO:3964360 TRACKING ID:3964359

SYMPTOM:
The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen, with messages like the following in syslog:
V-5-1-9576 Split Brain. da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF000043042DD4A5
V-5-1-9576 Split Brain. da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF00004003FE9356

DESCRIPTION:
When a disk is detached, the SSB ID of both the remaining DA and DM records should be incremented. In this case, only the SSB ID of the DA record is incremented; the SSB ID of the DM record is NOT updated. One probable reason is that the disks get detached before the DM records are updated.

RESOLUTION:
Code changes are done in the DG import process to identify a false split brain condition and correct the disk SSB IDs during the import, so that the import does not fail due to a false split brain condition. Additionally, the -o overridessb option is improved to correct the disk SSB IDs during import. With this fix the disk group import should not fail due to false split brain conditions; if it still does, the user can try the -o overridessb option. Before using '-o overridessb', confirm that all DA records of the DG are available in the ENABLED state and differ from the DM records by 1 in their SSB IDs.

* INCIDENT NO:3967893 TRACKING ID:3966872

SYMPTOM:
Deporting and renaming a cloned DG renames both the source and the cloned DG.

DESCRIPTION:
On a DR site, both the source and the clone DG can co-exist, with the source DG deported while the cloned DG is imported. When the user attempts to deport-rename the cloned DG, the names of both the source and the cloned DG are changed. An example of the intended deport-rename operation is sketched after this group of incidents below.

RESOLUTION:
The deport code is fixed to take care of this situation.

* INCIDENT NO:3967895 TRACKING ID:3877431

SYMPTOM:
System panic after file system expansion, with the following stack:
 #volkio_to_kio_copy at [vxio]
 #volsio_nmstabilize at [vxio]
 #vol_rvsio_preprocess at [vxio]
 #vol_rv_write1_start at [vxio]
 #voliod_iohandle at [vxio]
 #voliod_loop at [vxio]

DESCRIPTION:
Veritas Volume Replicator (VVR) generates different IOs in different stages. In some cases, a parent IO does not wait until its child IO is freed. Due to a bug, a child IO accessed a freed parent IO's memory, which caused the system panic.

RESOLUTION:
Code changes have been made to avoid modifying freed memory.
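The deport-and-rename operation referenced in incident 3967893 is normally performed with the -n option of vxdg. A minimal sketch; "clonedg" and "clonedg_new" are illustrative names:

    # vxdg -n clonedg_new deport clonedg    (deport the imported clone DG and rename it in one step)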
* INCIDENT NO:3967898 TRACKING ID:3930914

SYMPTOM:
The master node panicked with the following stack:
 [exception RIP: vol_kmsg_respond_common+111]
 #9 [ffff880898513d18] vol_kmsg_respond at ffffffffa08fd8af [vxio]
 #10 [ffff880898513d30] vol_rv_wrship_srv_done at ffffffffa0b9a955 [vxio]
 #11 [ffff880898513d98] volkcontext_process at ffffffffa09e7e5c [vxio]
 #12 [ffff880898513de0] vol_rv_write2_start at ffffffffa0ba3489 [vxio]
 #13 [ffff880898513e50] voliod_iohandle at ffffffffa09e743a [vxio]
 #14 [ffff880898513e88] voliod_loop at ffffffffa09e7640 [vxio]
 #15 [ffff880898513ec8] kthread at ffffffff810a5b8f
 #16 [ffff880898513f50] ret_from_fork at ffffffff81646a98

DESCRIPTION:
When sending a response for a write-shipping request to the slave node, a stale pointer to the message handler may be used, in which the message block is NULL. Dereferencing the NULL message block causes the panic.

RESOLUTION:
Code changes have been made to fix the issue.

* INCIDENT NO:3968854 TRACKING ID:3964779

SYMPTOM:
Loading the VxVM modules vxio and vxspec fails on Solaris 11.4.

DESCRIPTION:
The function page_numtopp_nolock has been replaced and renamed as pp_for_pfn_canfail. The _depends_on attribute has been deprecated and cannot be used; VxVM was using this attribute to specify the dependency between the modules.

RESOLUTION:
The changes are mainly around the way unmapped buffers are handled in the vxio driver. The Solaris API that was being used is no longer valid and is a private API. The hat_getpfnum() -> ppmapin/ppmapout calls were replaced with bp_copyin/bp_copyout in the I/O code path. In IO shipping, they were replaced with the miter approach and hat_kpm_paddr_mapin()/hat_kpm_paddr_mapout().

* INCIDENT NO:3969591 TRACKING ID:3964337

SYMPTOM:
After running vxdisk scandisks, the partition size gets set to the default value of 512.

DESCRIPTION:
During device discovery, VxVM (Veritas Volume Manager) compares the original partition size with the newly reported partition size. In the code that reads the partition size from kernel memory, the buffer used in userland memory is not initialized and contains a garbage value. Because of this, a difference between the old and new partition sizes is detected, which leads to the partition size being set to a default value.

RESOLUTION:
Code changes have been done to properly initialize the userland buffer that is used to read data from the kernel.

* INCIDENT NO:3969997 TRACKING ID:3964359

SYMPTOM:
The DG import fails with Split Brain after the system is rebooted or when a storage disturbance is seen, with messages like the following in syslog:
V-5-1-9576 Split Brain. da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF000043042DD4A5
V-5-1-9576 Split Brain. da id is 0.1, while dm id is 0.0 for dm B000F8BF40FF00004003FE9356

DESCRIPTION:
When a disk is detached, the SSB ID of both the remaining DA and DM records should be incremented. In this case, only the SSB ID of the DA record is incremented; the SSB ID of the DM record is NOT updated. One probable reason is that the disks get detached before the DM records are updated.

RESOLUTION:
Code changes are done in the DG import process to identify a false split brain condition and correct the disk SSB IDs during the import, so that the import does not fail due to a false split brain condition. Additionally, the -o overridessb option is improved to correct the disk SSB IDs during import.
Ideally, with this fix the disk group import should not fail due to false split brain conditions. If the disk group import still fails with a false split brain condition, the user can try the -o overridessb option. Before using '-o overridessb', confirm that all DA records of the DG are available in the ENABLED state and differ from the DM records by 1 in their SSB IDs.

* INCIDENT NO:3970119 TRACKING ID:3943952

SYMPTOM:
A rolling upgrade from InfoScale 7.3.1.100 and above to InfoScale 7.4 and above in a Flexible Storage Sharing (FSS) environment may lead to a system panic.

DESCRIPTION:
As part of the code changes for the option (islocal=yes/no) that was added to the command "vxddladm addjbod" in IS 7.3.1.100, the size of the UDID of the DMP nodes was increased. In Flexible Storage Sharing, when performing a rolling upgrade from the 7.3.1.100 and above patches to any InfoScale 7.4 and above release, a mismatch of this UDID between the nodes may cause the systems to panic when IO is shipped from one node to the other.

RESOLUTION:
Code changes have been made to handle the UDID mismatch; rolling upgrade to IS 7.4 and above is now fixed.

* INCIDENT NO:3970370 TRACKING ID:3970368

SYMPTOM:
While performing the DMP + DR test case, error messages are observed while running the dmpdr -o refresh utility:
/usr/lib/vxvm/voladm.d/bin/dmpdr -o refresh
WARN: Please Do not Run any Device Discovery Operations outside the Tool during Reconfiguration operations
INFO: The logs of current operation can be found at location /var/adm/vx/dmpdr_20181128_1638.log
INFO: Collecting OS Version Info
ERROR: Collecting LeadVille Version - Failed..Because,command [modinfo | grep "SunFC FCP"] failed with the Error:[]
INFO: Collecting SF Product version Info
INFO: Checking if MPXIO is enabled

DESCRIPTION:
In Solaris 11.4, the module "fcp (SunFC FCP)" has been renamed to "fcp (Fibre Channel SCSI ULP)". The module is used during DMPDR testing for refreshing/checking the FC devices. Because the name of the module has changed, a failure is observed while running the "dmpdr -o refresh" command.

RESOLUTION:
The code has been changed to take care of the name change between Solaris 11.3 and 11.4.

PATCH ID:7.3.1.100

* INCIDENT NO:3932464 TRACKING ID:3926976

SYMPTOM:
An excessive number of connections are found in the open state, causing an FD leak and eventually resulting in license errors being reported.

DESCRIPTION:
vxconfigd reports license errors because it fails to open the license files. The failure to open is due to FD exhaustion caused by excessive FIFO connections left in the open state. The FIFO connections are used by clients (vx commands) to communicate with vxconfigd; usually they are closed once the client exits. One such client, the "vxdclid" daemon, connects frequently and leaves the connection in the open state, causing the FD leak. This issue is applicable to the Solaris platform only.

RESOLUTION:
A library call in one of the APIs was leaving the connection in the open state on exit; this is fixed.

* INCIDENT NO:3933874 TRACKING ID:3852146

SYMPTOM:
In a CVM cluster, when importing a shared disk group specifying both the -c and -o noreonline options, the following error may be returned:
VxVM vxdg ERROR V-5-1-10978 Disk group : import failed: Disk for disk group not found.

DESCRIPTION:
The -c option updates the disk ID and disk group ID in the private region of the disks in the disk group being imported.
Such updated information is not yet seen by the slave because the disks have not been re-onlined (given that the noreonline option is specified). As a result, the slave cannot identify the disk(s) based on the updated information sent from the master, causing the import to fail with the error "Disk for disk group not found".

RESOLUTION:
The code is modified to handle the "-c" and "-o noreonline" options together.

* INCIDENT NO:3933875 TRACKING ID:3872585

SYMPTOM:
A system running with VxFS and VxVM panics with a storage key exception, with the following stack:
 simple_lock
 dispatch
 flih_util
 touchrc
 pin_seg_range
 pin_com
 pinx_plock
 plock_pinvec
 plock
 mfspurr_sc_flih01

DESCRIPTION:
The xntpd process, running from a VxFS file system, could panic with a storage key exception. The xntpd binary page-faulted and did an IO, after which the storage key exception was detected by the OS because it could not locate its keyset. Code review found that in a few error cases in VxVM, the storage keys may not be restored after they are replaced.

RESOLUTION:
Restore the storage keys even in the error cases in the vxio and DMP layers.

* INCIDENT NO:3933877 TRACKING ID:3914789

SYMPTOM:
The system may panic when reclaiming on the secondary in a VVR (Veritas Volume Replicator) environment. It is due to accessing an invalid address; the error message is similar to "data access MMU miss".

DESCRIPTION:
VxVM maintains a linked list to keep memory segment information. When accessing its content at a certain offset, the linked list is traversed. Due to a code defect, when the offset is equal to the segment chunk size, the end of that segment is returned instead of the start of the next segment. This can result in silent memory corruption because memory outside the segment boundary is accessed. The system can panic when the out-of-boundary address is not yet allocated.

RESOLUTION:
Code changes have been made to fix the out-of-boundary access.

* INCIDENT NO:3933878 TRACKING ID:3918408

SYMPTOM:
Data corruption when volume grow is attempted on thin reclaimable disks whose space was just freed.

DESCRIPTION:
When space in a volume is freed by deleting some data or subdisks, the corresponding subdisks are marked for reclamation. It might take some time for the periodic reclaim task to start if it is not issued manually. In the meantime, if the same disks are used for growing another volume, the reclaim task can go ahead and overwrite the data written on the new volume. Because of this race condition between the reclaim and volume grow operations, data corruption occurs.

RESOLUTION:
Code changes are done to handle the race condition between the reclaim and volume grow operations. Reclaim is also skipped for those disks which have already become part of the new volume.

* INCIDENT NO:3933880 TRACKING ID:3864063

SYMPTOM:
Application I/O hangs after the Master Pause command is issued.

DESCRIPTION:
Some flags (VOL_RIFLAG_DISCONNECTING or VOL_RIFLAG_REQUEST_PENDING) in the VVR (Veritas Volume Replicator) kernel are not cleared because of a race between the Master Pause SIO and the Error Handler SIO. This causes the RU (Replication Update) SIO to fail to proceed, which leads to the I/O hang.

RESOLUTION:
The code is modified to handle the race condition.

* INCIDENT NO:3933882 TRACKING ID:3865721

SYMPTOM:
vxconfigd hangs in a transaction while pausing replication in a Clustered VVR environment.
DESCRIPTION:
In a Clustered VVR (CVM VVR) environment, while pausing replication which is in DCM (Data Change Map) mode, the master pause SIO (staging IO) cannot finish serialization because there are metadata shipping SIOs in the throttle queue with the activesio count added. Meanwhile, because the master pause SIO's SERIALIZE flag is set, the DCM flush SIO cannot be started to flush the throttle queue. This leads to a deadlocked hang state. Since the master pause routine needs to synchronize with the transaction routine, vxconfigd hangs in the transaction.

RESOLUTION:
Code changes were made to flush the metadata shipping throttle queue if the master pause SIO cannot finish serialization.

* INCIDENT NO:3933883 TRACKING ID:3867236

SYMPTOM:
Application IO hang happens after issuing the Master Pause command.

DESCRIPTION:
The flag VOL_RIFLAG_REQUEST_PENDING in the VVR (Veritas Volume Replicator) kernel is not cleared because of a race between the Master Pause SIO and the RVWRITE1 SIO, which causes the RU (Replication Update) SIO to fail to proceed, thereby causing the IO hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* INCIDENT NO:3933890 TRACKING ID:3879324

SYMPTOM:
The VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool fails to handle a busy device problem while LUNs are removed from the OS.

DESCRIPTION:
OS devices may still be busy after removing them from the OS. This causes the 'luxadm -e offline' operation to fail and leaves stale entries in the 'vxdisk list' output like:
emc0_65535 auto - - error
emc0_65536 auto - - error

RESOLUTION:
Code changes have been done to address the busy device issue.

* INCIDENT NO:3933893 TRACKING ID:3890602

SYMPTOM:
The OS cfgadm command hangs after reboot when hundreds of devices are under DMP's control.

DESCRIPTION:
DMP generates the same entry for each of the (8) partitions. The large number of vxdmp properties that devfsadmd has to touch causes anything touching devlinks to temporarily hang behind it.

RESOLUTION:
Code changes have been done to reduce the property count by a factor of 8.

* INCIDENT NO:3933894 TRACKING ID:3893150

SYMPTOM:
The VxDMP (Veritas Dynamic Multi-Pathing) vxdmpadm native ls command sometimes doesn't report imported disks' pool names.

DESCRIPTION:
When a Solaris pool is imported with extra options like -d or -R, the paths in 'zpool status' can be full disk paths. The 'vxdmpadm native ls' command does not handle this situation and hence fails to report the pool name.

RESOLUTION:
Code changes have been made to correctly handle full disk paths and obtain the pool name.

* INCIDENT NO:3933897 TRACKING ID:3907618

SYMPTOM:
vxdisk resize leads to data corruption on a file system with an MSDOS-labelled disk having the VxVM sliced format.

DESCRIPTION:
vxdisk resize changes the geometry on the device if required. When vxdisk resize is in progress, absolute offsets, i.e. offsets starting from the start of the device, are used. For an MSDOS-labelled disk, the full disk is represented by slice 4, not slice 0. Thus, when IO is scheduled on the device, an extra 32 sectors get added to the IO, which is not required since the IO already starts from the start of the device. This leads to data corruption, since the IO on the device is shifted by 32 sectors.

RESOLUTION:
Code changes have been made not to add the 32 sectors to the IO when vxdisk resize is in progress, to avoid corruption.
* INCIDENT NO:3933899 TRACKING ID:3910675

SYMPTOM:
Disks directly attached to the system cannot be exported in an FSS environment.

DESCRIPTION:
In some cases, the UDID (Unique Disk Identifier) of a disk directly connected to a cluster node might not be globally unique, i.e. a different disk directly connected to a different node in the cluster might have a similar UDID. This causes issues while exporting the device in an FSS (Flexible Storage Sharing) environment, since two different disks have the same UDID, which is not expected.

RESOLUTION:
A new option "islocal=yes" has been added to the vxddladm addjbod command so that the hostguid is appended to the UDID to make it unique.

* INCIDENT NO:3933900 TRACKING ID:3915523

SYMPTOM:
A local disk from another node belonging to a private DG is exported to the node when a private DG is imported on the current node.

DESCRIPTION:
When a DG is imported, all the disks belonging to the DG are automatically exported to the current node to make sure that the DG gets imported; this is done to provide the same behavior for local disks as for SAN disks. Since all disks in the DG are exported, disks that belong to a different private DG with the same DG name on another node get exported to the current node as well. This leads to the wrong disk being selected while the DG gets imported.

RESOLUTION:
The DGID (disk group ID) is used instead of the DG name to decide whether a disk needs to be exported.

* INCIDENT NO:3933901 TRACKING ID:3915953

SYMPTOM:
Enabling dmp_native_support using 'vxdmpadm settune dmp_native_support=on' takes too long to complete.

DESCRIPTION:
When dmp_native_support is enabled, all zpools are imported so that they come under DMP, so that when they are imported later with native support on they are under DMP. The command used for this was taking a long time; the command has been modified in the script to reduce the time.

RESOLUTION:
Instead of searching the whole /dev/vx/dmp directory to import the zpools, import them using their specific attributes.

* INCIDENT NO:3933903 TRACKING ID:3918356

SYMPTOM:
zpools are imported automatically when DMP native support is set to on, which may lead to zpool corruption.

DESCRIPTION:
When DMP native support is set to on, all zpools are imported using DMP devices so that when the import happens for the same zpool again, it is automatically imported using the DMP device. In a clustered environment, if the import of the same zpool is triggered on two different nodes at the same time, it can lead to zpool corruption. A way needs to be provided so that zpools are not imported automatically.

RESOLUTION:
Changes are made to provide a way for the customer not to import the zpools if required. Set the variable auto_import_exported_pools to off in the file /var/adm/vx/native_input as below:
bash:~# cat /var/adm/vx/native_input
auto_import_exported_pools=off

* INCIDENT NO:3933904 TRACKING ID:3921668

SYMPTOM:
Running the vxrecover command with the -m option fails when run on a slave node with the message "The command can be executed only on the master."

DESCRIPTION:
The issue occurs because the vxrecover -g <dg> -m command on shared disk groups is not shipped from the CVM (Cluster Volume Manager) slave node to the master node using the command shipping framework.

RESOLUTION:
Implemented a code change to ship the vxrecover -m command to the master node when it is triggered from a slave node.

* INCIDENT NO:3933907 TRACKING ID:3873123

SYMPTOM:
When a remote disk on a node is an EFI disk, vold enable fails.
The following message gets logged, eventually causing vxconfigd to go into the disabled state:
Kernel and on-disk configurations don't match; transactions are disabled.

DESCRIPTION:
This is because one of the cases of an EFI remote disk is not properly handled in the disk recovery part when vxconfigd is enabled.

RESOLUTION:
Code changes have been done to set the EFI flag on the darec in the recovery code.

* INCIDENT NO:3933910 TRACKING ID:3910228

SYMPTOM:
Registration of GAB (Global Atomic Broadcast) port u fails on slave nodes after multiple new devices are added to the system.

DESCRIPTION:
vxconfigd sends a command to GAB for port u registration and waits for a response from GAB. If vxconfigd is interrupted by any other module apart from GAB during this timeframe, it will not receive the signal from GAB indicating successful registration. Since the signal is not received, vxconfigd believes the registration did not succeed and treats it as a failure.

RESOLUTION:
Mask the signals which vxconfigd can receive before waiting for the signal from GAB for the registration of GAB port u.

* INCIDENT NO:3933913 TRACKING ID:3905030

SYMPTOM:
The system hangs when installing/uninstalling VxVM, with one of the following stacks:
 genunix:cv_wait+0x3c()
 genunix:ndi_devi_enter+0x54()
 genunix:devi_config_one+0x114()
 genunix:ndi_devi_config_one+0xd0()
 genunix:resolve_pathname_noalias+0x244()
 genunix:resolve_pathname+0x10()
 genunix:ldi_vp_from_name+0x100()
 genunix:ldi_open_by_name+0x40()
 vxio:vol_ldi_init+0x60()
 vxio:vol_attach+0x5c()
Or
 genunix:cv_wait+0x38
 genunix:ndi_devi_enter
 genunix:devi_config_one
 genunix:ndi_devi_config_one
 genunix:resolve_pathname_noalias
 genunix:resolve_pathname
 genunix:ldi_vp_from_name
 genunix:ldi_open_by_name
 vxdmp:dmp_setbootdev
 vxdmp:dmp_attach

DESCRIPTION:
According to Oracle, ldi_open_by_name should not be called from a device's attach, detach, or power entry point; doing so could result in a system crash or deadlock.

RESOLUTION:
Code changes have been done to avoid calling ldi_open_by_name during device attach.

* INCIDENT NO:3936428 TRACKING ID:3932714

SYMPTOM:
With DMP_FAST_RECOVERY turned off, performing IO directly on a dmpnode whose PGR key is reserved by another host panicked the OS with the following stack:
 void unix:panicsys+0x40()
 unix:vpanic_common+0x78()
 void unix:panic()
 size_t unix:miter_advance+0x36c()
 unix:miter_next_paddr()
 int unix:as_pagelock+0x108()
 genunix:physio()
 int scsi:scsi_uscsi_handle_cmdf+0x254()
 int scsi:scsi_uscsi_handle_cmd+0x1c()
 int ssd:ssd_ssc_send+0x2a8()
 int ssd:ssdioctl+0x13e4()
 genunix:cdev_ioctl()
 int vxdmp:dmp_scsi_ioctl+0x1c0()
 int vxdmp:dmp_send_scsireq+0x74()
 int vxdmp:dmp_bypass_strategy+0x98()
 void vxdmp:dmp_path_okay+0xf0()
 void vxdmp:dmp_error_action+0x68()
 vxdmp:dmp_process_scsireq()
 void vxdmp:dmp_daemons_loop+0x164()

DESCRIPTION:
DMP issues a USCSI_CMD IOCTL to the SSD driver to fire the IO request in case the IO failed in the conventional way, the path status is okay, and DMP_FAST_RECOVERY is set. Because the IO request is performed on the dmpnode directly instead of coming from the VxIO driver, the IO data buffer is not copied into kernel space and has a user virtual address. The SSD driver cannot map the user virtual address to a kernel address without the user process information, hence the panic.

RESOLUTION:
Code changes have been made to skip the USCSI_CMD IOCTL and return an error in case a user address is specified.

* INCIDENT NO:3937541 TRACKING ID:3911930

SYMPTOM:
Valid PGR operations sometimes fail on a dmpnode.
DESCRIPTION:
As part of the PGR operations, if the inquiry command finds that PGR is not supported on the dmpnode, the flag PGR_FLAG_NOTSUPPORTED is set on the dmpnode. Further PGR operations check this flag and issue PGR commands only if this flag is NOT set. This flag remains set even if the hardware is changed so as to support PGR.

RESOLUTION:
A new command (namely enablepr) is provided in the vxdmppr utility to clear this flag on the specified dmpnode.

* INCIDENT NO:3937545 TRACKING ID:3932246

SYMPTOM:
The vxrelayout operation fails to complete.

DESCRIPTION:
If connectivity to the underlying storage is lost while a volume relayout is in progress, some intermediate volumes for the relayout can be left in a disabled or otherwise undesirable state due to I/O errors. Once the storage connectivity is back, such intermediate volumes should be recovered by the vxrecover utility and the vxrelayout operation resumed automatically. Due to a bug in the vxrecover utility, the volumes remained in the disabled state, and the vxrelayout operation did not complete.

RESOLUTION:
Changes are done in the vxrecover utility to enable the intermediate volumes.

* INCIDENT NO:3937549 TRACKING ID:3934910

SYMPTOM:
IO errors on the data volume or file system happen after some cycles of snapshot creation/removal with DG reimport.

DESCRIPTION:
After removal of a snapshot of the data volume followed by a DG reimport, the DRL map stays active instead of being inactivated. When a new snapshot is created, the DRL is re-enabled and a new DRL map is allocated with the first write to the data volume. The original active DRL map is not used and is leaked. After some such cycles, the extents of the DCO volume are exhausted by the active-but-unused DRL maps, after which no more DRL maps can be allocated and the IOs fail or cannot be issued on the data volume.

RESOLUTION:
Code changes are done to inactivate the DRL map if the DRL is disabled during volume start, so that it can be safely reused later.

* INCIDENT NO:3937550 TRACKING ID:3935232

SYMPTOM:
Replication and IO hang may happen on the new master node during master takeover.

DESCRIPTION:
If a log owner change kicks in while a master switch is in progress, the flag VOLSIO_FLAG_RVC_ACTIVE is set by the log owner change SIO. The RVG (Replicated Volume Group) recovery initiated by the master switch clears the flag VOLSIO_FLAG_RVC_ACTIVE after the RVG recovery is done. When the log owner change completes, because the flag VOLSIO_FLAG_RVC_ACTIVE has already been cleared, resetting the flag VOLOBJ_TFLAG_VVR_QUIESCE is skipped. The presence of the flag VOLOBJ_TFLAG_VVR_QUIESCE leaves replication and application IO on the RVG permanently pending.

RESOLUTION:
Code changes have been done to make the log owner change wait until the master switch has completed.

* INCIDENT NO:3937808 TRACKING ID:3931936

SYMPTOM:
In an FSS (Flexible Storage Sharing) environment, after restarting a slave node, VxVM commands on the master node hang, and failed disks on the slave node cannot rejoin the disk group.

DESCRIPTION:
When lost remote disks on the slave node come back, the operations to online these disks and add them to the disk group are performed on the master node. Disk online includes operations on both the master and slave nodes. On the slave node these disks should be offlined and then re-onlined, but due to a code defect the re-online was missed, leaving these disks stuck in the re-onlining state. The subsequent add-disk-to-disk-group operation needs to issue private region IOs on the disk, and these IOs are shipped to the slave node to complete.
Because the disks are in the re-onlining state, a busy error gets returned and the remote IOs keep retrying, hence the VxVM command hang on the master node.

RESOLUTION:
Code changes have been made to fix the issue.

* INCIDENT NO:3937811 TRACKING ID:3935974

SYMPTOM:
While communicating with a client process, the vxrsyncd daemon terminates; after some time it gets restarted, or a reboot may be required to start it.

DESCRIPTION:
When the client process shuts down abruptly and the vxrsyncd daemon attempts to write on the client socket, a SIGPIPE signal is generated. The default action for this signal is to terminate the process, hence vxrsyncd gets terminated.

RESOLUTION:
The SIGPIPE signal is now handled in order to prevent the termination of vxrsyncd.

* INCIDENT NO:3938392 TRACKING ID:3909630

SYMPTOM:
An OS panic happens with the following stack after some DMP devices are migrated to TPD (Third Party Driver) devices:
 void vxdmp:dmp_register_stats+0x120()
 int vxdmp:gendmpstrategy+0x244()
 vxdmp:dmp_restart_io()
 int vxdmp:dmp_process_deferbp+0xec()
 void vxdmp:dmp_process_deferq+0x68()
 void vxdmp:dmp_daemons_loop+0x160()

DESCRIPTION:
When updating the CPU index for a new path migrated to TPD, IOs on this path are unquiesced before the last CPU's stats table is enlarged. As a result, while registering IO stats for a restarted IO on this path, an access to the last CPU's stats table can be an invalid memory access and a panic may happen.

RESOLUTION:
Code changes have been made to fix this issue.

* INCIDENT NO:3944743 TRACKING ID:3945411

SYMPTOM:
The system was stuck in a cyclic reboot after enabling DMP (Dynamic Multi-Pathing) native support for ZFS boot devices, with the following errors:
NOTICE: VxVM vxdmp V-5-0-1990 driver version VxVM Multipathing Driver installed
WARNING: VxVM vxdmp V-5-3-2103 dmp_claim_device: Boot device not found in OS tree
NOTICE: zfs_parse_bootfs: error 19
Cannot mount root on rpool/40 fstype zfs
panic[cpu0]/thread=20012000: vfs_mountroot: cannot mount root
Warning - stack not written to the dumpbuf
000000002000fa00 genunix:main+1dc ()

DESCRIPTION:
The boot device was under DMP control after enabling DMP native support. DMP failed to get its device number by inquiring about the device under OS control, hence the issue.

RESOLUTION:
Code changes were made to get the correct device number of the boot device.

INCIDENTS FROM OLD PATCHES:
---------------------------
NONE