infoscale-rhel7.3_x86_64-Patch-7.2.0.100

 Basic information
Release type: Patch
Release date: 2017-04-18
OS update support: RHEL7 x86-64 Update 3
Technote: None
Documentation: None
Download size: 122.88 MB
Checksum: 1665930145

 Applies to one or more of the following products:
InfoScale Availability 7.2 On RHEL7 x86-64
InfoScale Enterprise 7.2 On RHEL7 x86-64
InfoScale Foundation 7.2 On RHEL7 x86-64
InfoScale Storage 7.2 On RHEL7 x86-64

 Obsolete patches, incompatibilities, superseded patches, or other requirements:

This patch supersedes the following patches:          Release date
fsadv-rhel7_x86_64-Patch-7.2.0.100 (obsolete)         2017-04-25
odm-rhel7_x86_64-Patch-7.2.0.100 (obsolete)           2017-04-25
vm-rhel7_x86_64-Patch-7.2.0.100 (obsolete)            2017-04-24
fs-rhel7_x86_64-Patch-7.2.0.100 (obsolete)            2017-04-24
vxfen-rhel7_x86_64-Patch-7.2.0.100 (obsolete)         2017-04-18
amf-rhel7_x86_64-Patch-7.2.0.100 (obsolete)           2017-04-17
gab-rhel7_x86_64-Patch-7.2.0.100 (obsolete)           2017-04-17
llt-rhel7_x86_64-Patch-7.2.0.200 (obsolete)           2017-04-17
llt-rhel7_x86_64-Patch-7.2.0.100 (obsolete)           2017-03-08

 Fixes the following incidents:
3906300, 3908392, 3909937, 3909938, 3909939, 3909940, 3909941, 3909943, 3909946, 3909992, 3910000, 3910083, 3910084, 3910085, 3910086, 3910088, 3910090, 3910093, 3910094, 3910095, 3910096, 3910097, 3910098, 3910101, 3910103, 3910105, 3910356, 3910426, 3910586, 3910588, 3910590, 3910591, 3910592, 3910593, 3910704, 3911290, 3911718, 3911719, 3911732, 3911926, 3911964, 3911968, 3912529, 3912532, 3912988, 3912989, 3912990, 3913004, 3913423, 3913424, 3913425, 3914384, 3915587

 Patch ID:
VRTSvxfs-7.2.0.100-RHEL7
VRTSodm-7.2.0.100-RHEL7
VRTSfsadv-7.2.0.100-RHEL7
VRTSllt-7.2.0.200-RHEL7
VRTSgab-7.2.0.100-RHEL7
VRTSvxfen-7.2.0.100-RHEL7
VRTSamf-7.2.0.100-RHEL7
VRTSdbac-7.2.0.100-RHEL7
VRTSvxvm-7.2.0.100-RHEL7
VRTSaslapm-7.2.0.200-RHEL7

Readme file
                          * * * READ ME * * *
                       * * * InfoScale 7.2 * * *
                         * * * Patch 100 * * *
                         Patch Date: 2017-04-13

Note: The patch installer was updated on May 9th, 2017 to fix the following incident:
Incident: 3917542

SYMPTOM:
The installer fails to install the patch infoscale-rhel7.3_x86_64-Patch-7.2.0.100
on Oracle Linux 7 Update 3.

DESCRIPTION:
The installer aborts the installation of infoscale-rhel7.3_x86_64-Patch-7.2.0.100
on Oracle Linux 7 Update 3 with the error message 'No pkg object 
defined for pkg VRTSaslapm72 and padv OL7x8664'.

RESOLUTION:
The installer code has been modified to support installation of the patch
infoscale-rhel7.3_x86_64-Patch-7.2.0.100 on Oracle Linux 7 Update 3.


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
InfoScale 7.2 Patch 100


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL7 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSamf
VRTSaslapm
VRTSdbac
VRTSfsadv
VRTSgab
VRTSllt
VRTSodm
VRTSvxfen
VRTSvxfs
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas InfoScale Foundation 7.2
   * Veritas InfoScale Storage 7.2
   * Veritas InfoScale Availability 7.2
   * Veritas InfoScale Enterprise 7.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-7.2.0.100-RHEL7
* 3909992 (3898069) System panic may happen in dmp_process_stats routine.
* 3910000 (3893756) 'vxconfigd' holds a task device for a long time; after the kernel counter wraps around, this may cause a boundary issue.
* 3910426 (3868533) IO hang happens because of a deadlock situation.
* 3910586 (3852146) Shared DiskGroup(DG) fails to import when "-c" and "-o noreonline" options are specified together.
* 3910588 (3868154) When DMP Native Support is set to ON, a dmpnode with multiple VGs cannot be listed properly in the 'vxdmpadm native ls' command.
* 3910590 (3878030) Enhance VxVM DR tool to clean up OS and VxDMP device trees without user interaction.
* 3910591 (3867236) Application IO hang happens because of a race between Master Pause SIO(Staging IO) and RVWRITE1 SIO.
* 3910592 (3864063) Application IO hang happens because of a race between Master Pause SIO(Staging IO) and Error Handler SIO.
* 3910593 (3879324) VxVM DR tool fails to handle the busy device problem while LUNs are removed from the OS.
* 3912529 (3878153) VVR 'vradmind' daemon core dump.
* 3912532 (3853144) VxVM mirror volume's stale plex is incorrectly marked as "Enable Active" after it comes back.
Patch ID: VRTSodm-7.2.0.100-RHEL7
* 3910095 (3757609) CPU usage going high because of contention over ODM_IO_LOCK
* 3911964 (3907933) ODM module failed to load on SLES12 SP2.
Patch ID: VRTSdbac-7.2.0.100-RHEL7
* 3915587 (3915585) Veritas Oracle Real Application Cluster does not support Red Hat Enterprise 
Linux 7 Update 3 (RHEL7.3).
Patch ID: VRTSfsadv-7.2.0.100-RHEL7
* 3910704 (3729030) The fsdedupschd daemon failed to start on RHEL7.
Patch ID: VRTSvxfen-7.2.0.100-RHEL7
* 3913425 (3896877) Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).
Patch ID: VRTSvxfs-7.2.0.100-RHEL7
* 3909937 (3908954) Some writes could be missed causing data loss.
* 3909938 (3905099) VxFS unmount panicked in deactivate_super().
* 3909939 (3906548) Start up VxFS through systemd after local file systems are mounted read/write.
* 3909940 (3894712) ACL permissions are not inherited correctly on cluster 
file system.
* 3909941 (3868609) High CPU usage seen because of vxfs thread 
while applying Oracle redo logs
* 3909943 (3729030) The fsdedupschd daemon failed to start on RHEL7.
* 3909946 (3685391) Execute permissions for a file not honored correctly.
* 3910083 (3707662) Race between reorg processing and fsadm timer thread (alarm expiry) leads to panic in vx_reorg_emap.
* 3910084 (3812330) slow ls -l across cluster nodes
* 3910085 (3779916) vxfsconvert fails to upgrade the layout version for a vxfs file system with a large number of inodes.
* 3910086 (3830300) Degraded CPU performance during backup of Oracle archive logs
on CFS vs local filesystem
* 3910088 (3855726) Panic in vx_prot_unregister_all().
* 3910090 (3790721) High cpu usage caused by vx_send_bcastgetemapmsg_remaus
* 3910093 (1428611) 'vxcompress' can spew many GLM block lock messages over the 
LLT network.
* 3910094 (3879310) The file system may get corrupted after a failed vxupgrade.
* 3910096 (3757609) CPU usage going high because of contention over ODM_IO_LOCK
* 3910097 (3817734) A direct command to run fsck with the -y|Y option was mentioned in the message displayed to the user when the file system mount fails.
* 3910098 (3861271) Missing an inode clear operation when a Linux inode is being de-initialized on
SLES11.
* 3910101 (3846521) "cp -p" fails if the modification time in nanoseconds has 10 digits.
* 3910103 (3817734) A direct command to run fsck with the -y|Y option was mentioned in the message displayed to the user when the file system mount fails.
* 3910105 (3907902) System panic observed due to race between dalloc off thread
and getattr thread.
* 3910356 (3908785) System panic observed because of null page address in writeback structure in case of kswapd process.
* 3911290 (3910526) fsadm fails with error number 28 during resize operation
* 3911718 (3905576) CFS hang during a cluster wide freeze
* 3911719 (3909583) Disable partitioning of directory if directory size is greater than upper threshold value.
* 3911732 (3896670) Intermittent CFS hang like situation with many CFS pglock 
grant messages pending on LLT layer
* 3911926 (3901318) VxFS module failed to load on RHEL7.3.
* 3911968 (3911966) VxFS module failed to load on SLES12 SP2.
* 3912988 (3912315) EMAP corruption while freeing up the extent
* 3912989 (3912322) vxfs tunable max_seqio_extent_size cannot be tuned to any 
value less than 32768.
* 3912990 (3912407) CFS hang on VxFS 7.2 while thread on a CFS node waits for EAU 
delegation.
* 3913004 (3911048) LDH corrupt and filesystem hang.
* 3914384 (3915578) vxfs module fails to load after reboot.
Patch ID: VRTSamf-7.2.0.100-RHEL7
* 3908392 (3896877) Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).
Patch ID: VRTSgab-7.2.0.100-RHEL7
* 3913424 (3896877) Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).
Patch ID: VRTSllt-7.2.0.200-RHEL7
* 3913423 (3896877) Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).
Patch ID: VRTSllt-7.2.0.100-RHEL7
* 3906300 (3905430) Application IO hangs in case of FSS with LLT over RDMA during heavy data transfer.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSvxvm-7.2.0.100-RHEL7

* 3909992 (Tracking ID: 3898069)

SYMPTOM:
System panic may happen in dmp_process_stats routine with the following stack:

dmp_process_stats+0x471/0x7b0 
dmp_daemons_loop+0x247/0x550 
kthread+0xb4/0xc0
ret_from_fork+0x58/0x90

DESCRIPTION:
While aggregating the pending IOs per DMP path over all CPUs, an out-of-bounds
access occurred because of a wrong index into the statistics table, which could
cause a system panic.

RESOLUTION:
Code changes have been done to correct the wrong index.

* 3910000 (Tracking ID: 3893756)

SYMPTOM:
Under certain circumstances, after vxconfigd has been running for a long time, a task might be left dangling in the system. This can be seen by issuing 'vxtask -l list'.

DESCRIPTION:
- voltask_dump() gets a task id by calling 'vol_task_dump' in the kernel (ioctl) as the minor number of the taskdev.
- The task id (or minor number) increases by 1 when a new task is registered.
- Task ids start from 160 and wrap around when they reach 65536. A global counter 'vxtask_next_minor' indicates the next task id.
- When vxconfigd opens a taskdev by calling voltask_dump() and holds it, it gets a task id too (say 165). From then on, a
  vnode with this minor number (major=273, minor=165) exists in the kernel.
- As time goes by, the task id increases, reaches 65536, then wraps around and starts from 160 again.
- When the task id passes 165 again for a CLI command (say 'vxdisk -o thin,fssize list'), that command's taskdev gets the same major
  and minor number (165) as vxconfigd's.
- At the same time, vxconfigd is still holding this vnode. vxdisk does not know this; it opens the taskdev and registers a task
  structure in the kernel hash table. This adds a reference to the same vnode that vxconfigd is holding, so the reference count of
  the common snode becomes 2.
- When vxdisk (fsusage_collect_stats_task) has done its job, it calls voltask_complete->close()->spec_close(), trying to remove this
  task (165). But the OS function spec_close() (from specfs) gets in the way: it checks the reference count of the common snode
  (vnode->v_data->snode->s_commonvp->v_data->common snode). spec_close() finds that the value of s_count is 2, so it only drops the
  reference by one and returns success to the caller, without calling the actual closing function 'volsclose()'.
- Since volsclose() is not called by spec_close(), its subsequent functions are not called either: volsclose_real()->voltask_close()
  ->vxtask_rm_task(). Among those, vxtask_rm_task() does the actual job of removing a task from the kernel hash table.
- After calling close(), fsusage_collect_stats_task returns and the vxdisk command exits. From this point on, the task is left
  dangling in the kernel hash table until vxconfigd exits.

RESOLUTION:
The source has been changed so that vxconfigd no longer holds the task device.

* 3910426 (Tracking ID: 3868533)

SYMPTOM:
IO hang happens when starting replication. The VXIO daemon hangs with a stack
like the following:

vx_cfs_getemap at ffffffffa035e159 [vxfs]
vx_get_freeexts_ioctl at ffffffffa0361972 [vxfs]
vxportalunlockedkioctl at ffffffffa06ed5ab [vxportal]
vxportalkioctl at ffffffffa06ed66d [vxportal]
vol_ru_start at ffffffffa0b72366 [vxio]
voliod_iohandle at ffffffffa09f0d8d [vxio]
voliod_loop at ffffffffa09f0fe9 [vxio]

DESCRIPTION:
While performing DCM replay with the Smart Move feature enabled, the VxIO
kernel needs to issue an IOCTL to the VxFS kernel to get the file system free
region. The VxFS kernel needs to clone the map by issuing IO to the VxIO
kernel to complete this IOCTL. If an RLINK disconnection happens just at this
time, the RV is serialized to complete the disconnection. As the RV is
serialized, all IOs, including the clone map IO from VxFS, are queued to
rv_restartq, hence the deadlock.

RESOLUTION:
Code changes have been made to handle the deadlock situation.

* 3910586 (Tracking ID: 3852146)

SYMPTOM:
Shared DiskGroup fails to import when "-c" and "-o noreonline" options are
specified together with the below error:

VxVM vxdg ERROR V-5-1-10978 Disk group <dgname>: import failed:
Disk for disk group not found

DESCRIPTION:
The -c option will update the disk ID and disk group ID on the private region 
of the disks in the disk group being imported. Such updated information is not 
yet seen by the slave because the disks have not been re-onlined (given 
that noreonline option is specified). As a result, the slave cannot identify 
the disk(s) based on the updated information sent from the master, causing the 
import to fail with the error Disk for disk group not found.

RESOLUTION:
Code changes have been made so that the "-c" and "-o noreonline" options work
correctly together.
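
For illustration, a shared import that combines both options might look like
the following sketch (the disk group name "mydg" is a placeholder; other flags
depend on your configuration):

    # vxdg -s -c -o noreonline import mydg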

* 3910588 (Tracking ID: 3868154)

SYMPTOM:
When DMP Native Support is set to ON, and if a dmpnode has multiple VGs,
'vxdmpadm native ls' shows incorrect VG entries for dmpnodes.

DESCRIPTION:
When DMP Native Support is set to ON, multiple VGs can be created on a disk, as
Linux supports creating a VG on a whole disk as well as on a partition of a
disk. This possibility was not handled in the code, hence the output of
'vxdmpadm native ls' was incorrect.

RESOLUTION:
The code now handles multiple VGs on a single disk.

* 3910590 (Tracking ID: 3878030)

SYMPTOM:
Enhance VxVM(Veritas Volume Manager) DR(Dynamic Reconfiguration) tool to 
clean up OS and VxDMP(Veritas Dynamic Multi-Pathing) device trees without 
user interaction.

DESCRIPTION:
When users add or remove LUNs, stale entries in the OS or VxDMP device trees
can prevent VxVM from discovering the changed LUNs correctly. Under certain
conditions it can even cause the VxVM vxconfigd process to core dump, and
users have to reboot the system to restart vxconfigd.
VxVM provides the DR tool to help users add or remove LUNs properly, but it
requires user input during operations.

RESOLUTION:
The VxVM DR tool has been enhanced to accept an '-o refresh' option that
cleans up the OS and VxDMP device trees without user interaction.

* 3910591 (Tracking ID: 3867236)

SYMPTOM:
Application IO hang happens after issuing Master Pause command.

DESCRIPTION:
The flag VOL_RIFLAG_REQUEST_PENDING in the VVR (Veritas Volume Replicator)
kernel is not cleared because of a race between the Master Pause SIO and the
RVWRITE1 SIO, which prevents the RU (Replication Update) SIO from proceeding
and causes the IO hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* 3910592 (Tracking ID: 3864063)

SYMPTOM:
Application IO hang happens after issuing Master Pause command.

DESCRIPTION:
Some flags (VOL_RIFLAG_DISCONNECTING or VOL_RIFLAG_REQUEST_PENDING) in the VVR
(Veritas Volume Replicator) kernel are not cleared because of a race between
the Master Pause SIO and the Error Handler SIO, which prevents the RU
(Replication Update) SIO from proceeding and causes the IO hang.

RESOLUTION:
Code changes have been made to handle the race condition.

* 3910593 (Tracking ID: 3879324)

SYMPTOM:
The VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool fails to
handle the busy device problem when LUNs are removed from the OS.

DESCRIPTION:
OS devices may still be busy after removal from the OS. This causes the
'luxadm -e offline <disk>' operation to fail and leaves stale entries in the
'vxdisk list' output, like:
emc0_65535   auto            -            -            error
emc0_65536   auto            -            -            error

RESOLUTION:
Code changes have been done to address busy devices issue.

* 3912529 (Tracking ID: 3878153)

SYMPTOM:
The VVR (Veritas Volume Replicator) 'vradmind' daemon core dumps.

DESCRIPTION:
Under certain circumstances the 'vradmind' daemon may core dump while freeing
a variable allocated on the stack.

RESOLUTION:
Code change has been done to address the issue.

* 3912532 (Tracking ID: 3853144)

SYMPTOM:
A VxVM (Veritas Volume Manager) mirror volume's stale plex is incorrectly
marked as "Enable Active" after it comes back, which prevents resync of the
stale plex from the up-to-date ones. It can cause data corruption if the stale
plex happens to be the preferred or selected plex, or if the read policy
"round" is set for the volume.

DESCRIPTION:
When a volume plex is detached abruptly while vxconfigd is unavailable, VxVM
kernel logging records the detach activity along with its detach transaction
id for future resync or recovery. Because of a code defect, the detach
transaction id could be wrongly selected in certain situations.

RESOLUTION:
Code changes have been done to correctly select the detach transaction id.

Patch ID: VRTSodm-7.2.0.100-RHEL7

* 3910095 (Tracking ID: 3757609)

SYMPTOM:
High CPU usage because of contention over ODM_IO_LOCK

DESCRIPTION:
While performing ODM IO, we take ODM_IO_LOCK to update some of the ODM
counters, which leads to contention from multiple iodones trying to update
these counters at the same time. This results in high CPU usage.

RESOLUTION:
Code modified to remove the lock contention.

* 3911964 (Tracking ID: 3907933)

SYMPTOM:
ODM module failed to load on SLES12 SP2.

DESCRIPTION:
SLES12 SP2 is a new release, and the ODM module did not yet support it;
therefore the module failed to load.

RESOLUTION:
Added ODM support for SLES12 SP2.

Patch ID: VRTSdbac-7.2.0.100-RHEL7

* 3915587 (Tracking ID: 3915585)

SYMPTOM:
Veritas Oracle Real Application Cluster does not work with Red Hat Enterprise
Linux 7 Update 3 and is unable to load the vcsmm module.

DESCRIPTION:
Veritas Oracle Real Application Cluster did not support RHEL versions later 
than RHEL7 Update 2.

RESOLUTION:
Veritas Oracle Real Application Cluster support for Red Hat Enterprise Linux 
7 Update 3 (RHEL7.3) is now introduced.

Patch ID: VRTSfsadv-7.2.0.100-RHEL7

* 3910704 (Tracking ID: 3729030)

SYMPTOM:
The fsdedupschd daemon failed to start on RHEL7.

DESCRIPTION:
The dedup service daemon failed to start because RHEL 7 changed the service
management mechanism. The daemon uses the new systemctl to start and stop the
service. For systemctl to properly start, stop, or query the service, it
needs a service definition file under /usr/lib/systemd/system.

RESOLUTION:
The code is modified to create the fsdedupschd.service file while installing 
the VRTSfsadv package.
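
Once the patched VRTSfsadv package is installed, the service can be managed
through systemctl in the usual way; for example (output omitted):

    # systemctl start fsdedupschd
    # systemctl status fsdedupschd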

Patch ID: VRTSvxfen-7.2.0.100-RHEL7

* 3913425 (Tracking ID: 3896877)

SYMPTOM:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7
Update 3 (RHEL7.3).

DESCRIPTION:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux
versions later than RHEL7 Update 2.

RESOLUTION:
Veritas InfoScale Availability support for Red Hat Enterprise Linux 7 Update 3
(RHEL7.3) is now introduced.

Patch ID: VRTSvxfs-7.2.0.100-RHEL7

* 3909937 (Tracking ID: 3908954)

SYMPTOM:
While performing vectored writes using writev(), where two iovec writes target
different offsets within the same 4 KB page-aligned range of a file, it is
possible to find null data at the beginning of the 4 KB range when reading the
data back.

DESCRIPTION:
While multiple processes are performing vectored writes to a file using
writev(), the following situation can occur:

We have 2 iovecs; the first is 448 bytes and the second is 30000 bytes. The
first iovec of 448 bytes completes, but the second iovec finds that the source
page is no longer in memory. As it cannot fault in the page during uiomove, it
has to undo both iovecs. It then faults the page back in and retries the
second iovec only. However, as the undo operation also undid the first iovec,
the first 448 bytes of the page are populated with nulls. When reading the
file back, it appears that no data was written for the first iovec. Hence, we
find nulls in the file.

RESOLUTION:
The code has been changed to correctly unwind multiple iovecs in scenarios
where some data has been written from one iovec and some from another.

* 3909938 (Tracking ID: 3905099)

SYMPTOM:
VxFS unmount panicked in deactivate_super(); the panic stack looks like the
following:

 #9 vx_fsnotify_flush [vxfs]
#10 vx_softcnt_flush [vxfs]
#11 vx_idrop  [vxfs]
#12 vx_detach_fset [vxfs]
#13 vx_unmount  [vxfs]
#14 generic_shutdown_super 
#15 kill_block_super
#16 vx_kill_sb
#17 amf_kill_sb
#18 deactivate_super
#19 mntput_no_expire
#20 sys_umount
#21 system_call_fastpath

DESCRIPTION:
A race is suspected between unmount and a user-space notifier install for the
root inode.

RESOLUTION:
Added diagnostic code and a defensive check for fsnotify_flush in
vx_softcnt_flush.

* 3909939 (Tracking ID: 3906548)

SYMPTOM:
Read-only file system errors are reported while loading drivers to start up
VxFS through systemd.

DESCRIPTION:
On SLES systems, VxFS start-up can be invoked by another systemd unit while
the root and local file systems are not yet mounted read/write; in such
cases, read-only file system errors can be reported during system start-up.

RESOLUTION:
Start up VxFS through systemd after the local file systems are mounted
read/write, and delay VxFS file system mounting by adding the "_netdev" mount
option in /etc/fstab.
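
For illustration, an /etc/fstab entry using the "_netdev" mount option might
look like the following (the device, mount point, and fsck pass number are
placeholders):

    /dev/vx/dsk/mydg/myvol  /mnt1  vxfs  _netdev  0  2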

* 3909940 (Tracking ID: 3894712)

SYMPTOM:
ACL permissions are not inherited correctly on cluster file system.

DESCRIPTION:
The ACL counts stored on a directory inode get reset every time the directory
inode's ownership is switched between the nodes. When ownership of the
directory inode comes back to the node which previously abdicated it, ACL
permissions were not inherited correctly for newly created files.

RESOLUTION:
Modified the source such that the ACLs are inherited correctly.

* 3909941 (Tracking ID: 3868609)

SYMPTOM:
High CPU usage seen because of vxfs thread.

DESCRIPTION:
To avoid memory deadlocks, and to track exiting threads with outstanding ODM
requests, we need to hook into the kernel's memory management. While
rescheduling happens for Oracle threads, they hold the mmap_sem on which FDD
threads keep waiting, causing contention and high CPU usage.

RESOLUTION:
Removed the bouncing of the spinlock between the CPUs, which reduces the CPU
spike.

* 3909943 (Tracking ID: 3729030)

SYMPTOM:
The fsdedupschd daemon failed to start on RHEL7.

DESCRIPTION:
The dedup service daemon failed to start because RHEL 7 changed the service 
management mechanism. The daemon uses the new systemctl to start and stop 
the service. For the systemctl to properly start, stop, or query the 
service, it needs a service definition file under the 
/usr/lib/systemd/system.

RESOLUTION:
The code is modified to create the fsdedupschd.service file while installing 
the VRTSfsadv package.

* 3909946 (Tracking ID: 3685391)

SYMPTOM:
Execute permissions for a file not honored correctly.

DESCRIPTION:
The user was able to execute the file even without having the execute permissions.

RESOLUTION:
The code is modified such that an error is reported when the execute permissions are not applied.

* 3910083 (Tracking ID: 3707662)

SYMPTOM:
Race between reorg processing and fsadm timer thread (alarm expiry) leads to panic in vx_reorg_emap with the following stack:

vx_iunlock
vx_reorg_iunlock_rct_reorg
vx_reorg_emap
vx_extmap_reorg
vx_reorg
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
fop_ioctl
ioctl

DESCRIPTION:
When the timer expires (fsadm with the -t option), vx_do_close() calls vx_reorg_clear() on the local mount, which performs cleanup on the reorg rct inode. Another thread currently active in vx_reorg_emap() will panic due to a null pointer dereference.

RESOLUTION:
When fop_close is called in alarm handler context, we defer the cleanup until the kernel thread performing the reorg completes its operation.

* 3910084 (Tracking ID: 3812330)

SYMPTOM:
slow ls -l across cluster nodes.

DESCRIPTION:
When we issue "ls -l" , VOP getattr is issued by which in VxFS, we 
update the necessary stats of an inode whose owner is some other node in the 
cluster. Ideally this update process should be done through asynchronous message 
passing mechanism which is not happening in this case.Instead the non-owner 
node, where we are issuing "ls -l", tries to pull strong ownership towards itself 
to update the inode stats.Hence a lot of time is consumed in this ping-pong of 
ownership.

RESOLUTION:
Strong ownership is no longer pulled for the inode when 'ls' is run from a
node that is not the current owner; the inode stats update is done through an
asynchronous message-passing mechanism, controlled by the module parameter
"vx_lazyisiz".
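
On Linux, module parameters are commonly exposed under /sys/module; assuming
the vxfs module exports "vx_lazyisiz" there (an assumption, not confirmed by
this readme), its current value could be inspected as follows:

    # cat /sys/module/vxfs/parameters/vx_lazyisiz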

* 3910085 (Tracking ID: 3779916)

SYMPTOM:
vxfsconvert fails to upgrade the layout version for a vxfs file system with a
large number of inodes. The error message shows some inode discrepancy.

DESCRIPTION:
vxfsconvert walks through the ilist and converts inodes. It stores chunks of
inodes in a buffer and processes them as a batch. The inode number parameter
for this inode buffer is of type unsigned integer. The offset of a particular
inode in the ilist is calculated by multiplying the inode number by the size
of the inode structure. For large inode numbers, the product inode_number *
inode_size can overflow the unsigned integer limit, giving a wrong offset
within the ilist file. vxfsconvert therefore reads the wrong inode and
eventually fails.

RESOLUTION:
The inode number parameter is defined as unsigned long to avoid 
overflow.
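
To see the overflow concretely, take a hypothetical inode number of 20,000,000
and assume a 256-byte inode structure: the true byte offset exceeds the 32-bit
unsigned limit (4,294,967,296) and wraps around, as this shell arithmetic
shows:

    # echo $(( 20000000 * 256 ))
    5120000000
    # echo $(( (20000000 * 256) & 0xFFFFFFFF ))
    825032704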

* 3910086 (Tracking ID: 3830300)

SYMPTOM:
Heavy CPU usage while Oracle archive processes are running on a clustered
file system.

DESCRIPTION:
The cause of the poor read performance in this case is fragmentation, which
mainly happens when there are multiple archivers running on the same node. The
allocation pattern of the Oracle archiver processes is:

1. write the header with O_SYNC
2. ftruncate-up the file to its final size (a few GBs typically)
3. do lio_listio with 1MB iocbs

The problem occurs because all the allocations done in this manner go through
internal allocations, i.e. allocations below the file size instead of
allocations past the file size. Internal allocations are done at most 8 pages
at a time. So if there are multiple processes doing this, they all get these
8 pages alternately and the file system becomes very fragmented.

RESOLUTION:
Added a tunable which allocates zfod extents when ftruncate tries to increase
the size of the file, instead of creating a hole. This eliminates the
allocations internal to the file size and thus the fragmentation. The earlier
implementation of the same fix, which ran into locking issues, has been
corrected. Also fixed a performance issue while writing from the secondary
node.

* 3910088 (Tracking ID: 3855726)

SYMPTOM:
Panic happens in vx_prot_unregister_all(). The stack looks like this:

- vx_prot_unregister_all
- vxportalclose
- __fput
- fput
- filp_close
- sys_close
- system_call_fastpath

DESCRIPTION:
The panic is caused by a NULL fileset pointer, which is due to referencing the
fileset before it is loaded; in addition, there is a race on the fileset
identity array.

RESOLUTION:
Skip the fileset if it's not loaded yet. Add the identity array lock to prevent
the possible race.

* 3910090 (Tracking ID: 3790721)

SYMPTOM:
High CPU usage in the vxfs thread process. The backtrace of such threads
usually looks like this:

schedule
schedule_timeout
__down
down
vx_send_bcastgetemapmsg_remaus
vx_send_bcastgetemapmsg
vx_recv_getemapmsg
vx_recvdele
vx_msg_recvreq
vx_msg_process_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
The locking mechanism in vx_send_bcastgetemapmsg_process() is inefficient:
every time it is called, it performs a series of down-up operations on a
certain semaphore. This can result in a huge CPU cost when multiple threads
contend on this semaphore.

RESOLUTION:
Optimized the locking mechanism in vx_send_bcastgetemapmsg_process() so that
it performs the down-up operation on the semaphore only once.

* 3910093 (Tracking ID: 1428611)

SYMPTOM:
The 'vxcompress' command can cause many GLM block lock messages to be sent
over the network. This can be observed in the 'glmstat -m' output under the
section "proxy recv", as shown in the example below:

bash-3.2# glmstat -m
         message     all      rw       g      pg       h     buf     oth    loop
master send:
           GRANT     194       0       0       0       2       0     192      98
          REVOKE     192       0       0       0       0       0     192      96
        subtotal     386       0       0       0       2       0     384     194

master recv:
            LOCK     193       0       0       0       2       0     191      98
         RELEASE     192       0       0       0       0       0     192      96
        subtotal     385       0       0       0       2       0     383     194

    master total     771       0       0       0       4       0     767     388

proxy send:
            LOCK      98       0       0       0       2       0      96      98
         RELEASE      96       0       0       0       0       0      96      96
      BLOCK_LOCK    2560       0       0       0       0    2560       0       0
   BLOCK_RELEASE    2560       0       0       0       0    2560       0       0
        subtotal    5314       0       0       0       2    5120     192     194

DESCRIPTION:
'vxcompress' creates placeholder inodes (called IFEMR inodes) to hold the
compressed data of files. After the compression is finished, IFEMR inodes
exchange their bmap with the original file and are later given to inactive
processing. Inactive processing truncates the IFEMR extents (the original
extents of the regular file, which is now compressed) by sending cluster-wide
buffer invalidation requests. These invalidations need the GLM block lock.
Regular file data need not be invalidated across the cluster, making these
GLM block lock requests unnecessary.

RESOLUTION:
Pertinent code has been modified to skip the invalidation for the 
IFEMR inodes created during compression.
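
To observe the effect of the change, one could compress a file on a CFS mount
and then check the GLM counters; a rough sketch, with the file path as a
placeholder:

    # vxcompress /mnt1/bigfile
    # glmstat -m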

* 3910094 (Tracking ID: 3879310)

SYMPTOM:
The file system may get corrupted after the file system freeze during 
vxupgrade. The full fsck gives the following errors:

UX:vxfs fsck: ERROR: V-3-20451: No valid device inodes found
UX:vxfs fsck: ERROR: V-3-20694: cannot initialize aggregate

DESCRIPTION:
vxupgrade requires the file system to be frozen during part of its operation.
It may happen that corruption is detected while the freeze is in progress and
the full fsck flag is set on the file system; however, this does not stop
vxupgrade from proceeding.
At a later stage of vxupgrade, after structures related to the new disk layout
are updated on the disk, vxfs frees up and zeroes out some of the old metadata
inodes. If any error occurs after this point (because of the full fsck flag
being set), the file system needs to go back completely to the previous
version at the time of the full fsck. Since the metadata corresponding to the
previous version is already cleared, the full fsck cannot proceed and gives
the error.

RESOLUTION:
The code is modified to check for the full fsck flag after freezing the file 
system during vxupgrade. Also, disable the file system if an error occurs after 
writing new metadata on the disk. This will force the newly written metadata to 
be loaded in memory on the next mount.

* 3910096 (Tracking ID: 3757609)

SYMPTOM:
High CPU usage because of contention over ODM_IO_LOCK

DESCRIPTION:
While performing ODM IO, we take ODM_IO_LOCK to update some of the ODM
counters, which leads to contention from multiple iodones trying to update
these counters at the same time. This results in high CPU usage.

RESOLUTION:
Code modified to remove the lock contention.

* 3910097 (Tracking ID: 3817734)

SYMPTOM:
If a file system with the full fsck flag set is mounted, a message containing
the exact command to clean the file system with full fsck is printed to the
user.

DESCRIPTION:
When mounting a file system with the full fsck flag set, the mount fails and
a message is printed asking the user to clean the file system with full fsck.
This message contains the exact command to run; if that command is run
without first collecting a file system metasave, evidence is lost. Also,
since fsck removes the file system inconsistencies, it may lead to undesired
data loss.

RESOLUTION:
A more generic message is now given in the error message instead of the exact
command.

* 3910098 (Tracking ID: 3861271)

SYMPTOM:
Due to the missing inode clear action, a page can be left in a strange state.
Also, the inode is not fully quiescent, which leads to races in the inode
code. Sometimes this can cause a panic from iput_final().

DESCRIPTION:
We're missing an inode clear operation when a Linux inode is being
de-initialized on SLES11.

RESOLUTION:
Add the inode clear operation on SLES11.

* 3910101 (Tracking ID: 3846521)

SYMPTOM:
cp -p fails with EINVAL for files with a 10-digit modification time. The
EINVAL error is returned if the value in the tv_nsec field is outside the
range of 0 to 999,999,999. VxFS supports the update in usec, but when copying
in user space, the usec is converted to nsec. In this case, the usec has
crossed its upper boundary limit, i.e. 999,999.

DESCRIPTION:
In a cluster, it is possible that the time differs across nodes. When
updating mtime, vxfs checks whether it is a cluster inode and whether another
node's mtime is newer than the current node's time; if so, it increments
tv_usec instead of changing mtime to an older time value. There is a chance
that the tv_usec counter overflows here, which results in a 10-digit
mtime.tv_nsec.

RESOLUTION:
The code is modified to reset the usec counter for mtime/atime/ctime when the
upper boundary limit, i.e. 999,999, is reached.

* 3910103 (Tracking ID: 3817734)

SYMPTOM:
If a file system with the full fsck flag set is mounted, a message containing
the exact command to clean the file system with full fsck is printed to the
user.

DESCRIPTION:
When mounting a file system with the full fsck flag set, the mount fails and
a message is printed asking the user to clean the file system with full fsck.
This message contains the exact command to run; if that command is run
without first collecting a file system metasave, evidence is lost. Also,
since fsck removes the file system inconsistencies, it may lead to undesired
data loss.

RESOLUTION:
A more generic message is now given in the error message instead of the exact
command.

* 3910105 (Tracking ID: 3907902)

SYMPTOM:
System panic observed due to a race between the dalloc off thread and the
getattr thread.

DESCRIPTION:
With the 7.2 release of VxFS, dalloc states are stored in a new structure.
When getting the attributes of a file, dalloc blocks are calculated and
stored in this new structure. If a dalloc off thread races with a getattr
thread, there is a possibility of the getattr thread dereferencing a NULL
dalloc structure.

RESOLUTION:
Code changes have been made to take the appropriate dalloc lock while
calculating dalloc blocks in the getattr function to avoid the race.

* 3910356 (Tracking ID: 3908785)

SYMPTOM:
System panic observed because of a null page address in the writeback
structure in the case of the kswapd process.

DESCRIPTION:
The secfs2/encryptfs layers used the write VOP as a hook when kswapd is
triggered to free a page. Ideally kswapd should call the writepage() routine,
where the writeback structures are correctly filled. When the write VOP is
called because of the secfs2/encryptfs hook, the writeback structures are
cleared, resulting in a null page address.

RESOLUTION:
Code changes have been made to call the VxFS kswapd routine only if a valid
page address is present.

* 3911290 (Tracking ID: 3910526)

SYMPTOM:
In case of a full file system, an fsadm resize operation to increase the file
system size may fail with the following error:
Attempt to resize <volume-name> failed with errno 28

DESCRIPTION:
If there is no space available in the file system and a resize operation is
initiated, intent log extents are used for the metadata setup to continue the
resize operation. If the resize is successful, we update the superblock with
the new size and try to reallocate a new intent log inode of the same size.
If the resize size is smaller than the log inode size, we hit ENOSPC when
reallocating the intent log inode extents. Due to this failure, the resize
command fails and shrinks the volume back to its original size, but the
superblock continues to hold the new file system size.

RESOLUTION:
The code has been modified to fail the resize operation if the resize size is
less than the intent log inode size.
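
For reference, a grow operation of this kind is typically issued as in the
sketch below (the new size in sectors and the mount point are placeholders);
with the fix, the command now fails cleanly when the requested size cannot
accommodate the intent log:

    # /opt/VRTS/bin/fsadm -b 83886080 /mnt1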

* 3911718 (Tracking ID: 3905576)

SYMPTOM:
Cluster file system hangs. On one node, all worker threads are blocked due to
a file system freeze, and another thread is blocked with a stack like this:

- __schedule
- schedule
- vx_svar_sleep_unlock
- vx_event_wait
- vx_olt_iauinit
- vx_olt_iasinit
- vx_loadv23_fsetinit
- vx_loadv23_fset
- vx_fset_reget
- vx_fs_reinit
- vx_recv_cwfa
- vx_msg_process_thread
- vx_kthread_init

DESCRIPTION:
The frozen CFS won't thaw because the mentioned thread is waiting for a work item
to be processed in vx_olt_iauinit(). Since all the worker threads are blocked,
there is no free thread to process this work item.

RESOLUTION:
Change the code in vx_olt_iauinit(), so that the work item will be processed even
with all worker threads blocked.

* 3911719 (Tracking ID: 3909583)

SYMPTOM:
Disable partitioning of directory if directory size is greater than upper threshold value.

DESCRIPTION:
If PD (partitioned directories) is enabled during mount, the mount may take a
long time to complete because it tries to partition all the directories,
making it appear hung. To avoid such hangs, a new upper threshold value for
PD is added, which disables partitioning of a directory whose size is above
that value.

RESOLUTION:
Code is modified to disable partitioning of a directory if the directory size
is greater than the upper threshold value.

* 3911732 (Tracking ID: 3896670)

SYMPTOM:
Intermittent CFS hang-like situation with many CFS pglock grant messages
pending on the LLT layer.

DESCRIPTION:
To optimize CFS locking, VxFS may send greedy pglock grant messages to speed
up upcoming write operations. In certain scenarios, created by particular
read and write patterns across nodes, one node can send these greedy messages
far faster than the responses arrive. This can build up a lot of messages at
the CFS layer, delay the responses to other messages, and cause slowness in
CFS operations.

RESOLUTION:
The fix is to send the next greedy message only after receiving the response
to the previous one. This way, at a given time there will be only one greedy
pglock message in flight.

* 3911926 (Tracking ID: 3901318)

SYMPTOM:
VxFS module failed to load on RHEL7.3.

DESCRIPTION:
RHEL7.3 is a new release, and the VxFS module did not yet support it;
therefore the module failed to load.

RESOLUTION:
Added VxFS support for RHEL7.3.

* 3911968 (Tracking ID: 3911966)

SYMPTOM:
VxFS module failed to load on SLES12 SP2.

DESCRIPTION:
SLES12 SP2 is a new release, and the VxFS module did not yet support it;
therefore the module failed to load.

RESOLUTION:
Added VxFS support for SLES12 SP2.

* 3912988 (Tracking ID: 3912315)

SYMPTOM:
EMAP corruption while freeing up the extent.
Feb  4 15:10:45 localhost kernel: vxfs: msgcnt 2 mesg 056: V-2-56: vx_mapbad - 
vx_smap_stateupd - file system extent allocation unit state bitmap number 0 
marked bad

Feb  4 15:10:45 localhost kernel: Call Trace:
vx_setfsflags+0x103/0x140 [vxfs]
vx_mapbad+0x74/0x2d0 [vxfs]
vx_smap_stateupd+0x113/0x130 [vxfs]
vx_extmapupd+0x552/0x580 [vxfs]
vx_alloc+0x3d6/0xd10 [vxfs]
vx_extsub+0x0/0x5f0 [vxfs]
vx_semapclone+0xe1/0x190 [vxfs]
vx_clonemap+0x14d/0x230 [vxfs]
vx_unlockmap+0x299/0x330 [vxfs]
vx_smap_dirtymap+0xea/0x120 [vxfs]
vx_do_extfree+0x2b8/0x2e0 [vxfs]
vx_extfree1+0x22e/0x7c0 [vxfs]
vx_extfree+0x9f/0xd0 [vxfs]
vx_exttrunc+0x10d/0x2a0 [vxfs]
vx_trunc_ext4+0x65f/0x7a0 [vxfs]
vx_validate_ext4+0xcc/0x1a0 [vxfs]
vx_trunc_tran2+0xb7f/0x1450 [vxfs]
vx_trunc_tran+0x18f/0x1e0 [vxfs]
vx_trunc+0x66a/0x890 [vxfs]
vx_iflush_list+0xaee/0xba0 [vxfs]
vx_iflush+0x67/0x80 [vxfs]
vx_workitem_process+0x24/0x50 [vxfs]

DESCRIPTION:
The issue occurs due to wrongly validating the smap and changing the AU state
from free to allocated.

RESOLUTION:
Skip the validation and change the EAU state only if it is an SMAP update.

* 3912989 (Tracking ID: 3912322)

SYMPTOM:
The vxfs tunable max_seqio_extent_size cannot be tuned to any 
value less than 32768.

DESCRIPTION:
In VxFS 7.1, the default for max_seqio_extent_size was changed from 2048 to
32768. Due to a bug, the tunable could not be set to any value less than
32768.

RESOLUTION:
The fix allows the tunable to be set to any value >= 2048.
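
With the fix in place, the tunable can be set at run time through vxtunefs; a
minimal sketch, with the mount point as a placeholder:

    # vxtunefs -o max_seqio_extent_size=2048 /mnt1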

* 3912990 (Tracking ID: 3912407)

SYMPTOM:
CFS hang on VxFS 7.2 while a thread on a CFS node waits for EAU delegation:

__schedule at ffffffff8163b46d
schedule at ffffffff8163bb09
vx_svar_sleep_unlock at ffffffffa0a99eb5 [vxfs]
vx_event_wait at ffffffffa0a9a68f [vxfs]
vx_async_waitmsg at ffffffffa09aa680 [vxfs]
vx_msg_send at ffffffffa09aa829 [vxfs]
vx_get_dele at ffffffffa09d5be1 [vxfs]
vx_extentalloc_device at ffffffffa0923bb0 [vxfs]
vx_extentalloc at ffffffffa0925272 [vxfs]
vx_bmap_ext4 at ffffffffa0953755 [vxfs]
vx_bmap_alloc_ext4 at ffffffffa0953e14 [vxfs]
vx_bmap_alloc at ffffffffa0950a2a [vxfs]
vx_write_alloc3 at ffffffffa09c281e [vxfs]
vx_tran_write_alloc at ffffffffa09c3321 [vxfs]
vx_cfs_prealloc at ffffffffa09b4220 [vxfs]
vx_write_alloc2 at ffffffffa09c25ad [vxfs]
vx_write_alloc at ffffffffa0b708fb [vxfs]
vx_write1 at ffffffffa0b712ff [vxfs]
vx_write_common_slow at ffffffffa0b7268c [vxfs]
vx_write_common at ffffffffa0b73857 [vxfs]
vx_write at ffffffffa0af04a6 [vxfs]
vfs_write at ffffffff811ded3d
sys_write at ffffffff811df7df
system_call_fastpath at ffffffff81646b09

And a corresponding delegation receiver thread should be seen looping on CFS 
primary:
 
PID: 18958  TASK: ffff88006c776780  CPU: 0   COMMAND: "vx_msg_thread"
__schedule at ffffffff8163a26d
mutex_lock at ffffffff81638b42
vx_emap_lookup at ffffffffa0ecf0eb [vxfs]
vx_extchkmaps at ffffffffa0e9c7b4 [vxfs]
vx_searchau_downlevel at ffffffffa0ea0923 [vxfs]
vx_searchau at ffffffffa0ea0e22 [vxfs]
vx_dele_get_freespace at ffffffffa0f53b6d [vxfs]
vx_getedele_size at ffffffffa0f54c4b [vxfs]
vx_pri_getdele at ffffffffa0f54edc [vxfs]
vx_recv_getdele at ffffffffa0f5690d [vxfs]
vx_recvdele at ffffffffa0f59800 [vxfs]
vx_msg_process_thread at ffffffffa0f2aead [vxfs]
vx_kthread_init at ffffffffa105cba4 [vxfs]
kthread at ffffffff810a5aef
ret_from_fork at ffffffff81645858

DESCRIPTION:
In the VxFS 7.2 release, a performance optimization was made in the way the
allocation state map is updated. Prior to 7.2 the updates were done
synchronously; in 7.2 they were made transactional. The hang happens when an
EAU needs to be converted back to the FREE state after all the allocations
from it are freed. In that case, if the corresponding EAU delegation times
out before the process can complete, it can result in an inconsistent state.
Because of this inconsistency, when a node later tries to get the delegation
of this EAU, the primary may loop forever, leaving the secondary waiting
infinitely for the AU delegation.

RESOLUTION:
The code has been fixed so that the delegation cannot time out until the free
processing is complete.

* 3913004 (Tracking ID: 3911048)

SYMPTOM:
The LDH bucket validation failure message is logged and the system hangs.

DESCRIPTION:
When modifying a large directory, vxfs needs to find a new bucket in the LDH
for this directory, and once a bucket is full, it is split to get more
buckets to use. When a bucket has been split the maximum number of times, an
overflow bucket is allocated. Under some conditions, the lookup for an
available bucket in the overflow bucket may get an incorrect result and
overwrite an existing bucket entry, corrupting the LDH file. Another problem
is that when the bucket invalidation fails, the bucket buffer is released
without checking whether the buffer is already part of a previous
transaction; this may cause the transaction flush thread to hang and finally
stall the whole file system.

RESOLUTION:
The LDH bucket entry change code is corrected to avoid the corruption. The
bucket buffer is now released without throwing it out of memory, to avoid
blocking the transaction flush.

* 3914384 (Tracking ID: 3915578)

SYMPTOM:
vxfs module fails to load after reboot, with this dmesg log:
[    3.689385] systemd-sysv-generator[518]: Overwriting existing symlink
/run/systemd/generator.late/vxfs.service with real service.
[    4.738143] vxfs: Unknown symbol ki_get_boot (err 0)
[    4.793301] vxfs: Unknown symbol ki_get_boot (err 0)

DESCRIPTION:
The vxfs module is dependent on the veki module. At boot time, veki is not
loaded before vxfs, so the vxfs module fails to load.

RESOLUTION:
During boot, veki is now started before the vxfs module is loaded.
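
If the modules ever need to be loaded by hand, the same ordering applies; a
minimal sketch, assuming the module names veki and vxfs:

    # modprobe veki
    # modprobe vxfs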

Patch ID: VRTSamf-7.2.0.100-RHEL7

* 3908392 (Tracking ID: 3896877)

SYMPTOM:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7
Update 3 (RHEL7.3).

DESCRIPTION:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux
versions later than RHEL7 Update 2.

RESOLUTION:
Veritas InfoScale Availability support for Red Hat Enterprise Linux 7 Update 3
(RHEL7.3) is now introduced.

Patch ID: VRTSgab-7.2.0.100-RHEL7

* 3913424 (Tracking ID: 3896877)

SYMPTOM:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7
Update 3 (RHEL7.3).

DESCRIPTION:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux
versions later than RHEL7 Update 2.

RESOLUTION:
Veritas InfoScale Availability support for Red Hat Enterprise Linux 7 Update 3
(RHEL7.3) is now introduced.

Patch ID: VRTSllt-7.2.0.200-RHEL7

* 3913423 (Tracking ID: 3896877)

SYMPTOM:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux 7
Update 3 (RHEL7.3).

DESCRIPTION:
Veritas InfoScale Availability does not support Red Hat Enterprise Linux
versions later than RHEL7 Update 2.

RESOLUTION:
Veritas InfoScale Availability support for Red Hat Enterprise Linux 7 Update 3
(RHEL7.3) is now introduced.

Patch ID: VRTSllt-7.2.0.100-RHEL7

* 3906300 (Tracking ID: 3905430)

SYMPTOM:
Application IO hangs in case of FSS with LLT over RDMA during heavy data transfer.

DESCRIPTION:
In case of FSS using LLT over RDMA, IO may sometimes hang because of race
conditions in the LLT code.

RESOLUTION:
LLT module is modified to fix the race conditions arising due to heavy load with multiple 
application threads.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-rhel7.3_x86_64-Patch-7.2.0.100.tar.gz to /tmp
2. Untar infoscale-rhel7.3_x86_64-Patch-7.2.0.100.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/infoscale-rhel7.3_x86_64-Patch-7.2.0.100.tar.gz
    # tar xf /tmp/infoscale-rhel7.3_x86_64-Patch-7.2.0.100.tar
3. Install the hotfix. (Please note that the installation of this P-Patch will cause downtime.)
    # pwd
    /tmp/hf
    # ./installVRTSinfoscale720P100 [<host1> <host2>...]

You can also install this patch together with the 7.2 base release using Install Bundles:
1. Download this patch and extract it to a directory.
2. Change to the Veritas InfoScale 7.2 directory and invoke the installer script
   with the -patch_path option, where -patch_path points to the patch directory:
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
Manual installation is not supported.


REMOVING THE PATCH
------------------
Manual uninstallation is not supported.


KNOWN ISSUES
------------
* Tracking ID: 3910089

SYMPTOM: DB2 threads (db2sysc) hang, as can be seen from the crash dump:
Many of them get stuck in vx_dio_physio():
 - schedule
 - rwsem_down_failed_common
 - call_rwsem_down_read_failed
 - down_read
 - vx_dio_physio

And many of them get stuck in vx_rwlock():
 - schedule
 - rwsem_down_failed_common
 - call_rwsem_down_read_failed
 - down_read
 - vx_rwlock

WORKAROUND: None

* Tracking ID: 3910100

SYMPTOM: Oracle database start failure, with trace log like this:

ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 304 (block # 722821)
ORA-01110: data file 304: <file_name>
ORA-17500: ODM err:ODM ERROR V-41-4-2-231-28 No space left on device

WORKAROUND: None



SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE