This page lists publicly released patches for Veritas Enterprise Products.
For the product GA build, see the Veritas Entitlement Management System (VEMS) by clicking the 'Licensing' option on Veritas Support.
For information on private patches, contact Veritas Technical Support.
For NetBackup Enterprise Server and NetBackup Server patches, see the NetBackup Downloads.
Patches for your product can have a variety of names. These names are based on product, component, or package names. For more information on patch naming conventions and the relationship between products, components, and packages, see the SORT online help.
fs-rhel7_x86_64-Patch-7.2.0.100
Obsolete

 Basic information
Release type: Patch
Release date: 2017-04-24
OS update support: None
Technote: None
Documentation: None
Popularity: 798 viewed    40 downloaded
Download size: 8.85 MB
Checksum: 1521891090

 Applies to one or more of the following products:
InfoScale Enterprise 7.2 On RHEL7 x86-64
InfoScale Foundation 7.2 On RHEL7 x86-64
InfoScale Storage 7.2 On RHEL7 x86-64

 Obsolete patches, incompatibilities, superseded patches, or other requirements:

This patch is obsolete. It is superseded by the following patches:
infoscale-rhel7_x86_64-Patch-7.2.0.300 (released 2018-06-18)
infoscale-rhel7.4_x86_64-Patch-7.2.0.200 (obsolete; released 2017-09-17)
infoscale-rhel7.3_x86_64-Patch-7.2.0.100 (released 2017-04-18)

 Fixes the following incidents:
3909937, 3909938, 3909939, 3909940, 3909941, 3909943, 3909946, 3910083, 3910084, 3910085, 3910086, 3910088, 3910090, 3910093, 3910094, 3910096, 3910097, 3910098, 3910101, 3910103, 3910105, 3910356, 3911290, 3911718, 3911719, 3911732, 3911926, 3911968, 3912988, 3912989, 3912990, 3913004, 3914384

 Patch ID:
VRTSvxfs-7.2.0.100-RHEL7

 Readme file
                          * * * READ ME * * *
                  * * * Veritas File System 7.2 * * *
                         * * * Patch 100 * * *
                         Patch Date: 2017-04-19


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
Veritas File System 7.2 Patch 100


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL7 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas InfoScale Foundation 7.2
   * Veritas InfoScale Storage 7.2
   * Veritas InfoScale Enterprise 7.2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 7.2.0.100
* 3909937 (3908954) Some writes could be missed, causing data loss.
* 3909938 (3905099) VxFS unmount panicked in deactivate_super().
* 3909939 (3906548) Start up VxFS through systemd after local file systems are mounted read/write.
* 3909940 (3894712) ACL permissions are not inherited correctly on cluster file systems.
* 3909941 (3868609) High CPU usage by a VxFS thread while applying Oracle redo logs.
* 3909943 (3729030) The fsdedupschd daemon failed to start on RHEL7.
* 3909946 (3685391) Execute permissions for a file not honored correctly.
* 3910083 (3707662) Race between reorg processing and the fsadm timer thread (alarm expiry) leads to a panic in vx_reorg_emap.
* 3910084 (3812330) Slow 'ls -l' across cluster nodes.
* 3910085 (3779916) vxfsconvert fails to upgrade the layout version for a VxFS file system with a large number of inodes.
* 3910086 (3830300) Degraded CPU performance during backup of Oracle archive logs on CFS compared to a local file system.
* 3910088 (3855726) Panic in vx_prot_unregister_all().
* 3910090 (3790721) High CPU usage caused by vx_send_bcastgetemapmsg_remaus.
* 3910093 (1428611) 'vxcompress' can spew many GLM block lock messages over the LLT network.
* 3910094 (3879310) The file system may get corrupted after a failed vxupgrade.
* 3910096 (3757609) High CPU usage because of contention over ODM_IO_LOCK.
* 3910097 (3817734) A direct command to run fsck with the -y|Y option was mentioned in the message displayed when a file system mount fails.
* 3910098 (3861271) Missing inode clear operation when a Linux inode is being de-initialized on SLES11.
* 3910101 (3846521) "cp -p" fails if the modification time in nanoseconds has 10 digits.
* 3910103 (3817734) A direct command to run fsck with the -y|Y option was mentioned in the message displayed when a file system mount fails.
* 3910105 (3907902) System panic observed due to a race between the dalloc off thread and the getattr thread.
* 3910356 (3908785) System panic observed because of a null page address in the writeback structure in the kswapd process.
* 3911290 (3910526) fsadm fails with error number 28 during a resize operation.
* 3911718 (3905576) CFS hang during a cluster-wide freeze.
* 3911719 (3909583) Disable partitioning of a directory if the directory size is greater than the upper threshold value.
* 3911732 (3896670) Intermittent CFS hang-like situation with many CFS pglock grant messages pending on the LLT layer.
* 3911926 (3901318) VxFS module failed to load on RHEL7.3.
* 3911968 (3911966) VxFS module failed to load on SLES12 SP2.
* 3912988 (3912315) EMAP corruption while freeing up an extent.
* 3912989 (3912322) The VxFS tunable max_seqio_extent_size cannot be tuned to any value less than 32768.
* 3912990 (3912407) CFS hang on VxFS 7.2 while a thread on a CFS node waits for EAU delegation.
* 3913004 (3911048) LDH corruption and file system hang.
* 3914384 (3915578) VxFS module fails to load after reboot.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: 7.2.0.100

* 3909937 (Tracking ID: 3908954)

SYMPTOM:
While performing vectored writes using writev(), where two iovec writes target different offsets within the same 4 KB page-aligned range of a file, it is possible to find null data at the beginning of the 4 KB range when reading the data back.

DESCRIPTION:
While multiple processes are performing vectored writes to a file using writev(), the following situation can occur:

There are two iovecs; the first is 448 bytes and the second is 30000 bytes. The first iovec of 448 bytes completes, but the second iovec finds that the source page is no longer in memory. As it cannot fault the page in during uiomove, it has to undo both iovecs. It then faults the page back in and retries the second iovec only. However, because the undo operation also undid the first iovec, the first 448 bytes of the page are populated with nulls. When reading the file back, it appears that no data was written for the first iovec; hence, nulls are found in the file.

RESOLUTION:
The code has been changed to unwind multiple iovecs correctly in scenarios where some data has been written from one iovec and some from another.

* 3909938 (Tracking ID: 3905099)

SYMPTOM:
VxFS unmount panicked in deactivate_super(); the panic stack looks like the following:

 #9 vx_fsnotify_flush [vxfs]
#10 vx_softcnt_flush [vxfs]
#11 vx_idrop  [vxfs]
#12 vx_detach_fset [vxfs]
#13 vx_unmount  [vxfs]
#14 generic_shutdown_super 
#15 kill_block_super
#16 vx_kill_sb
#17 amf_kill_sb
#18 deactivate_super
#19 mntput_no_expire
#20 sys_umount
#21 system_call_fastpath

DESCRIPTION:
A race is suspected between unmount and a user-space notifier install for the root inode.

RESOLUTION:
Added diagnostic code and a defensive check for fsnotify_flush in vx_softcnt_flush.

* 3909939 (Tracking ID: 3906548)

SYMPTOM:
Read-only file system errors reported while loading drivers to start up vxfs 
through systemd.

DESCRIPTION:
On SLES systems, VxFS startup could be invoked by another systemd unit while the root and local file systems were not yet mounted read/write; in that case, "Read-only file system" errors could be reported during system startup.

RESOLUTION:
Start up VxFS through systemd after the local file systems are mounted read/write, and delay VxFS file system mounting by adding the "_netdev" mount option in /etc/fstab.

* 3909940 (Tracking ID: 3894712)

SYMPTOM:
ACL permissions are not inherited correctly on cluster file system.

DESCRIPTION:
The ACL counts stored on a directory inode get reset every time the directory inode's ownership is switched between nodes. When ownership of the directory inode comes back to the node that previously abdicated it, ACL permissions were not inherited correctly for newly created files.

RESOLUTION:
Modified the source such that the ACLs are inherited correctly.

* 3909941 (Tracking ID: 3868609)

SYMPTOM:
High CPU usage seen because of vxfs thread.

DESCRIPTION:
To avoid memory deadlocks, and to track exiting threads with outstanding ODM requests, VxFS hooks into the kernel's memory management. While rescheduling happens for Oracle threads, they hold mmap_sem, on which FDD threads keep waiting, causing contention and high CPU usage.

RESOLUTION:
Removed the bouncing of the spinlock between the CPUs, reducing the CPU spike.

* 3909943 (Tracking ID: 3729030)

SYMPTOM:
The fsdedupschd daemon failed to start on RHEL7.

DESCRIPTION:
The dedup service daemon failed to start because RHEL 7 changed the service management mechanism. The daemon uses the new systemctl to start and stop the service. For systemctl to properly start, stop, or query the service, it needs a service definition file under /usr/lib/systemd/system.

RESOLUTION:
The code is modified to create the fsdedupschd.service file while installing 
the VRTSfsadv package.
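
For illustration only, a minimal service definition file of the kind described above might look like the following sketch. The unit contents and the daemon path are assumptions made for this example, not the actual file shipped by the VRTSfsadv package:

```
[Unit]
Description=VxFS deduplication scheduler (illustrative sketch)
After=local-fs.target

[Service]
Type=simple
# Hypothetical daemon path, for illustration only:
ExecStart=/opt/VRTS/bin/fsdedupschd

[Install]
WantedBy=multi-user.target
```

With such a file in place under /usr/lib/systemd/system, systemctl can start, stop, and query the service in the usual way.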

* 3909946 (Tracking ID: 3685391)

SYMPTOM:
Execute permissions for a file not honored correctly.

DESCRIPTION:
A user was able to execute a file despite not having execute permissions on it.

RESOLUTION:
The code is modified so that an error is reported when execute permissions are not granted.

* 3910083 (Tracking ID: 3707662)

SYMPTOM:
Race between reorg processing and the fsadm timer thread (alarm expiry) leads to a panic in vx_reorg_emap with the following stack:

vx_iunlock
vx_reorg_iunlock_rct_reorg
vx_reorg_emap
vx_extmap_reorg
vx_reorg
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
fop_ioctl
ioctl

DESCRIPTION:
When the timer expires (fsadm with -t option), vx_do_close() calls vx_reorg_clear() on local mount which performs cleanup on reorg rct inode. Another thread currently active in vx_reorg_emap() will panic due to null pointer dereference.

RESOLUTION:
When fop_close is called in the alarm handler context, the cleanup is deferred until the kernel thread performing the reorg completes its operation.

* 3910084 (Tracking ID: 3812330)

SYMPTOM:
Slow 'ls -l' across cluster nodes.

DESCRIPTION:
When "ls -l" is issued, a getattr VOP is issued, through which VxFS updates the stats of an inode whose owner is some other node in the cluster. Ideally this update should be done through an asynchronous message-passing mechanism, which was not happening in this case. Instead, the non-owner node where "ls -l" is issued tries to pull strong ownership toward itself to update the inode stats, so a lot of time is consumed in this ping-pong of ownership.

RESOLUTION:
Avoid pulling strong ownership for the inode when doing "ls" from a node that is not the current owner, and update the inode stats through an asynchronous message-passing mechanism, controlled by the module parameter "vx_lazyisiz".

* 3910085 (Tracking ID: 3779916)

SYMPTOM:
vxfsconvert fails to upgrade the layout version for a VxFS file system with a large number of inodes. The error message shows an inode discrepancy.

DESCRIPTION:
vxfsconvert walks through the ilist and converts inodes. It stores chunks of inodes in a buffer and processes them as a batch. The inode number parameter for this inode buffer is of type unsigned integer. The offset of a particular inode in the ilist is calculated by multiplying the inode number by the size of the inode structure. For large inode numbers, the product inode_number * inode_size can overflow the unsigned integer limit, giving a wrong offset within the ilist file. vxfsconvert therefore reads the wrong inode and eventually fails.

RESOLUTION:
The inode number parameter is defined as unsigned long to avoid the overflow.

* 3910086 (Tracking ID: 3830300)

SYMPTOM:
Heavy CPU usage while Oracle archive processes are running on a clustered file system.

DESCRIPTION:
The poor read performance in this case was due to fragmentation, which mainly happens when multiple archivers run on the same node. The allocation pattern of the Oracle archiver processes is:

1. Write the header with O_SYNC.
2. ftruncate the file up to its final size (a few GBs, typically).
3. Do lio_listio with 1 MB iocbs.

The problem occurs because all allocations done in this manner go through internal allocations, i.e., allocations below the file size instead of allocations past the file size. Internal allocations are done at most 8 pages at a time, so if multiple processes do this, they each get these 8 pages alternately and the file system becomes very fragmented.

RESOLUTION:
Added a tunable which allocates ZFOD extents when ftruncate increases the size of the file, instead of creating a hole. This eliminates the allocations internal to the file size and thus the fragmentation. Also fixed an earlier implementation of the same fix, which ran into locking issues, and fixed a performance issue when writing from a secondary node.

* 3910088 (Tracking ID: 3855726)

SYMPTOM:
Panic happens in vx_prot_unregister_all(). The stack looks like this:

- vx_prot_unregister_all
- vxportalclose
- __fput
- fput
- filp_close
- sys_close
- system_call_fastpath

DESCRIPTION:
The panic is caused by a NULL fileset pointer, due to referencing the fileset before it is loaded; in addition, there is a race on the fileset identity array.

RESOLUTION:
Skip the fileset if it's not loaded yet. Add the identity array lock to prevent
the possible race.

* 3910090 (Tracking ID: 3790721)

SYMPTOM:
High CPU usage in a vxfs thread process. The backtrace of such threads usually looks like this:

schedule
schedule_timeout
__down
down
vx_send_bcastgetemapmsg_remaus
vx_send_bcastgetemapmsg
vx_recv_getemapmsg
vx_recvdele
vx_msg_recvreq
vx_msg_process_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
The locking mechanism in vx_send_bcastgetemapmsg_process() is inefficient: every time it is called, it performs a series of down-up operations on a certain semaphore. This results in a large CPU cost when multiple threads contend on the semaphore.

RESOLUTION:
Optimized the locking mechanism in vx_send_bcastgetemapmsg_process() so that it performs the down-up operation on the semaphore only once.

* 3910093 (Tracking ID: 1428611)

SYMPTOM:
'vxcompress' command can cause many GLM block lock messages to be sent over the network. This can be observed in 'glmstat -m' output under the section "proxy send", as shown in the example below:

bash-3.2# glmstat -m
         message     all      rw       g      pg       h     buf     oth    loop
master send:
           GRANT     194       0       0       0       2       0     192      98
          REVOKE     192       0       0       0       0       0     192      96
        subtotal     386       0       0       0       2       0     384     194

master recv:
            LOCK     193       0       0       0       2       0     191      98
         RELEASE     192       0       0       0       0       0     192      96
        subtotal     385       0       0       0       2       0     383     194

    master total     771       0       0       0       4       0     767     388

proxy send:
            LOCK      98       0       0       0       2       0      96      98
         RELEASE      96       0       0       0       0       0      96      96
      BLOCK_LOCK    2560       0       0       0       0    2560       0       0
   BLOCK_RELEASE    2560       0       0       0       0    2560       0       0
        subtotal    5314       0       0       0       2    5120     192     194

DESCRIPTION:
'vxcompress' creates placeholder inodes (called IFEMR inodes) to hold the compressed data of files. After the compression is finished, IFEMR inodes exchange their bmap with the original file and are later given to inactive processing. Inactive processing truncates the IFEMR extents (the original extents of the regular file, which is now compressed) by sending cluster-wide buffer invalidation requests. These invalidations need a GLM block lock. Regular file data need not be invalidated across the cluster, making these GLM block lock requests unnecessary.

RESOLUTION:
Pertinent code has been modified to skip the invalidation for the 
IFEMR inodes created during compression.

* 3910094 (Tracking ID: 3879310)

SYMPTOM:
The file system may get corrupted after the file system freeze during 
vxupgrade. The full fsck gives the following errors:

UX:vxfs fsck: ERROR: V-3-20451: No valid device inodes found
UX:vxfs fsck: ERROR: V-3-20694: cannot initialize aggregate

DESCRIPTION:
vxupgrade requires the file system to be frozen during its operation. It may happen that corruption is detected while the freeze is in progress and the full fsck flag is set on the file system; however, this does not stop vxupgrade from proceeding.
At a later stage of vxupgrade, after structures related to the new disk layout are updated on disk, VxFS frees up and zeroes out some of the old metadata inodes. If any error occurs after this point (because the full fsck flag is set), the file system needs to go back completely to the previous version at the time of the full fsck. Since the metadata corresponding to the previous version is already cleared, the full fsck cannot proceed and gives the error.

RESOLUTION:
The code is modified to check for the full fsck flag after freezing the file 
system during vxupgrade. Also, disable the file system if an error occurs after 
writing new metadata on the disk. This will force the newly written metadata to 
be loaded in memory on the next mount.

* 3910096 (Tracking ID: 3757609)

SYMPTOM:
High CPU usage because of contention over ODM_IO_LOCK

DESCRIPTION:
While performing ODM I/O, ODM_IO_LOCK is taken to update some of the ODM counters, which leads to contention when multiple iodones try to update these counters at the same time. This results in high CPU usage.

RESOLUTION:
Code modified to remove the lock contention.

* 3910097 (Tracking ID: 3817734)

SYMPTOM:
If a file system with the full fsck flag set is mounted, a message containing a direct command to clean the file system with full fsck is printed to the user.

DESCRIPTION:
When mounting a file system with the full fsck flag set, the mount fails and a message is printed advising to clean the file system with full fsck. This message contains a direct command to run which, if run without first collecting a file system metasave, results in evidence being lost. Also, since fsck removes the file system inconsistencies, it may lead to undesired data being lost.

RESOLUTION:
A more generic error message is given instead of a direct command.

* 3910098 (Tracking ID: 3861271)

SYMPTOM:
Due to the missing inode clear operation, a page can be left in a strange state. Also, the inode is not fully quiescent, which leads to races in the inode code. Sometimes this can cause a panic from iput_final().

DESCRIPTION:
We're missing an inode clear operation when a Linux inode is being
de-initialized on SLES11.

RESOLUTION:
Add the inode clear operation on SLES11.

* 3910101 (Tracking ID: 3846521)

SYMPTOM:
"cp -p" fails with EINVAL for files with a 10-digit modification time. The EINVAL error is returned if the value in the tv_nsec field is outside the range 0 to 999,999,999. VxFS supports the update in usec, but when copying in user space, the usec value is converted to nsec. In this case, usec has crossed its upper boundary limit, i.e., 999,999.

DESCRIPTION:
In a cluster, it is possible that time differs across nodes. When updating mtime, VxFS checks whether it is a cluster inode, and if a node's mtime is newer than the current node's time, it increments tv_usec instead of changing mtime to an older value. There is a chance that the tv_usec counter overflows here, resulting in a 10-digit mtime.tv_nsec.

RESOLUTION:
The code is modified to reset the usec counter for mtime/atime/ctime when the upper boundary limit, i.e., 999999, is reached.

* 3910103 (Tracking ID: 3817734)

SYMPTOM:
If a file system with the full fsck flag set is mounted, a message containing a direct command to clean the file system with full fsck is printed to the user.

DESCRIPTION:
When mounting a file system with the full fsck flag set, the mount fails and a message is printed advising to clean the file system with full fsck. This message contains a direct command to run which, if run without first collecting a file system metasave, results in evidence being lost. Also, since fsck removes the file system inconsistencies, it may lead to undesired data being lost.

RESOLUTION:
A more generic error message is given instead of a direct command.

* 3910105 (Tracking ID: 3907902)

SYMPTOM:
System panic observed due to a race between the dalloc off thread and the getattr thread.

DESCRIPTION:
With the 7.2 release of VxFS, dalloc states are stored in a new structure. When getting attributes of a file, dalloc blocks are calculated and stored in this new structure. If a dalloc off thread races with a getattr thread, the getattr thread can dereference a NULL dalloc structure.

RESOLUTION:
Code changes have been made to take the appropriate dalloc lock while calculating dalloc blocks in the getattr function, avoiding the race.

* 3910356 (Tracking ID: 3908785)

SYMPTOM:
System panic observed because of a null page address in the writeback structure in the case of the kswapd process.

DESCRIPTION:
The Secfs2/Encryptfs layers used the write VOP as a hook when kswapd is triggered to free a page. Ideally kswapd should call the writepage() routine, where the writeback structures are correctly filled. When the write VOP is called because of the hook in Secfs2/Encryptfs, the writeback structures are cleared, resulting in a null page address.

RESOLUTION:
Code changes have been made to call the VxFS kswapd routine only if a valid page address is present.

* 3911290 (Tracking ID: 3910526)

SYMPTOM:
In case of a full file system, an fsadm resize operation to increase the file system size may fail with the following error:
Attempt to resize <volume-name> failed with errno 28

DESCRIPTION:
If there is no space available in the file system and a resize operation is initiated, intent log extents are used for metadata setup to continue the resize operation. If the resize is successful, the superblock is updated with the new size and a new intent log inode of the same size is reallocated. If the resize size is smaller than the log inode size, ENOSPC is hit when reallocating the intent log inode extents. Due to this failure, the resize command fails and shrinks the volume back to its original size, but the superblock continues to have the new file system size.

RESOLUTION:
The code has been modified to fail the resize operation if the resize size is less than the intent log inode size.

* 3911718 (Tracking ID: 3905576)

SYMPTOM:
Cluster file system hangs. On one node, all worker threads are blocked due to a file system freeze, and another thread is blocked with a stack like this:

- __schedule
- schedule
- vx_svar_sleep_unlock
- vx_event_wait
- vx_olt_iauinit
- vx_olt_iasinit
- vx_loadv23_fsetinit
- vx_loadv23_fset
- vx_fset_reget
- vx_fs_reinit
- vx_recv_cwfa
- vx_msg_process_thread
- vx_kthread_init

DESCRIPTION:
The frozen CFS won't thaw because the mentioned thread is waiting for a work item
to be processed in vx_olt_iauinit(). Since all the worker threads are blocked,
there is no free thread to process this work item.

RESOLUTION:
Change the code in vx_olt_iauinit(), so that the work item will be processed even
with all worker threads blocked.

* 3911719 (Tracking ID: 3909583)

SYMPTOM:
Disable partitioning of directory if directory size is greater than upper threshold value.

DESCRIPTION:
If PD (partitioned directories) is enabled during mount, the mount may take a long time to complete, because it tries to partition all the directories and hence appears hung. To avoid such hangs, a new upper threshold value for PD is added, which disables partitioning of a directory if the directory size is above that value.

RESOLUTION:
Code is modified to disable partitioning of a directory if the directory size is greater than the upper threshold value.

* 3911732 (Tracking ID: 3896670)

SYMPTOM:
Intermittent CFS hang like situation with many CFS pglock grant 
messages pending on LLT layer.

DESCRIPTION:
To optimize CFS locking, VxFS may send greedy pglock grant messages to speed up upcoming write operations. In certain scenarios created by a particular read and write pattern across nodes, one node can send these greedy messages far faster than the responses arrive. This can build up a lot of messages on the CFS layer, delaying the response to other messages and causing slowness in CFS operations.

RESOLUTION:
The fix is to send the next greedy message only after receiving the response to the previous one. This way, at any given time only one pglock greedy message is in flight.

* 3911926 (Tracking ID: 3901318)

SYMPTOM:
VxFS module failed to load on RHEL7.3.

DESCRIPTION:
RHEL7.3 is a new release, and the VxFS module did not yet have support for it.

RESOLUTION:
Added VxFS support for RHEL7.3.

* 3911968 (Tracking ID: 3911966)

SYMPTOM:
VxFS module failed to load on SLES12 SP2.

DESCRIPTION:
SLES12 SP2 is a new release, and the VxFS module did not yet have support for it.

RESOLUTION:
Added VxFS support for SLES12 SP2.

* 3912988 (Tracking ID: 3912315)

SYMPTOM:
EMAP corruption while freeing up the extent.
Feb  4 15:10:45 localhost kernel: vxfs: msgcnt 2 mesg 056: V-2-56: vx_mapbad - 
vx_smap_stateupd - file system extent allocation unit state bitmap number 0 
marked bad

Feb  4 15:10:45 localhost kernel: Call Trace:
vx_setfsflags+0x103/0x140 [vxfs]
vx_mapbad+0x74/0x2d0 [vxfs]
vx_smap_stateupd+0x113/0x130 [vxfs]
vx_extmapupd+0x552/0x580 [vxfs]
vx_alloc+0x3d6/0xd10 [vxfs]
vx_extsub+0x0/0x5f0 [vxfs]
vx_semapclone+0xe1/0x190 [vxfs]
vx_clonemap+0x14d/0x230 [vxfs]
vx_unlockmap+0x299/0x330 [vxfs]
vx_smap_dirtymap+0xea/0x120 [vxfs]
vx_do_extfree+0x2b8/0x2e0 [vxfs]
vx_extfree1+0x22e/0x7c0 [vxfs]
vx_extfree+0x9f/0xd0 [vxfs]
vx_exttrunc+0x10d/0x2a0 [vxfs]
vx_trunc_ext4+0x65f/0x7a0 [vxfs]
vx_validate_ext4+0xcc/0x1a0 [vxfs]
vx_trunc_tran2+0xb7f/0x1450 [vxfs]
vx_trunc_tran+0x18f/0x1e0 [vxfs]
vx_trunc+0x66a/0x890 [vxfs]
vx_iflush_list+0xaee/0xba0 [vxfs]
vx_iflush+0x67/0x80 [vxfs]
vx_workitem_process+0x24/0x50 [vxfs]

DESCRIPTION:
The issue occurs due to wrongly validating the smap and changing the AU state from free to allocated.

RESOLUTION:
Skip the validation and change the EAU state only if it is an SMAP update.

* 3912989 (Tracking ID: 3912322)

SYMPTOM:
The vxfs tunable max_seqio_extent_size cannot be tuned to any 
value less than 32768.

DESCRIPTION:
In VxFS 7.1, the default for max_seqio_extent_size was changed from 2048 to 32768. Due to a bug, the tunable could not be set to any value less than 32768.

RESOLUTION:
The fix allows the tunable to be set to any value >= 2048.

* 3912990 (Tracking ID: 3912407)

SYMPTOM:
CFS hang on VxFS 7.2 while thread on a CFS node waits for EAU 
delegation:

__schedule at ffffffff8163b46d
schedule at ffffffff8163bb09
vx_svar_sleep_unlock at ffffffffa0a99eb5 [vxfs]
vx_event_wait at ffffffffa0a9a68f [vxfs]
vx_async_waitmsg at ffffffffa09aa680 [vxfs]
vx_msg_send at ffffffffa09aa829 [vxfs]
vx_get_dele at ffffffffa09d5be1 [vxfs]
vx_extentalloc_device at ffffffffa0923bb0 [vxfs]
vx_extentalloc at ffffffffa0925272 [vxfs]
vx_bmap_ext4 at ffffffffa0953755 [vxfs]
vx_bmap_alloc_ext4 at ffffffffa0953e14 [vxfs]
vx_bmap_alloc at ffffffffa0950a2a [vxfs]
vx_write_alloc3 at ffffffffa09c281e [vxfs]
vx_tran_write_alloc at ffffffffa09c3321 [vxfs]
vx_cfs_prealloc at ffffffffa09b4220 [vxfs]
vx_write_alloc2 at ffffffffa09c25ad [vxfs]
vx_write_alloc at ffffffffa0b708fb [vxfs]
vx_write1 at ffffffffa0b712ff [vxfs]
vx_write_common_slow at ffffffffa0b7268c [vxfs]
vx_write_common at ffffffffa0b73857 [vxfs]
vx_write at ffffffffa0af04a6 [vxfs]
vfs_write at ffffffff811ded3d
sys_write at ffffffff811df7df
system_call_fastpath at ffffffff81646b09

And a corresponding delegation receiver thread should be seen looping on CFS 
primary:
 
PID: 18958  TASK: ffff88006c776780  CPU: 0   COMMAND: "vx_msg_thread"
__schedule at ffffffff8163a26d
mutex_lock at ffffffff81638b42
vx_emap_lookup at ffffffffa0ecf0eb [vxfs]
vx_extchkmaps at ffffffffa0e9c7b4 [vxfs]
vx_searchau_downlevel at ffffffffa0ea0923 [vxfs]
vx_searchau at ffffffffa0ea0e22 [vxfs]
vx_dele_get_freespace at ffffffffa0f53b6d [vxfs]
vx_getedele_size at ffffffffa0f54c4b [vxfs]
vx_pri_getdele at ffffffffa0f54edc [vxfs]
vx_recv_getdele at ffffffffa0f5690d [vxfs]
vx_recvdele at ffffffffa0f59800 [vxfs]
vx_msg_process_thread at ffffffffa0f2aead [vxfs]
vx_kthread_init at ffffffffa105cba4 [vxfs]
kthread at ffffffff810a5aef
ret_from_fork at ffffffff81645858

DESCRIPTION:
In the VxFS 7.2 release, a performance optimization was made in the way the allocation state map is updated. Prior to 7.2 the updates were done synchronously; in 7.2 the updates were made transactional. The hang happens when an EAU needs to be converted back to the FREE state after all allocations from it are freed. In that case, if the corresponding EAU delegation times out before the process can complete, it can result in an inconsistent state. Because of this inconsistency, when a node later tries to get delegation of this EAU, the primary may loop forever, causing the secondary to wait infinitely for the AU delegation.

RESOLUTION:
A code fix is done so that the delegation is not allowed to time out until the free processing is complete.

* 3913004 (Tracking ID: 3911048)

SYMPTOM:
An LDH bucket validation failure message is logged and the system hangs.

DESCRIPTION:
When modifying a large directory, VxFS needs to find a new bucket in the LDH for the directory, and once a bucket is full, it is split to obtain more buckets. When a bucket has been split the maximum number of times, an overflow bucket is allocated. Under some conditions, the available-bucket lookup on an overflow bucket may get an incorrect result and overwrite an existing bucket entry, corrupting the LDH file. Another problem is that when the bucket invalidation fails, the bucket buffer is released without checking whether the buffer is already part of a previous transaction; this may cause the transaction flush thread to hang and finally stall the whole file system.

RESOLUTION:
Corrected the LDH bucket entry change code to avoid the corruption, and release the bucket buffer without throwing it out of memory, to avoid blocking the transaction flush.

* 3914384 (Tracking ID: 3915578)

SYMPTOM:
vxfs module fails to load after reboot, with dmesg log:
[    3.689385] systemd-sysv-generator[518]: Overwriting existing symlink
/run/systemd/generator.late/vxfs.service with real service.
[    4.738143] vxfs: Unknown symbol ki_get_boot (err 0)
[    4.793301] vxfs: Unknown symbol ki_get_boot (err 0)

DESCRIPTION:
The vxfs module depends on the veki module. During boot, veki is not loaded before vxfs, so the vxfs module fails to load.

RESOLUTION:
During boot, veki is started before the vxfs module is loaded.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch fs-rhel7_x86_64-Patch-7.2.0.100.tar.gz to /tmp
2. Untar fs-rhel7_x86_64-Patch-7.2.0.100.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/fs-rhel7_x86_64-Patch-7.2.0.100.tar.gz
    # tar xf /tmp/fs-rhel7_x86_64-Patch-7.2.0.100.tar
3. Install the hotfix:
    # cd /tmp/hf
    # ./installVRTSvxfs720P100 [<host1> <host2>...]

You can also install this patch together with the 7.2 base release using Install Bundles:
1. Download this patch and extract it to a directory.
2. Change to the Veritas InfoScale 7.2 directory and invoke the installer script
   with the -patch_path option, where -patch_path points to the patch directory:
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
# rpm -Uvh VRTSvxfs-7.2.0.100-RHEL7.x86_64.rpm


REMOVING THE PATCH
------------------
# rpm -e VRTSvxfs-7.2.0.100-RHEL7.x86_64


KNOWN ISSUES
------------
* Tracking ID: 3910089

SYMPTOM: DB2 threads (db2sysc) hang, as seen in the crash dump:
Many of them get stuck in vx_dio_physio():
 - schedule
 - rwsem_down_failed_common
 - call_rwsem_down_read_failed
 - down_read
 - vx_dio_physio

And many of them get stuck in vx_rwlock():
 - schedule
 - rwsem_down_failed_common
 - call_rwsem_down_read_failed
 - down_read
 - vx_rwlock

WORKAROUND: None

* Tracking ID: 3910100

SYMPTOM: Oracle database start failure, with trace log like this:

ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 304 (block # 722821)
ORA-01110: data file 304: <file_name>
ORA-17500: ODM err:ODM ERROR V-41-4-2-231-28 No space left on device

WORKAROUND: None



SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE


