fs-aix-Patch-7.3.1.100

 Basic information
Release type: Patch
Release date: 2018-04-11
OS update support: None
Technote: None
Documentation: None
Download size: 20.42 MB
Checksum: 3753629761

 Applies to one or more of the following products:
InfoScale Enterprise 7.3.1 On AIX 7.1
InfoScale Enterprise 7.3.1 On AIX 7.2
InfoScale Foundation 7.3.1 On AIX 7.1
InfoScale Foundation 7.3.1 On AIX 7.2
InfoScale Storage 7.3.1 On AIX 7.1
InfoScale Storage 7.3.1 On AIX 7.2

 Obsolete patches, incompatibilities, superseded patches, or other requirements:
None.

 Fixes the following incidents:
3933810, 3933815, 3933819, 3933820, 3933824, 3933828, 3933843, 3934841, 3935903, 3936286, 3937536, 3939812, 3942458, 3942697

 Patch ID:
VRTSvxfs.bff

Readme file
                          * * * READ ME * * *
                 * * * Veritas File System 7.3.1 * * *
                         * * * Patch 100 * * *
                         Patch Date: 2018-04-11


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
Veritas File System 7.3.1 Patch 100


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
AIX


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * InfoScale Enterprise 7.3.1
   * InfoScale Foundation 7.3.1
   * InfoScale Storage 7.3.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 7.3.1.100
* 3933810 (3830300) Degraded CPU performance during backup of Oracle archive logs
on CFS vs local filesystem
* 3933815 (3902812) Mounting a VxFS file system on a large-memory AIX server may fail.
* 3933819 (3879310) The file system may get corrupted after a failed vxupgrade.
* 3933820 (3894712) ACL permissions are not inherited correctly on cluster 
file system.
* 3933824 (3908785) System panic observed because of a null page address in the
writeback structure in the kswapd process.
* 3933828 (3921152) Performance drop caused by vx_dalloc_flush().
* 3933843 (3926972) A recovery event can result in a cluster wide hang.
* 3934841 (3930267) Deadlock between fsq flush threads and writer threads.
* 3935903 (3933763) Oracle Hang in VxFS.
* 3936286 (3936285) fscdsconv command may fail the conversion for disk layout version (DLV) 12 and above.
* 3937536 (3940516) File resize thread loops infinitely for file resize operation crossing 32 bit
boundary.
* 3939812 (3939224) Removing file on VxFS hangs on latest release of AIX TL.
* 3942458 (3912203) cafs module is not getting loaded automatically on VRTSvxfs patch installation
* 3942697 (3940846) vxupgrade fails while upgrading the filesystem from disk
layout version (DLV) 9 to DLV 10.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: 7.3.1.100

* 3933810 (Tracking ID: 3830300)

SYMPTOM:
Heavy CPU usage while Oracle archive processes are running on a clustered
file system.

DESCRIPTION:
The poor read performance in this case was caused by fragmentation, which
mainly happens when multiple archivers are running on the same node. The
allocation pattern of the Oracle archiver processes is:

1. Write the header with O_SYNC.
2. ftruncate-up the file to its final size (a few GBs, typically).
3. Issue lio_listio with 1 MB iocbs.

The problem occurs because all allocations made in this manner go through
internal allocations, i.e. allocations below the file size instead of
allocations past the file size. Internal allocations are done at most 8
pages at a time, so if multiple processes do this, they all get these
8-page chunks alternately and the file system becomes very fragmented.
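
The sketch below illustrates this allocation pattern with standard POSIX
calls (open with O_SYNC, ftruncate, lio_listio). The path, header size,
final size, and iocb count are made up for illustration and are not taken
from Oracle or VxFS; the point is that, after the ftruncate-up, every
subsequent write lands below the file size and is therefore served by the
small internal allocations described above.

/* Illustrative sketch of the archiver-style allocation pattern described
 * above. All names and sizes are hypothetical; this is not Oracle code. */
#include <aio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define HDR_SIZE   512
#define FINAL_SIZE (2LL * 1024 * 1024 * 1024)   /* ~2 GB final file size */
#define IO_SIZE    (1024 * 1024)                /* 1 MB per iocb */
#define NIOCB      8

int main(void)
{
    char hdr[HDR_SIZE] = { 0 };
    int fd = open("/arch/logfile.arc", O_CREAT | O_WRONLY | O_SYNC, 0644);
    if (fd < 0)
        return 1;

    /* 1. Write the header synchronously. */
    if (write(fd, hdr, sizeof(hdr)) < 0)
        return 1;

    /* 2. Truncate the file UP to its final size. Without the new tunable
     *    this creates a hole, so every later write is below the file size. */
    if (ftruncate(fd, (off_t)FINAL_SIZE) != 0)
        return 1;

    /* 3. Stream the data with lio_listio() in 1 MB iocbs. Each write
     *    triggers an internal allocation (at most 8 pages at a time),
     *    which fragments the file system when several archivers do the
     *    same thing concurrently. */
    char *buf = malloc(IO_SIZE);
    memset(buf, 'A', IO_SIZE);
    struct aiocb cbs[NIOCB];
    struct aiocb *list[NIOCB];
    for (int i = 0; i < NIOCB; i++) {
        memset(&cbs[i], 0, sizeof(cbs[i]));
        cbs[i].aio_fildes     = fd;
        cbs[i].aio_buf        = buf;
        cbs[i].aio_nbytes     = IO_SIZE;
        cbs[i].aio_offset     = (off_t)HDR_SIZE + (off_t)i * IO_SIZE;
        cbs[i].aio_lio_opcode = LIO_WRITE;
        list[i] = &cbs[i];
    }
    lio_listio(LIO_WAIT, list, NIOCB, NULL);

    close(fd);
    free(buf);
    return 0;
}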

RESOLUTION:
Added a tunable that allocates ZFOD extents when ftruncate tries to increase
the size of the file, instead of creating a hole. This eliminates the
allocations internal to the file size and therefore the fragmentation. The
earlier implementation of the same fix, which ran into locking issues, has
been corrected, and the performance issue while writing from the secondary
node has also been fixed.

* 3933815 (Tracking ID: 3902812)

SYMPTOM:
The mount command for a VxFS file system fails with the following error on an
AIX system that has a very large amount of physical memory:

UX:vxfs mount: ERROR: V-3-20: Not enough space
UX:vxfs mount: ERROR: V-3-21256: cannot mount /dev/vx/dsk/datadg/vol1

DESCRIPTION:
On systems with large physical memory, the auto-tuned number of VMM buffers
is very large, which in turn increases the number of VMM buffers per PDT.
This can lead to mount failures because such a large chunk of memory cannot
be allocated from the pinned heap. To alleviate the problem, the number of
PDTs on large-memory systems needs to be increased so that the VMM buffers
are spread across more PDTs.

RESOLUTION:
The number of PDTs is now tuned based on the amount of physical memory on
large-memory systems.

* 3933819 (Tracking ID: 3879310)

SYMPTOM:
The file system may get corrupted after the file system freeze during 
vxupgrade. The full fsck gives the following errors:

UX:vxfs fsck: ERROR: V-3-20451: No valid device inodes found
UX:vxfs fsck: ERROR: V-3-20694: cannot initialize aggregate

DESCRIPTION:
vxupgrade requires the file system to be frozen during its operation.
Corruption may be detected while the freeze is in progress and the full fsck
flag may get set on the file system; however, this does not stop vxupgrade
from proceeding.
At a later stage of vxupgrade, after the structures related to the new disk
layout are updated on the disk, VxFS frees up and zeroes out some of the old
metadata inodes. If any error occurs after this point (because of the full
fsck flag being set), the file system needs to go back completely to the
previous version as of the time of the full fsck. Since the metadata
corresponding to the previous version is already cleared, the full fsck
cannot proceed and gives the error.

RESOLUTION:
The code is modified to check for the full fsck flag after freezing the file
system during vxupgrade. Also, the file system is disabled if an error occurs
after the new metadata is written to the disk. This forces the newly written
metadata to be loaded into memory on the next mount.

* 3933820 (Tracking ID: 3894712)

SYMPTOM:
ACL permissions are not inherited correctly on cluster file system.

DESCRIPTION:
The ACL count stored on a directory inode gets reset every time ownership of
the directory inode is switched between nodes. When ownership of the
directory inode comes back to the node that previously gave it up, ACL
permissions were not inherited correctly for newly created files.

RESOLUTION:
Modified the source such that the ACLs are inherited correctly.

* 3933824 (Tracking ID: 3908785)

SYMPTOM:
System panic observed because of a null page address in the writeback
structure in the kswapd process.

DESCRIPTION:
The secfs2/encryptfs layers use the write VOP as a hook when kswapd is
triggered to free a page. Ideally, kswapd should call the writepage()
routine, where the writeback structure is filled in correctly. When the
write VOP is called because of the hook in secfs2/encryptfs, the writeback
structure is cleared, resulting in a null page address.

RESOLUTION:
Code changes have been made to call the VxFS kswapd routine only if a valid
page address is present.

* 3933828 (Tracking ID: 3921152)

SYMPTOM:
Performance drop. Core dump shows threads doing vx_dalloc_flush().

DESCRIPTION:
An implicit typecast error in vx_dalloc_flush() can cause this performance issue.

RESOLUTION:
The code is modified to do an explicit typecast.

* 3933843 (Tracking ID: 3926972)

SYMPTOM:
Once a node reboots or goes out of the cluster, the whole cluster can hang.

DESCRIPTION:
This is a three-way deadlock in which a glock grant can block the recovery
while trying to cache the grant against an inode. When the glock grant
thread tries to take the ilock, and that lock is held by an hlock revoke
that is itself waiting for a GLM lock (in this case the cbuf lock), the
revoke cannot get that GLM lock because a recovery is in progress. The
recovery, in turn, cannot proceed because the glock grant thread has
blocked it.

Hence the whole cluster hangs.

RESOLUTION:
The fix is to avoid taking the ilock in GLM context if it is not available.

* 3934841 (Tracking ID: 3930267)

SYMPTOM:
Deadlock between fsq flush threads and writer threads.

DESCRIPTION:
On Linux, under certain circumstances (that is, to account for dirty pages),
a writer thread takes a lock on the inode and starts flushing dirty pages,
which requires the page lock. If an fsq flush thread then starts flushing a
transaction on the same inode, it needs the inode lock held by the writer
thread. The page lock, meanwhile, is held by another writer thread that is
waiting for transaction space, which can only be freed by the fsq flush
thread. This leads to a deadlock among these three threads.
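
Below is a minimal, generic pthreads sketch of this circular wait. The lock
names and the transaction-space counter are stand-ins for the VxFS
internals, not actual VxFS code: thread A holds the inode lock and waits
for the page lock, thread B holds the page lock and waits for transaction
space, and the fsq-flush stand-in can free transaction space only after it
gets the inode lock, so the program intentionally hangs when run.

/* Generic illustration of the three-way wait described above.
 * All names are placeholders; this program deadlocks by design. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t inode_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t page_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t space_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  space_cond = PTHREAD_COND_INITIALIZER;
static int tran_space = 0;           /* freed only by the fsq flush thread */

static void *writer_a(void *arg)     /* holds inode lock, needs page lock */
{
    pthread_mutex_lock(&inode_lock);
    sleep(1);                        /* let the other threads take their locks */
    pthread_mutex_lock(&page_lock);  /* blocks: held by writer_b */
    pthread_mutex_unlock(&page_lock);
    pthread_mutex_unlock(&inode_lock);
    return NULL;
}

static void *writer_b(void *arg)     /* holds page lock, waits for space */
{
    pthread_mutex_lock(&page_lock);
    pthread_mutex_lock(&space_lock);
    while (tran_space == 0)          /* blocks: only fsq_flush adds space */
        pthread_cond_wait(&space_cond, &space_lock);
    pthread_mutex_unlock(&space_lock);
    pthread_mutex_unlock(&page_lock);
    return NULL;
}

static void *fsq_flush(void *arg)    /* needs inode lock before freeing space */
{
    sleep(1);
    pthread_mutex_lock(&inode_lock); /* blocks: held by writer_a */
    pthread_mutex_lock(&space_lock);
    tran_space = 1;                  /* never reached */
    pthread_cond_signal(&space_cond);
    pthread_mutex_unlock(&space_lock);
    pthread_mutex_unlock(&inode_lock);
    return NULL;
}

int main(void)
{
    pthread_t a, b, c;
    pthread_create(&a, NULL, writer_a, NULL);
    pthread_create(&b, NULL, writer_b, NULL);
    pthread_create(&c, NULL, fsq_flush, NULL);
    pthread_join(a, NULL);           /* never returns: three-way deadlock */
    pthread_join(b, NULL);
    pthread_join(c, NULL);
    printf("unreachable\n");
    return 0;
}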

RESOLUTION:
The code is modified to add a new flag that skips dirty page accounting.

* 3935903 (Tracking ID: 3933763)

SYMPTOM:
Oracle was hung; both the PL/SQL session and ssh were hanging.

DESCRIPTION:
There is a case in the inode reuse code path where a dead loop is hit while
processing the inactivation thread.

RESOLUTION:
To fix this loop, inactivation of inodes is now attempted only once before a
new inode is allocated for structural/attribute inode processing.

* 3936286 (Tracking ID: 3936285)

SYMPTOM:
The fscdsconv command may fail the conversion for disk layout version (DLV)
12 and above. After exporting the file system for use on the specified
target, it fails to mount on that target with the following error:

# /opt/VRTS/bin/mount <vol> <mount-point>
UX:vxfs mount: ERROR: V-3-20012: not a valid vxfs file system
UX:vxfs mount: ERROR: V-3-24996: Unable to get disk layout version

When importing the file system on the target for use on the same system, a
full fsck is requested during mount. After the full fsck, the file system
mounts successfully, but fsck gives the following messages:

# /opt/VRTS/bin/fsck -y -o full /dev/vx/rdsk/mydg/myvol
log replay in progress
intent log does not contain valid log entries
pass0 - checking structural files
fileset 1 primary-ilist inode 34 (SuperBlock)
                failed validation clear? (ynq)y
pass1 - checking inode sanity and blocks
rebuild structural files? (ynq)y
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
pass3 - checking reference counts
pass4 - checking resource maps
corrupted CUT entries, clear? (ynq)y
au 0 emap incorrect - fix? (ynq)y
OK to clear log? (ynq)y
flush fileset headers? (ynq)y
set state to CLEAN? (ynq)y

DESCRIPTION:
While checking the file system version in fscdsconv, the check for DLV 12
and above was missing, which triggered this issue.

RESOLUTION:
Code changes have been made to handle file system version 12 and above in
the fscdsconv command.

* 3937536 (Tracking ID: 3940516)

SYMPTOM:
The file resize thread loops infinitely when a file is resized to a size
greater than 4 TB.

DESCRIPTION:
Because of a vx_u32_t typecast in the vx_odm_resize() function, the resize
thread gets stuck in an infinite loop.
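
Below is a minimal, generic sketch of this failure mode; the vx_u32_t
typedef, chunk size, and function names are illustrative stand-ins, not the
VxFS source. Once the remaining block count is narrowed to 32 bits, a resize
that is 2^32 blocks (4 TB at a 1 KB block size) or more away from the
current size eventually wraps to a zero step, and the loop never terminates.

/* Generic illustration of the 32-bit truncation bug class described above;
 * the type and function names are placeholders, not the VxFS source. */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t vx_u32_t;           /* the too-narrow type */

#define CHUNK_BLOCKS (1u << 20)      /* extend 1M blocks per pass */

/* Broken form: the remaining block count is narrowed to 32 bits each pass.
 * A target that is 2^32 blocks or more away eventually leaves a remainder
 * that is an exact multiple of 2^32; it wraps to 0, step becomes 0, and
 * the loop spins forever without making progress. */
static void resize_broken(uint64_t cur_blocks, uint64_t new_blocks)
{
    while (cur_blocks < new_blocks) {
        vx_u32_t remaining = (vx_u32_t)(new_blocks - cur_blocks); /* wraps */
        vx_u32_t step = remaining < CHUNK_BLOCKS ? remaining : CHUNK_BLOCKS;
        if (step == 0)
            continue;                /* no progress: infinite loop */
        cur_blocks += step;          /* allocate 'step' blocks here */
    }
}

/* Fixed form: keep the remaining count in 64 bits. */
static void resize_fixed(uint64_t cur_blocks, uint64_t new_blocks)
{
    while (cur_blocks < new_blocks) {
        uint64_t remaining = new_blocks - cur_blocks;
        uint64_t step = remaining < CHUNK_BLOCKS ? remaining : CHUNK_BLOCKS;
        cur_blocks += step;
    }
}

int main(void)
{
    uint64_t target = 5ULL << 30;    /* 5G blocks = 5 TB at a 1 KB block size */
    resize_fixed(0, target);         /* completes */
    printf("fixed resize loop finished\n");
    /* resize_broken(0, target) would never return. */
    (void)resize_broken;
    return 0;
}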

RESOLUTION:
Removed the vx_u32_t typecast in vx_odm_resize() to handle such scenarios.

* 3939812 (Tracking ID: 3939224)

SYMPTOM:
Removing file on VxFS hangs on latest release of AIX TL.

DESCRIPTION:
On the latest AIX TLs (7.1 TL5 SP1 and the latest 7.2 TL), IBM made major
changes in the unlink code path and now expects the vnode list to be
terminated with NULL. VxFS, however, initialized the vnode's next pointer to
itself, which caused an endless loop when removing a file.
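
The sketch below is a generic illustration of the two list-termination
conventions; the structure and field names are placeholders, not the actual
AIX or VxFS vnode layout. A walker that stops on NULL, as the new AIX unlink
path does, never finishes when the last element's next pointer points back
to itself.

/* Generic illustration of the list-termination mismatch described above.
 * The struct and field names are placeholders, not the AIX vnode layout. */
#include <stddef.h>
#include <stdio.h>

struct fake_vnode {
    const char        *name;
    struct fake_vnode *next;
};

/* Walker in the style of the new AIX unlink path: stops at NULL. */
static void walk(struct fake_vnode *v)
{
    while (v != NULL) {      /* loops forever if the tail points to itself */
        printf("visiting %s\n", v->name);
        v = v->next;
    }
}

int main(void)
{
    struct fake_vnode tail = { "tail", NULL };
    struct fake_vnode head = { "head", &tail };

    /* Old convention: the last node's next pointed to itself. Uncommenting
     * the next line makes walk() spin forever, which is the hang seen when
     * removing a file on the newer AIX TLs. */
    /* tail.next = &tail; */

    /* Fixed convention: terminate with NULL, so walk() finishes. */
    walk(&head);
    return 0;
}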

RESOLUTION:
The code is modified so that the vnode's next pointer is terminated with NULL.

* 3942458 (Tracking ID: 3912203)

SYMPTOM:
The cafs module is not loaded automatically on VRTSvxfs patch installation.

DESCRIPTION:
The changes required to load the "cafs" driver automatically after
installation of a patch over the VxFS 7.2 release are missing from the
postinstall script on AIX. As a result, the "cafs" driver is not loaded
automatically when a patch is installed over the 7.2.0 release, and
vxkextadm has to be run manually to load it.

RESOLUTION:
Code changes have been made to install the cafs module automatically.

* 3942697 (Tracking ID: 3940846)

SYMPTOM:
While upgrading the file system from DLV 9 to DLV 10, vxupgrade fails with
the following error:

# vxupgrade -n 10 <mount point>
UX:vxfs vxupgrade: ERROR: V-3-22567: cannot upgrade <volname> - Not owner

DESCRIPTION:
While upgrading from DLV 9 to DLV 10 or later, the upgrade code path searches
for the mkfs version in the histlog. This change was introduced to perform
some specific operations while upgrading the file system. The issue occurs
only if mkfs was done for the file system at DLV 6 and the upgrade was
performed later, because the histlog conversion from DLV 6 to DLV 7 does not
propagate the mkfs version field. This issue occurs only on InfoScale 7.3.1
onwards.

RESOLUTION:
Code changes have been done to allow DLV upgrade even in cases where mkfs
version is not present in the histlog.



INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Please note that the installation of this P-Patch will cause downtime.

To install the patch, perform the following steps on at least one node in the cluster:
1. Copy the patch fs-aix-Patch-7.3.1.100.tar.gz to /tmp
2. Untar fs-aix-Patch-7.3.1.100.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/fs-aix-Patch-7.3.1.100.tar.gz
    # tar xf /tmp/fs-aix-Patch-7.3.1.100.tar
3. Install the hotfix (note that installing this P-Patch will cause downtime):
    # cd /tmp/hf
    # ./installVRTSvxfs731P100 [<host1> <host2>...]

You can also install this patch together with the 7.3.1 maintenance release using Install Bundles:
1. Download this patch and extract it to a directory.
2. Change to the Veritas InfoScale 7.3.1 directory and invoke the installer script
   with the -patch_path option, where -patch_path points to the patch directory:
    # ./installer -patch_path [<path to this patch>] [<host1> <host2>...]

Install the patch manually:
--------------------------
AIX maintenance levels and APARs can be downloaded from the
    IBM Web site:
        http://techsupport.services.ibm.com
    Install the VRTSvxfs.bff patch if VRTSvxfs is
    already installed at fileset level 7.3.1.000.
    A system reboot is recommended after installing this patch.
    To apply the patch, first unmount all VxFS file systems,
    then enter these commands:
        # mount | grep vxfs
        # cd patch_location
        # installp -aXd VRTSvxfs.bff VRTSvxfs
        # reboot


REMOVING THE PATCH
------------------
Run the Uninstaller script to automatically remove the patch:
------------------------------------------------------------
To uninstall the patch, perform the following step on at least one node in the cluster:
    # /opt/VRTS/install/uninstallVRTSvxfs731P100 [<host1> <host2>...]

Remove the patch manually:
-------------------------
If you need to remove the patch, first unmount all VxFS
    file systems, then enter these commands:
        # mount | grep vxfs
        # installp -r VRTSvxfs 7.3.1.100
        # reboot


KNOWN ISSUES
------------
* Tracking ID: 3933818

SYMPTOM: Oracle database startup failure, with a trace log like this:

ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 304 (block # 722821)
ORA-01110: data file 304: <file_name>
ORA-17500: ODM err:ODM ERROR V-41-4-2-231-28 No space left on device

WORKAROUND: None.

* Tracking ID: 3943715

SYMPTOM: If a checkpoint exists and a write happens on ZFOD (Zero Fill On Demand) extents on the primary file system, garbage data gets pushed to the clone inode.

WORKAROUND: None.



SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE