                          * * * READ ME * * *
                * * * Veritas File System 4.1 MP2 * * *
                      * * * Rolling Patch 14 * * *
                         Patch Date: 2014-09-03


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas File System 4.1 MP2 Rolling Patch 14


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
HP-UX 11i v2 (11.23)


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas File System 4.1 MP2
   * Veritas Storage Foundation for Oracle RAC 4.1 MP2
   * Veritas Storage Foundation Cluster File System 4.1 MP2
   * Veritas Storage Foundation 4.1 MP2
   * Veritas Storage Foundation HA 4.1 MP2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: PHCO_44049, PHKL_44154
* 3024001 (2858683) Reserve extent attributes changed after vxrestore, for
  files greater than 8192 bytes.
* 3188902 (2966277) Systems with high file system activity like
  read/write/open/lookup may panic the system.
* 3520352 (3520349) When there is a huge number of dirty pages in the
  memory, and a sparse write is performed at a large offset of 4 TB or
  above, on an existing file that is not null, the file system hangs.
* 3520364 (1156791) The write(1M) operation on files with odd-size extents
  is slower compared to the write(1M) operation on files with even-size
  extents.
* 3520368 (3107628) The vxdump(1M) utility incorrectly estimates the
  default tapesize.
Patch ID: PHKL_43011
* 2768726 (1092933) VxFS 4.1 reports a "FALSE" write success on ThP LUN
* 2768728 (2680946) panic in vx_itryhold+0x40/spinlock() - due to NULL
  d_childp in dnlc
* 2778483 (2740939) Need UNOF for E1082892 on 4.1/hpux 11.23


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: PHCO_44049, PHKL_44154

* 3024001 (Tracking ID: 2858683)

SYMPTOM:
The reserve-extent attributes are changed after the vxrestore(1M)
operation, for files that are greater than 8192 bytes.

DESCRIPTION:
A local variable that holds the number of reserve bytes is reused during
the vxrestore(1M) operation for a further VX_SETEXT ioctl call on files
that are greater than 8 KB. As a result, the attribute information is
changed.

RESOLUTION:
The code is modified to preserve the original variable value till the end
of the function.

* 3188902 (Tracking ID: 2966277)

SYMPTOM:
Systems with high file-system activity like read/write/open/lookup may
panic with the following stack trace due to a rare race condition:

spinlock+0x21 ( )
-> vx_rwsleep_unlock()
vx_ipunlock+0x40()
vx_inactive_remove+0x530()
vx_inactive_tran+0x450()
vx_local_inactive_list+0x30()
vx_inactive_list+0x420()
-> vx_workitem_process()
-> vx_worklist_process()
vx_worklist_thread+0x2f0()
kthread_daemon_startup+0x90()

DESCRIPTION:
ILOCK is released before doing an IPUNLOCK, which causes a race condition.
This results in a panic when an inode that has been set free is accessed.

RESOLUTION:
The code is modified so that the ILOCK is used to protect the inodes'
memory from being set free, while the memory is being accessed.

* 3520352 (Tracking ID: 3520349)

SYMPTOM:
When there is a huge number of dirty pages in the memory, and a sparse
write is performed at a large offset of 4 TB or above, on an existing file
that is not null, the file system hangs.
The following stack trace is observed:

fcache_buf_iowait()
vx_fcache_buf_iowait()
vx_io_wait()
vx_alloc_getpage()
vx_do_getpage()
vx_getpage1()
vx_getpage()
preg_vn_fault()
fcache_as_uiomove_rd()
fcache_as_uiomove()
vx_fcache_as_uiomove()
vx_fcache_read()
vx_read1()
vx_rdwr()
vn_rdwr()

DESCRIPTION:
When a sparse write is performed at an offset of 4 TB or above, on a file
that has ext4 extent orgtype with some blocks that are already allocated,
this can result in a file system hang. This is caused by a type-casting bug
in the offset calculation in the VxFS extent allocation code path. A sparse
write should create a 'HOLE' between the last allocated offset and the
current offset on which the write is requested. Due to the type-casting
bug, VxFS may, in certain scenarios, allocate the space between the last
offset and the new offset instead of creating a 'HOLE'. This generates a
huge number of dirty pages, and fills up the file system space incorrectly.
The memory pressure due to the huge number of dirty pages causes the hang.
The sparse write offset on which the problem occurs depends on the file
system block size. For a file system with block size 1 KB, the problem can
occur at the sparse write offset of 4 TB.

RESOLUTION:
The code is modified so that the VxFS extent allocation code calculates the
offset correctly, and does not allocate space for a sparse write. This
resolves the type-casting bug.

* 3520364 (Tracking ID: 1156791)

SYMPTOM:
The write(1M) operation on files with odd-size extents is slower compared
to the write(1M) operation on files with even-size extents.

DESCRIPTION:
The extent allocator can spend a long time looking for a good match for an
odd-sized extent. It might need to examine every allocation unit to find a
good match. On a large file system with a large number of small files, this
could take a long time.

RESOLUTION:
The odd-size extent allocations are now rounded up to the nearest power
of 2.
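As an illustration only, the rounding described above can be sketched in
Python. The helper name round_up_pow2 is hypothetical and is not part of
VxFS; the sketch assumes sizes are whole numbers of KB and does not
reflect how the extent allocator is actually implemented.

```python
def round_up_pow2(size_kb: int) -> int:
    """Return the smallest power of two that is >= size_kb.

    Hypothetical helper for illustration; not VxFS code.
    """
    if size_kb < 1:
        raise ValueError("size must be at least 1")
    # (size_kb - 1).bit_length() is the exponent of the next power of two,
    # and leaves exact powers of two unchanged.
    return 1 << (size_kb - 1).bit_length()


if __name__ == "__main__":
    for size_kb in (1, 3, 7, 8, 9):
        print(f"{size_kb} KB -> {round_up_pow2(size_kb)} KB")
```

Sizes that are already a power of two are left unchanged; every other
size moves up to the next power of two.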
For example, an extent of size 7 KB is rounded to 2^3 = 8 KB.

* 3520368 (Tracking ID: 3107628)

SYMPTOM:
The vxdump(1M) utility incorrectly estimates the number of tapes required
to complete the backup and prematurely prompts for the next tape.

DESCRIPTION:
The vxdump(1M) utility prematurely prompts for the next tape.

RESOLUTION:
The code is modified to fix the initialization for track and tape
variables.

Patch ID: PHKL_43011

* 2768726 (Tracking ID: 1092933)

SYMPTOM:
The system may panic in the vx_fsync_chains() function when it tries to
sleep in the interrupt context. The following stack trace is displayed:

vx_event_wait
vx_delay2
vx_fsync_chains
vx_disable
vx_dataioerr
vx_pageio_done

DESCRIPTION:
While handling the external interrupt, the vx_pageio_done() function calls
the vx_fsync_chains() function. The vx_fsync_chains() function may sleep
during the execution. The vx_fsync_chains() function is required in the
case of input/output (I/O) errors. It is called at a couple of places when
the I/O strategy fails, but the error variable is overwritten improperly.

RESOLUTION:
The code is modified to save I/O errors so that the vx_fsync_chains()
function can be called, and the call to vx_fsync_chains() is removed from
the vx_disable() function.

* 2768728 (Tracking ID: 2680946)

SYMPTOM:
The ls(1M), find(1M) or other lookup operations can trigger a panic with
the following stack trace:

spinlock+0x40
vx_itryhold+0x40
vx_dnlc_lookup+0x1b0
vx_cbdnlc_lookup+0x130
vx_fast_lookup+0x120
vx_lookup+0x3c0
lookuppnvp+0x2d0
lookuppn+0x90
lookupname+0x60
vn_open+0xa0
copen+0x170
open+0x80
syscall+0x920

DESCRIPTION:
The panic is triggered because of a NULL pointer dereference while
inserting an entry that has a NULL child pointer in the Directory Name
Lookup Cache (DNLC).

RESOLUTION:
The code is modified to include a global variable which is incremented when
a DNLC entry with a NULL child pointer gets inserted.
Preventive code is added to avoid the panic, and such occurrences are
tracked using the global variable.

* 2778483 (Tracking ID: 2740939)

SYMPTOM:
The unmounting of a file system can cause a Transfer of Control (TOC) on an
HP-UX 11.23 Serviceguard environment with many threads. The following stack
trace is displayed:

spinunlock()
vx_worklist_process()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
The TOC is caused by heavy spin-lock contention on a Veritas File System
(VxFS) spin-lock. During the unmount of a VxFS file system, worker threads
are created to scan the inode cache and remove the inodes related to the
mount point. The number of worker threads is calculated based on the number
of hash-lists in the inode cache. The number of hash-lists in the inode
cache is calculated based on the system memory rather than the tuned
vxfs_ninode value. If the system memory is huge and the tuned vxfs_ninode
value is small, the number of hash-lists in the inode cache is
unnecessarily large, which can result in spin-lock contention during the
unmount.

RESOLUTION:
The code is modified to calculate the number of hash-lists based on the
tuned vxfs_ninode value.


INSTALLING THE PATCH
--------------------
1. Installing VxFS 4.1 MP2RP14 patch:

a) If you install this patch on a CVM cluster, install it one system at a
   time so that all the nodes are not brought down simultaneously.

b) VxFS 4.1 (GA) must be installed before applying these patches.

c) To verify the VERITAS file system level, enter:

     # swlist -l product | egrep -i 'VRTSvxfs'

     VRTSvxfs     4.1     VERITAS File System
     VRTSfsman    4.1     VERITAS File System Manuals

d) All prerequisite/corequisite patches must be installed. The kernel
   patch requires a system reboot for both installation and removal.
e) To install the patch, enter the following command:

     # swinstall -x autoreboot=true -s PHCO_44049 PHKL_44154


REMOVING THE PATCH
------------------
Removing VxFS 4.1-MP2RP14 patch:

a) To remove the patch, enter the following command:

     # swremove -x autoreboot=true PHCO_44049 PHKL_44154


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
INCIDENTS FROM OLD PATCHES:
---------------------------

4.1MP2RP12
=============
2556094 ff_vxfs ERROR: V-3-24347: program limit of 30701385 exceeded
2558524 Memory leak in internal buffercache after 497 days (lbolt wrap-over)
2558527 Threads stuck in vx_rwsleep_rec_lock_em
2559451 QXCR1001161039 on VxFS 11.23/4.1 : fsck_vxfs(1m) coredumps SEGV_ACCERR
2583011 File entry is displayed twice if find/ls run immediately after creation
2589977 LM Stress.s1 test hit an assert f:vx_flushsuper:ndebug.

4.1MP2RP11
=============
2403838 VxFS 4.1/11.23 system panics after fdd:fdd_getdev() unable to find device
2405162 Access denied on files inheriting default group ACL from parent directory.

4.1MP2RP10
============
2195664 fsck fails to repair corrupt directory blocks having duplicate directory entries.
2237065 [VxFS/ODM][413-328-688][HP-OEM] HOTSITE: QXCR1001093516 - HPIT CFS deadlock
2244300 System panic in vx_iupdat_clustblks()
2246160 vx_rdwr should handle the EIO returned from vx_tranidflush
2246161 write() system call hangs for over 10 seconds on VxFS 3.5 on 11.23
2257645 LM stress aborted due to "run_fsck : Failed to full fsck cleanly".
2276076 fsck can make more than one lost+found entries

4.1MP2RP9
=========
2092337 getacl shows no permission for user but user can still access file
2092347 vxfs mkfs miscomputes free blocks
2092348 vxfsstat does not reflect the change of vx_ninode
2093275 QXCR1000550182 issue, not ported to 4.1/11.23

4.1MP2RP8
===========
1874187 Adding quota support for user 'nobody' on HP-UX
1960790 System panic in inctext: VTEXT not set and tcount > 0
1960805 VxFS4.1: VxFS doesnt specify limits on tuning on vxfs_bc_bufhwm

fsman
-----
1960816 VxFS read ahead can cause stalled IO on all write operations.

4.1MP2RP7
===========
1500766 HP-UX 11.11; JFS 3.3; poor performance writing to mmap'ed sparse file.
1703201 subtype command not returning correct value when run as non root user
1786028 quotacheck coredumps with more than 30 quota-enabled filesystems in /etc/fstab
1785994 vx_vn_brelse() incorrectly calls vn_rele()
1818786 If force unmount fails, then next normal unmount may become force unmount and can lead to panic in HP 11.23

fspro
------
1791949 [VEA] SNMP traps for VxFS alerts fail.