                         * * * READ ME * * *
               * * * Veritas File System 5.0 MP2 * * *
                     * * * Rolling Patch 9 * * *
                        Patch Date: 2013-11-22


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas File System 5.0 MP2 Rolling Patch 9


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
HP-UX 11i v2 (11.23)


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs
VRTSfsman
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas File System 5.0 MP2
   * Veritas Storage Foundation for Oracle RAC 5.0 MP2
   * Veritas Storage Foundation Cluster File System 5.0 MP2
   * Veritas Storage Foundation 5.0 MP2
   * Veritas Storage Foundation High Availability 5.0 MP2
   * Veritas Storage Foundation for Oracle 5.0 MP2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: PHCO_43747, PHKL_43746, PHCO_43748
* 2166258 (2129455) The file system daemon process (vxfsd) takes a significant amount of CPU time after deleting some directories.
* 2194615 (2178147) After a socket file is removed, the file system is marked for full fsck(1M) operation.
* 2800280 (2670022) Duplicate file names can be seen in a directory.
* 2810121 (2316793) After removing the files in a file system, the df(1M) command may take 10 seconds to complete.
* 3024003 (2858683) Reserve extent attributes changed after vxrestore, for files greater than 8192 bytes.
* 3131809 (2966277) Systems with high file system activity like read/write/open/lookup may panic.
* 3131958 (2899907) On CFS, some file-system operations such as the vxcompress utility and de-duplication fail to respond.
* 3276125 (3244613) The fsadm(1M) command hangs while there is I/O load on the regular VxFS file system and checkpoint.
* 3326496 (1960815) VxFS read ahead can cause stalled I/O on all the write operations during a re-tune operation.
* 3326533 (3259634) A Cluster File System having more than 4G blocks gets corrupted.
* 3326593 (2290800) A large hole at the end of the ILIST file is wrongly reported by the fsdb_vxfs(1M) command.
* 3326823 (915234) The vxdump(1M) utility incorrectly estimates the size of the tape device.
Patch ID: PHKL_43392, PHCO_43391
* 2834293 (2750860) Performance issue due to CFS fragmentation in a CFS cluster.
* 2867635 (2867633) LM Noise.Fullfsck.N1 test hit an assert "vx_delxwri_reclaim:1a".
* 2870307 (2693010) VxFS patches do not remove formatted/cached man pages.
* 2904377 (2599590) Expanding or shrinking a DLV5 file system using the fsadm(1M) command causes a system panic.
* 2988680 (2988678) VX_SETEXT ioctl doesn't require full license on 11.31.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: PHCO_43747, PHKL_43746, PHCO_43748

* 2166258 (Tracking ID: 2129455)

SYMPTOM:
The top(1M) command shows that after some directories are deleted, the
file system daemon process (vxfsd) consumes a significant amount of CPU
time.

DESCRIPTION:
The file system daemon spawns multiple threads internally to process the
file removals. The number of threads depends on the number of CPUs
available on the system, so a system with many CPUs can create a large
number of threads. This consumes a significant amount of CPU time. Also,
in some rare cases the count of created threads is not updated
accurately.

RESOLUTION:
The code is modified to limit the number of threads to half the number of
CPUs available on the system. As a result, the count gets updated
accurately.

* 2194615 (Tracking ID: 2178147)

SYMPTOM:
If a socket file is removed, the file system is marked for full fsck(1M)
operation.
The following error message is displayed in the system log:
vmunix: vxfs: WARNING: msgcnt 1 mesg 087: V-2-87: vx_dotdot_manipulate: - / file system 2437 inode 541 dotdot inode error

DESCRIPTION:
During the socket file creation, the attribute inode for the parent
directory is not updated. Hence, the error occurs when the socket file is
removed.

RESOLUTION:
The code is modified to update the socket file linkage during creation,
thus avoiding the error message.

* 2800280 (Tracking ID: 2670022)

SYMPTOM:
Duplicate file names can be seen in a directory.

DESCRIPTION:
Veritas File System (VxFS) maintains an internal Directory Name Lookup
Cache (DNLC) to improve the performance of directory lookups. A race
condition occurs in the DNLC list manipulation code during lookup or
creation of file names that have more than 32 characters (which further
affects other file creations). This leaves the DNLC with a stale entry
for an existing file in the directory. A lookup of such a file through
the DNLC does not find the file, which allows a duplicate file with the
same name to be created in the directory.

RESOLUTION:
The code is modified to fix the race condition by protecting the DNLC
lists through proper locks.

* 2810121 (Tracking ID: 2316793)

SYMPTOM:
After removing the files in a file system, the df(1M) command, which uses
the statfs(2) function, may take 10 seconds to complete.

DESCRIPTION:
To obtain an up-to-date and valid free block count in a file system, a
delay-and-retry loop delays for one second and retries 10 times. This
excessive retrying causes a 10-second delay per file system while
executing the df(1M) command.

RESOLUTION:
The code is modified to reduce the original 10 retries with a one-second
delay each to one retry after a 20-millisecond delay.

* 3024003 (Tracking ID: 2858683)

SYMPTOM:
The reserve-extent attributes are changed after the vxrestore(1M)
operation, for files that are greater than 8192 bytes.

DESCRIPTION:
A local variable that holds the number of reserve bytes is reused during
the vxrestore(1M) operation for a further VX_SETEXT ioctl call on files
that are greater than 8K. As a result, the attribute information is
changed.

RESOLUTION:
The code is modified to preserve the original variable value till the end
of the function.

* 3131809 (Tracking ID: 2966277)

SYMPTOM:
Systems with high file-system activity like read/write/open/lookup may
panic with the following stack trace due to a rare race condition:
spinlock+0x21()
-> vx_rwsleep_unlock()
vx_ipunlock+0x40()
vx_inactive_remove+0x530()
vx_inactive_tran+0x450()
vx_local_inactive_list+0x30()
vx_inactive_list+0x420()
-> vx_workitem_process()
-> vx_worklist_process()
vx_worklist_thread+0x2f0()
kthread_daemon_startup+0x90()

DESCRIPTION:
ILOCK is released before doing an IPUNLOCK, which causes a race
condition. This results in a panic when an inode that has been set free
is accessed.

RESOLUTION:
The code is modified so that the ILOCK is used to protect the inode's
memory from being set free while the memory is being accessed.

* 3131958 (Tracking ID: 2899907)

SYMPTOM:
Some file-system operations on a Cluster File System (CFS) may hang with
the following stack trace:
vxg_svar_sleep_unlock
vxg_grant_sleep
vxg_cmn_lock
vxg_api_lock
vx_glm_lock
vx_mdele_hold
vx_extfree1
vx_exttrunc
vx_trunc_ext4
vx_trunc_tran2
vx_trunc_tran
vx_cfs_trunc
vx_trunc
vx_inactive_remove
vx_inactive_tran
vx_cinactive_list
vx_workitem_process
vx_worklist_process
vx_worklist_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
In CFS, a node can lock the mdelelock for one extent map while already
holding the mdelelock for a different extent map. This can result in a
deadlock between different nodes in the cluster.

RESOLUTION:
The code is modified to prevent the deadlock between different nodes in
the cluster.
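
The deadlock pattern described for incident 2899907 (taking a second lock
while holding a first one, with different holders doing so in different
orders) can be illustrated outside the kernel. This README does not say how
the VxFS fix works; the sketch below shows one common remedy, a fixed global
lock order, purely as an illustration. The lock objects, the function name
and the "node" tags are hypothetical stand-ins and are not VxFS code.

```python
import threading

# Hypothetical stand-ins for the locks of two different extent maps.
lock_a = threading.Lock()
lock_b = threading.Lock()

def with_both(first, second, results, tag):
    # Acquire in one fixed global order (here: by id()), regardless of the
    # order the caller asked for, so two holders never wait on each other.
    ordered = sorted((first, second), key=id)
    for lk in ordered:
        lk.acquire()
    try:
        results.append(tag)
    finally:
        for lk in reversed(ordered):
            lk.release()

results = []
# The two callers request the locks in opposite orders, which is the
# situation that can deadlock when no ordering is enforced.
t1 = threading.Thread(target=with_both, args=(lock_a, lock_b, results, "node1"))
t2 = threading.Thread(target=with_both, args=(lock_b, lock_a, results, "node2"))
t1.start()
t2.start()
t1.join(timeout=5)
t2.join(timeout=5)
print(sorted(results))
```

Both threads finish because neither can hold one lock while waiting for a
lock that the other already holds.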

* 3276125 (Tracking ID: 3244613)

SYMPTOM:
A file-system extent operation by using the fsadm(1M) command may hang
with the following stack trace:
vx_event_wait(inlined)
vx_delay2+0x2a0
cold_vx_active_common_flush+0x80
vx_close+0x70
vn_close(inlined)
vno_close+0xe0
closef(inlined)

DESCRIPTION:
During a resize operation, the fsadm(1M) command freezes the file system.
In an error case, the fsadm(1M) command exits without thawing the file
system. This results in a hang.

RESOLUTION:
The code is modified to thaw the file system before the fsadm(1M) command
exits in the error case.

* 3326496 (Tracking ID: 1960815)

SYMPTOM:
The vxtunefs(1M) command manpage does not mention the performance impact
of tuning any of the Veritas File System (VxFS) parameters on a system
with heavy load.

DESCRIPTION:
The performance impact that the system faces on tuning any of the VxFS
parameters needs to be mentioned in the vxtunefs(1M) command manpage.

RESOLUTION:
The manpage is updated to mention the performance impact of tuning any of
the VxFS parameters on a system with heavy load.

* 3326533 (Tracking ID: 3259634)

SYMPTOM:
A CFS that has more than 4G blocks is corrupted due to some file system
metadata being zeroed out incorrectly. The blocks which get zeroed out
may contain any metadata or file data and can be located anywhere on the
disk. The problem occurs only with the following combinations of file
system (FS) block size and FS size:
1 KB block size and FS size > 4 TB
2 KB block size and FS size > 8 TB
4 KB block size and FS size > 16 TB
8 KB block size and FS size > 32 TB

DESCRIPTION:
When a CFS is mounted for the first time on the secondary node, a
per-node intent log is created. When the intent log is created, the
blocks newly allocated to it are zeroed out. The start offset and the
length to be cleared are passed to the block-clearing routine. Due to a
miscalculation, a wrong start offset is passed. This results in the disk
content at that offset getting zeroed out incorrectly.
This content can be file system metadata or file data. If it is metadata,
the corruption is detected when the metadata is accessed, and the file
system is marked for full fsck(1M).

RESOLUTION:
The code is modified so that the correct start offset is passed to the
block-clearing routine.

* 3326593 (Tracking ID: 2290800)

SYMPTOM:
When the fsdb_vxfs(1M) command is used to look at the bmap of an ILIST
file ("mapall" command), a large hole at the end of the ILIST file is
wrongly reported.

DESCRIPTION:
While reading the bmap of an ILIST file, if a hole is found at the end of
indirect extents, the fsdb_vxfs(1M) command may incorrectly mark the hole
as the last extent in the bmap. This causes the "mapall" command within
the file system debugger to show a large hole till the end of the file.

RESOLUTION:
The code is modified to read an ILIST file's bmap correctly when holes
are found at the end of the indirect extents.

* 3326823 (Tracking ID: 915234)

SYMPTOM:
The vxdump(1M) utility incorrectly estimates the size of the tape device.
This results in a wrong prompt for the next tape device.

DESCRIPTION:
The tracks information for the tape device is saved in a variable that is
not initialized to zero. This leads to premature prompting for the next
tape.

RESOLUTION:
The code is modified so that the tracks variable is initialized to zero.
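
The block-size and file-system-size combinations listed above for incident
3326533 all correspond to the same limit of 4G blocks. The sketch below only
reproduces that arithmetic. It assumes that "4G" means 2**32 and that the
units are binary (1 KB = 1024 bytes, 1 TB = 2**40 bytes); the function name
is hypothetical and is not part of VxFS.

```python
# Illustrative arithmetic only; names are hypothetical, not VxFS interfaces.
BLOCK_LIMIT = 2 ** 32  # "4G blocks", taken here as 2**32 (assumption)

def threshold_tb(block_size_kb: int) -> int:
    """Return the FS size in TB at which the block count reaches BLOCK_LIMIT."""
    size_bytes = BLOCK_LIMIT * block_size_kb * 1024
    return size_bytes // 2 ** 40

for kb in (1, 2, 4, 8):
    print(f"{kb} KB block size: affected when FS size > {threshold_tb(kb)} TB")
```

For a given block size, a file system larger than the printed threshold has
more than 4G blocks and therefore falls in the affected range.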

Patch ID: PHKL_43392, PHCO_43391

* 2834293 (Tracking ID: 2750860)

SYMPTOM:
On a large file system (4 TB or greater), the performance of the write(2)
operation with many small request sizes may degrade, and many threads may
be found sleeping with the following stack trace:
real_sleep
sleep_one
vx_sleep_lock
vx_lockmap
vx_getemap
vx_extfind
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_uplevel
vx_searchau
vx_extentalloc_device
vx_extentalloc
vx_te_bmap_alloc
vx_bmap_alloc_typed
vx_bmap_alloc
vx_write_alloc3
vx_recv_prealloc
vx_recv_rpc
vx_msg_recvreq
vx_msg_process_thread
kthread_daemon_startup

DESCRIPTION:
For a cluster-mounted file system, the free-extent-search algorithm is
not optimized for a large file system (4 TB or greater), or for instances
where the number of free Allocation Units (AUs) available can be very
large.

RESOLUTION:
The code is modified to optimize the free-extent-search algorithm by
skipping certain AUs. This reduces the overall search time.

* 2867635 (Tracking ID: 2867633)

SYMPTOM:
The internal noise test on a locally mounted file system hits a
"vx_delxwri_reclaim:1a" assert.

DESCRIPTION:
The vx_delxwri_reclaim() function is called from the vx_write_alloc()
function only if the fault flag is not set and the error type is one of
the following: no space, volume disabled, or quota exceeded. The assert
is hit because the wrong conditions are set when the vx_delxwri_reclaim()
function is called from the vx_write_alloc() function.

RESOLUTION:
The code is modified to correct the conditions under which the
vx_delxwri_reclaim() function is called from the vx_write_alloc()
function.

* 2870307 (Tracking ID: 2693010)

SYMPTOM:
The formatted uncompressed files created by the catman(1M) command for
Veritas File System (VxFS) man pages remain available even after the
removal of the VxFS package.

DESCRIPTION:
Currently, the formatted uncompressed files created by the catman(1M)
command for VxFS man pages are not removed during the removal/uninstall
of patches.

RESOLUTION:
The code is modified in the postremove and preinstall/postinstall
packaging scripts to clean up any remaining formatted uncompressed files
created by the catman(1M) command.

* 2904377 (Tracking ID: 2599590)

SYMPTOM:
Expansion of a 100% full file system may panic the machine with the
following stack trace:
bad_kern_reference()
$cold_vfault()
vm_hndlr()
bubbledown()
vx_logflush()
vx_log_sync1()
vx_log_sync()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
When a 100% full file system is expanded, the intent log of the file
system is truncated and the freed blocks are used during the expansion.
Due to a bug, the block map of the replica intent log inode was not
updated, which caused the block maps of the two inodes to differ. This
caused some of the in-core structures of the intent log to become NULL.
The machine panics while dereferencing one of these structures.

RESOLUTION:
The code is modified to update the block map of the replica intent log
inode correctly. A 100% full file system can now be expanded only if the
last extent in the intent log contains more than 32 blocks; otherwise,
fsadm fails. To expand such a file system, delete some of the files
manually and retry the resize.

* 2988680 (Tracking ID: 2988678)

SYMPTOM:
The manual page for the setext(1) command does not mention the license
requirement correctly.

DESCRIPTION:
A full license is not required for the setext(1) operation. This
information is not mentioned in the manual page.

RESOLUTION:
The manual page is modified to mention that a full license is not
required for the setext(1) operation.


INSTALLING THE PATCH
--------------------
To install the VxFS 5.0-MP2RP9 patch:

a) To install this patch on a CVM cluster, install it one system at a
   time so that all the nodes are not brought down simultaneously.

b) VxFS 5.0 (GA) must be installed before applying these patches.

c) To verify the VERITAS file system level, execute:

     # swlist -l product | egrep -i 'VRTSvxfs'
     VRTSvxfs  5.0.01.04  VERITAS File System

   Note: VRTSfsman is a corequisite for VRTSvxfs. So, VRTSfsman also
   needs to be installed with VRTSvxfs.

     # swlist -l product | egrep -i 'VRTS'
     VRTSvxfs   5.0.01.04  Veritas File System
     VRTSfsman  5.0.01.02  Veritas File System Manuals

d) All prerequisite/corequisite patches must be installed. The Kernel
   patch requires a system reboot for both installation and removal.

e) To install the patch, execute the following command:

     # swinstall -x autoreboot=true -s PHCO_43747 PHKL_43746 PHCO_43748

   If the patch is not registered, you can register it using the
   following command:

     # swreg -l depot

   The is the absolute path where the patch resides.


REMOVING THE PATCH
------------------
To remove the VxFS 5.0-MP2RP9 patch:

a) Execute the following command:

     # swremove -x autoreboot=true PHCO_43747 PHKL_43746 PHCO_43748


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE