* * * READ ME * * *
* * * Veritas File System 5.0 MP2 * * *
* * * Rolling Patch 11 * * *
Patch Date: 2014-12-01


This document provides the following information:

* PATCH NAME
* OPERATING SYSTEMS SUPPORTED BY THE PATCH
* PACKAGES AFFECTED BY THE PATCH
* BASE PRODUCT VERSIONS FOR THE PATCH
* SUMMARY OF INCIDENTS FIXED BY THE PATCH
* DETAILS OF INCIDENTS FIXED BY THE PATCH
* INSTALLATION PRE-REQUISITES
* INSTALLING THE PATCH
* REMOVING THE PATCH


PATCH NAME
----------
Veritas File System 5.0 MP2 Rolling Patch 11


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
HP-UX 11i v2 (11.23)


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
* Veritas File System 5.0 MP2
* Veritas Storage Foundation 5.0 MP2
* Veritas Storage Foundation Cluster File System 5.0 MP2
* Veritas Storage Foundation for Oracle 5.0 MP2
* Veritas Storage Foundation for Oracle RAC 5.0 MP2
* Veritas Storage Foundation HA 5.0 MP2


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: PHKL_44213
* 3650772 (1198384) The kernel may panic in the vx_lookup() function of VxFS while working on the Directory Name Lookup Cache (DNLC).
* 3650778 (529071) An ENOSPC error while processing the VX_IEDOTDOTRM (removal of dotdot entry) extended operation may result in the inode being marked as bad.

Patch ID: PHKL_44051
* 3023997 (2768505) While reading the Directory Name Lookup Cache (DNLC) entries using the pstat_getmpathname(2) function, the system might panic.
* 3131944 (3099638) The vxfs_ifree_timelag(5) tunable, when tuned, displays an incorrect minimum value.
* 3338084 (2175113) Internal noise test on Cluster File System hits the debug assert when the snapshot-related message is sent.
* 3428222 (1244756) A lookup operation on a VxFS file system may fail with a stack trace.
* 3502809 (2874172) Network File System (NFS) file creation thread might loop continuously with the stack trace.

Patch ID: PHKL_43746
* 2166258 (2129455) vxfsd is taking a lot of CPU time after deleting some directories.
* 2194615 (2178147) [VxFS] Link of IFSOC file does not call vx_dotdot_op, resulting in a corrupted inode.
* 2800280 (2670022) Duplicate file names can be seen in a directory.
* 2810121 (2316793) After removing files, the df command takes 10 seconds to complete.
* 3131809 (2966277) Systems with high file system activity like read/write/open/lookup may panic.
* 3131958 (2899907) On CFS, some file-system operations like the vxcompress utility and de-duplication fail to respond.
* 3276125 (3244613) The fsadm(1M) command hangs under I/O load on a regular VxFS file system and checkpoint.
* 3326533 (3259634) A Cluster File System having more than 4G blocks gets corrupted because the blocks containing some file system metadata get eliminated.

Patch ID: PHKL_43392
* 2834293 (2750860) Performance issue due to CFS fragmentation in CFS cluster.
* 2867635 (2867633) LM Noise.Fullfsck.N1 test hit an assert "vx_delxwri_reclaim:1a".
* 2904377 (2599590) Expanding or shrinking a DLV5 file system using the fsadm(1M) command causes a system panic.

Patch ID: PHKL_43107
* 2114163 (2091103) CFS hangs in cluster.
* 2551549 (2428964) Invoke "increase_tunable" without -i option in postinstall.
* 2564150 (2383225) Machine panics with the following message: Panic: "pfd_unlock: bad lock state!"
* 2587020 (2251223) df -h after removing files takes 10 seconds.
* 2587026 (2561739) Class perm changed to "rwx" after adding user ACL entry with null perm.
* 2587032 (2492304) File entry is displayed twice if find/ls run immediately after creation.
* 2800276 (2566875) A write(2) operation exceeding the quota limit fails with an EDQUOT error.
* 2800290 (2696067) VxFS 11.31/5.0: vx_daccess() does not observe GROUP_OBJ permissions.
* 2800329 (2680946) Panic in vx_itryhold+0x40/spinlock() due to NULL d_childp in DNLC.
* 2800330 (1092933) VxFS 4.1 reports a "FALSE" write success on ThP LUN.
* 2834283 (2740939) Unmounting of a file system can cause a Transfer of Control (TOC).
* 2834289 (2830513) ls hangs in vxglm:vxg_grant_sleep.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: PHKL_44213

* 3650772 (Tracking ID: 1198384)

SYMPTOM:
The kernel may panic in the vx_lookup() function while working on the Directory Name Lookup Cache (DNLC). The following stack trace is observed:
Spinlock()
vx_itryhold ()
vx_dnlc_lookup ()
vx_cbdnlc_lookup ()
vx_fast_lookup ()
vx_lookup ()

DESCRIPTION:
The panic occurs because the inode pointer in a DNLC entry becomes zero due to race conditions in accessing this pointer.

RESOLUTION:
The code is modified to redesign the DNLC so that a DNLC entry keeps the inode number instead of a pointer to the inode.

* 3650778 (Tracking ID: 529071)

SYMPTOM:
An ENOSPC error while processing the VX_IEDOTDOTRM (removal of dotdot entry) extended operation may result in the inode being marked as bad.

DESCRIPTION:
An ENOSPC error while processing the VX_IEDOTDOTRM (removal of dotdot entry) extended operation may result in the inode being marked as bad, and the file system being marked for fullfsck, because the new attribute inodes cannot be allocated on the already full file system.

RESOLUTION:
The code is modified so that a new attribute inode is not required while processing the VX_IEDOTDOTRM (removal of dotdot entry) extended operation.
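The resolution for incident 3650772 above stores an inode number in the cache entry rather than an inode pointer. The following Python sketch only illustrates that general idea and is not VxFS code: the names NameCache, inode_table, enter, and lookup are hypothetical. It shows how a cached number that is re-validated against a table turns a stale entry into a cache miss instead of an access to a freed object.

```python
# Illustrative sketch only; every name here is hypothetical, not a VxFS symbol.

class NameCache:
    """A name cache whose entries hold inode numbers, not object references."""

    def __init__(self, inode_table):
        # inode_table maps inode number -> live inode record (a dict here).
        self.inode_table = inode_table
        self.entries = {}  # (parent inode number, name) -> inode number

    def enter(self, parent_ino, name, ino):
        self.entries[(parent_ino, name)] = ino

    def lookup(self, parent_ino, name):
        key = (parent_ino, name)
        ino = self.entries.get(key)
        if ino is None:
            return None  # plain cache miss
        inode = self.inode_table.get(ino)
        if inode is None:
            # The inode was freed after the entry was cached: drop the
            # stale entry and report a miss instead of touching freed state.
            del self.entries[key]
            return None
        return inode


inode_table = {541: {"ino": 541, "type": "file"}}
cache = NameCache(inode_table)
cache.enter(2, "file1", 541)

hit = cache.lookup(2, "file1")   # inode still live
del inode_table[541]             # inode is freed elsewhere
miss = cache.lookup(2, "file1")  # stale entry becomes a miss
```

A real implementation would also have to guard against inode-number reuse, which this sketch ignores.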
Patch ID: PHKL_44051

* 3023997 (Tracking ID: 2768505)

SYMPTOM:
While reading the Directory Name Lookup Cache (DNLC) entries using the pstat_getmpathname(2) function, the system might panic and display the following stack trace:
vx_dnlc_appened_dnlcnodes
vx_dnlc_getentries
pstat_mpathname
$cold_pstat
syscall

DESCRIPTION:
While reading all the DNLC entries, the VxFS pointer is derived from the VNODE pointer. The VNODE pointer can be reused, and de-referencing the VxFS pointer might panic the system.

RESOLUTION:
The code is modified to pass the VxFS pointer directly to the function instead of deriving it from the VNODE pointer.

* 3131944 (Tracking ID: 3099638)

SYMPTOM:
When the vxfs_ifree_timelag(5) tunable is tuned, the following error message is displayed:
# kctune vxfs_ifree_timelag=400
ERROR: mesg 095: V-2-95: Setting vxfs_ifree_timelag to 450 since the specified value for vxfs_ifree_timelag is less than the recommended minimum value of 1035

DESCRIPTION:
In the vxfs_ifree_timelag(5) tunable man page, the minimum value is set to "None". The error message is displayed when the vxfs_ifree_timelag(5) tunable is set to a value which is less than 450. In the error message, a garbage value is displayed as the recommended minimum value. The error occurs because a single argument is passed for an error message that has two format specifiers.

RESOLUTION:
The code is modified to set the correct minimum value of the vxfs_ifree_timelag(5) tunable, and to display the correct error message.

* 3338084 (Tracking ID: 2175113)

SYMPTOM:
Internal noise test on Cluster File System (CFS) hits the debug assert when the snapshot-related message is sent.

DESCRIPTION:
The priority level of the message related to the snapshot conflicts with the recovery message. Since both messages are processed simultaneously and with the same priority level, the internal debug assert is triggered.

RESOLUTION:
The code is modified to change the priority level of the message related to the snapshot.
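Incident 3131944 above is a format-string argument mismatch: the message has two format specifiers but receives one argument. In C, a printf-style call like that reads an unpredictable value for the missing argument; Python raises an exception instead, which makes the mismatch easy to see. The sketch below is illustrative only. The message text is abbreviated, MIN_TIMELAG, MESSAGE, and format_warning are hypothetical names, and the value 450 is the threshold named in the description above.

```python
# Illustrative sketch only; names and the abbreviated message are hypothetical.

MIN_TIMELAG = 450  # threshold named in the incident description

# Two %d specifiers: the value being set and the recommended minimum.
MESSAGE = ("Setting vxfs_ifree_timelag to %d since the specified value "
           "is less than the recommended minimum value of %d")


def format_warning(new_value, minimum):
    """Supply one argument per format specifier."""
    return MESSAGE % (new_value, minimum)


# Passing a single argument for two specifiers is the bug pattern.
# Python refuses to format; C would print whatever happened to be there.
try:
    MESSAGE % (MIN_TIMELAG,)
    mismatch_detected = False
except TypeError:
    mismatch_detected = True

fixed = format_warning(MIN_TIMELAG, MIN_TIMELAG)
```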
* 3428222 (Tracking ID: 1244756)

SYMPTOM:
A lookup operation on a VxFS file system may fail with the following stack trace:
vx_cbdnlc_purge_iplist
vx_inode_free_list
vx_ifree_scan_list
vx_workitem_process
vx_worklist_process

DESCRIPTION:
Due to a race condition between the Directory Name Lookup Cache (DNLC) lookup and the DNLC get functions, there is an attempt to move a DNLC entry to the tail of the freelist in the lookup function when it has already been removed from the freelist by the DNLC get function. This leads to a null pointer de-reference.

RESOLUTION:
The code is modified to verify that the DNLC entry is present on the freelist before it is moved to the tail by the DNLC get function.

* 3502809 (Tracking ID: 2874172)

SYMPTOM:
Network File System (NFS) file creation thread might loop continuously with the following stack trace:
vx_getblk_cmn(inlined)
vx_getblk+0x3a0
vx_exh_allocblk+0x3c0
vx_exh_hashinit+0xa50
vx_dexh_create+0x240
vx_dexh_init+0x8b0
vx_do_create+0x1e0
vx_create1+0x1d0
vx_create0+0x270
vx_create+0x40
rfs3_create+0x420
common_dispatch+0xb40
rfs_dispatch+0x40
svc_getreq+0x250
svc_run+0x310
svc_do_run+0xd0
nfssys+0x6a0
hpnfs_nfssys+0x60
coerce_scall_args+0x130
syscall+0x590

DESCRIPTION:
The Veritas File System (VxFS) file creation vnode operation (VOP) routine expects the parent vnode to be a directory vnode pointer. But the NFS layer passes a stale file vnode pointer by default. This might cause unexpected results, such as a hang during VOP handling.

RESOLUTION:
The code is modified to check the vnode type of the parent vnode pointer at the beginning of the create VOP call and return an error if it is not a directory vnode pointer.

Patch ID: PHKL_43746

* 2166258 (Tracking ID: 2129455)

SYMPTOM:
Lots of VxFS threads are seen doing inactive processing.

DESCRIPTION:
There were two issues which can cause lots of VxFS threads to do inactive processing:
1. We used to spawn one inactive processing thread per inactive list.
On high-end machines, we could see lots of threads doing inactive processing.
2. vx_inactive_started was bumped wrongly in vx_icache_process() instead of vx_inactive_process(), which could cause lots of inactive processing threads in a corner case.

RESOLUTION:
For the first issue, the number of threads doing inactive processing at one time is changed to max(ncpu/2, 8). The second issue is fixed by bumping vx_inactive_started in vx_inactive_process().

* 2194615 (Tracking ID: 2178147)

SYMPTOM:
If a socket file is removed, the file system is marked for a full fsck(1M) operation. The following error message is displayed in the system log:
vmunix: vxfs: WARNING: msgcnt 1 mesg 087: V-2-87: vx_dotdot_manipulate: - / file system 2437 inode 541 dotdot inode error

DESCRIPTION:
During the socket file creation, the attribute inode for the parent directory is not updated. Hence, the error occurs when the socket file is removed.

RESOLUTION:
The code is modified to update the socket file linkage during creation, thus avoiding the error message.

* 2800280 (Tracking ID: 2670022)

SYMPTOM:
Duplicate file names can be seen in a directory.

DESCRIPTION:
Veritas File System (VxFS) maintains an internal Directory Name Lookup Cache (DNLC) to improve the performance of directory lookups. A race condition occurs in the DNLC list manipulation code during lookup/creation of file names that have more than 32 characters (which further affects other file creations). This causes the DNLC to have a stale entry for an existing file in the directory. A lookup of such a file through the DNLC does not find the file and allows another duplicate file with the same name in the directory.

RESOLUTION:
The code is modified to fix the race condition by protecting the DNLC lists through proper locks.

* 2810121 (Tracking ID: 2316793)

SYMPTOM:
After removing the files in a file system, the df(1M) command, which uses the statfs(2) function, may take 10 seconds to complete.
DESCRIPTION:
To obtain an up-to-date and valid free block count in a file system, a delay and retry loop delays for one second and retries 10 times. This excessive retrying causes a 10 second delay per file system while executing the df(1M) command.

RESOLUTION:
The code is modified to reduce the original 10 retries with a one second delay each to one retry after a 20 millisecond delay.

* 3131809 (Tracking ID: 2966277)

SYMPTOM:
Systems with high file-system activity like read/write/open/lookup may panic with the following stack trace due to a rare race condition:
spinlock+0x21 ( )
-> vx_rwsleep_unlock()
vx_ipunlock+0x40()
vx_inactive_remove+0x530()
vx_inactive_tran+0x450()
vx_local_inactive_list+0x30()
vx_inactive_list+0x420()
-> vx_workitem_process()
-> vx_worklist_process()
vx_worklist_thread+0x2f0()
kthread_daemon_startup+0x90()

DESCRIPTION:
ILOCK is released before doing an IPUNLOCK, which causes a race condition. This results in a panic when an inode that has been set free is accessed.

RESOLUTION:
The code is modified so that the ILOCK is used to protect the inode's memory from being set free while the memory is being accessed.

* 3131958 (Tracking ID: 2899907)

SYMPTOM:
Some file-system operations on a Cluster File System (CFS) may hang with the following stack trace:
vxg_svar_sleep_unlock
vxg_grant_sleep
vxg_cmn_lock
vxg_api_lock
vx_glm_lock
vx_mdele_hold
vx_extfree1
vx_exttrunc
vx_trunc_ext4
vx_trunc_tran2
vx_trunc_tran
vx_cfs_trunc
vx_trunc
vx_inactive_remove
vx_inactive_tran
vx_cinactive_list
vx_workitem_process
vx_worklist_process
vx_worklist_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
In CFS, a node can lock a mdelelock for an extent map while holding a mdelelock for a different extent map. This can result in a deadlock between different nodes in the cluster.

RESOLUTION:
The code is modified to prevent the deadlock between different nodes in the cluster.
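For incident 2810121 above, the worst-case delay added by the retry loop follows directly from the retry count and the per-retry delay. The Python sketch below is a hedged illustration of that arithmetic, not the VxFS implementation: read_free_blocks, probe, and the convention that probe returns None while the count is not yet valid are all assumptions made for the example. The sleep function is injected so the example does not actually wait.

```python
# Illustrative sketch only; the function names and the "None means not yet
# valid" convention are assumptions, not VxFS interfaces.

def read_free_blocks(probe, retries, delay, sleep):
    """Call probe() once, then retry up to `retries` times with `delay`
    seconds between attempts. Returns (value, total seconds waited)."""
    waited = 0.0
    value = probe()
    for _ in range(retries):
        if value is not None:
            break
        sleep(delay)
        waited += delay
        value = probe()
    return value, waited


def never_valid():
    # Worst case: the free block count never becomes valid.
    return None


def no_sleep(seconds):
    # Stand-in for time.sleep so the example runs instantly.
    return None


# Behaviour described before the fix: 10 retries, 1 second each.
_, old_wait = read_free_blocks(never_valid, retries=10, delay=1.0, sleep=no_sleep)

# Behaviour described after the fix: 1 retry after 20 milliseconds.
_, new_wait = read_free_blocks(never_valid, retries=1, delay=0.020, sleep=no_sleep)
```

Under these assumptions the worst case drops from 10 seconds to 20 milliseconds per file system.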
* 3276125 (Tracking ID: 3244613)

SYMPTOM:
A file-system extent operation using the fsadm(1M) command may hang with the following stack trace:
vx_event_wait(inlined)
vx_delay2+0x2a0
cold_vx_active_common_flush+0x80
vx_close+0x70
vn_close(inlined)
vno_close+0xe0
closef(inlined)

DESCRIPTION:
During a resize operation, the fsadm(1M) command freezes the file system. In an error case, the fsadm(1M) command exits without thawing the file system. This results in a hang.

RESOLUTION:
The code is modified to thaw the file system before the fsadm(1M) command exits in the error case.

* 3326533 (Tracking ID: 3259634)

SYMPTOM:
In CFS, each node that has the file system cluster mounted has its own intent-log in the file system. A cluster file system that has more than 4,294,967,296 file system blocks can zero out an incorrect location due to an incorrect typecasting. For example, 65536 file system blocks at a block offset of 1,537,474,560 [fs blocks] can be incorrectly zeroed out using an 8 KB fs block size and an intent-log of size 65536 fs blocks. This issue can only occur if an intent-log is located above an offset of 4,294,967,296 file system blocks. This situation can occur when adding a new node to the cluster and mounting an additional CFS secondary for the first time, which needs to create and zero a new intent-log. This situation can also be triggered if the file system or intent log is resized and an intent-log needs to be cleared.
The problem occurs only with the following combinations of file system (FS) block size and FS size:
1 KB block size and FS size > 4 TB
2 KB block size and FS size > 8 TB
4 KB block size and FS size > 16 TB
8 KB block size and FS size > 32 TB

The full fsck flag is set on the file system, and the message log can contain messages of the following type:
2013 Apr 17 14:52:22 sfsys kernel: vxfs: msgcnt 5 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/sfsdg/vol1 file system fullfsck flag set - vx_ierror
2013 Apr 17 14:52:22 sfsys kernel: vxfs: msgcnt 6 mesg 017: V-2-17: vx_attr_iget - /dev/vx/dsk/sfsdg/vol1 file system inode 13675215 marked bad incore
2013 Jul 17 07:41:22 sfsys kernel: vxfs: msgcnt 47 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/sfsdg/vol1 file system fullfsck flag set - vx_ierror
2013 Jul 17 07:41:22 sfsys kernel: vxfs: msgcnt 48 mesg 017: V-2-17: vx_dirbread - /dev/vx/dsk/sfsdg/vol1 file system inode 55010476 marked bad incore

DESCRIPTION:
In CFS, each node that has the file system cluster mounted has its own intent-log in the file system. An intent-log is created when an additional node mounts the file system as a CFS Secondary. Note that intent-logs are never removed; they are reused. While clearing an intent log, an incorrect block number is passed to the log clearing routine, resulting in zeroing out an incorrect location. The incorrect location might point to file data or file system metadata, or it might be part of the file system's available free space. This is silent corruption. If file system metadata is corrupted, it will be detected when the corrupt metadata is subsequently accessed, and the file system will be marked for full fsck.

RESOLUTION:
The code is modified so that the correct block number is passed to the log clearing routine.
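The size thresholds and the wrong zeroing location in incident 3259634 above can be checked with a little arithmetic. The Python sketch below assumes that the "incorrect typecasting" keeps only the low 32 bits of the block number; the source does not state the exact cast, so this is an assumption. The names truncate_to_32_bits, true_block, and threshold_tib are hypothetical, and true_block is an invented value chosen so that its truncation matches the 1,537,474,560 offset quoted in the symptom.

```python
# Illustrative sketch only. Assumption: the faulty cast keeps the low 32 bits
# of a 64-bit block number. All names are hypothetical.

BLOCKS_32BIT = 1 << 32  # 4,294,967,296 file system blocks


def truncate_to_32_bits(block_number):
    """Model a cast from a 64-bit block number to an unsigned 32-bit one."""
    return block_number & 0xFFFFFFFF


# Hypothetical intent-log location above the 4,294,967,296 block mark.
true_block = BLOCKS_32BIT + 1_537_474_560
wrong_block = truncate_to_32_bits(true_block)  # where zeroing would land


def threshold_tib(block_size_kib):
    """File system size (in TiB) at which block numbers exceed 32 bits."""
    return BLOCKS_32BIT * block_size_kib * 1024 // (1 << 40)


# Block size in KiB -> size threshold in TiB.
thresholds = {kib: threshold_tib(kib) for kib in (1, 2, 4, 8)}
```

Under this assumption, thresholds evaluates to 4, 8, 16, and 32 for block sizes of 1, 2, 4, and 8 KB, which agrees with the combinations listed above, and a block number below 4,294,967,296 passes through the cast unchanged.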
Patch ID: PHKL_43392

* 2834293 (Tracking ID: 2750860)

SYMPTOM:
On a large file system (4 TB or greater), the performance of the write(2) operation with many small request sizes may degrade, and many threads may be found sleeping with the following stack trace:
real_sleep
sleep_one
vx_sleep_lock
vx_lockmap
vx_getemap
vx_extfind
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_uplevel
vx_searchau
vx_extentalloc_device
vx_extentalloc
vx_te_bmap_alloc
vx_bmap_alloc_typed
vx_bmap_alloc
vx_write_alloc3
vx_recv_prealloc
vx_recv_rpc
vx_msg_recvreq
vx_msg_process_thread
kthread_daemon_startup

DESCRIPTION:
For a cluster-mounted file system, the free-extent-search algorithm is not optimized for a large file system (4 TB or greater), or for instances where the number of free Allocation Units (AUs) available can be very large.

RESOLUTION:
The code is modified to optimize the free-extent-search algorithm by skipping certain AUs. This reduces the overall search time.

* 2867635 (Tracking ID: 2867633)

SYMPTOM:
The internal noise test on a locally mounted file system hits a "vx_delxwri_reclaim:1a" assert.

DESCRIPTION:
The vx_delxwri_reclaim() function should be called from the vx_write_alloc() function only if the fault flag is not set and the error type is one of the following: no space, volume disabled, or quota exceeded. The assert is hit because the wrong conditions are set when the vx_delxwri_reclaim() function is called from the vx_write_alloc() function.

RESOLUTION:
The code is modified to correct the conditions under which the vx_delxwri_reclaim() function is called from the vx_write_alloc() function.

* 2904377 (Tracking ID: 2599590)

SYMPTOM:
Expansion of a 100% full file system may panic the machine with the following stack trace.
bad_kern_reference()
$cold_vfault()
vm_hndlr()
bubbledown()
vx_logflush()
vx_log_sync1()
vx_log_sync()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
When a 100% full file system is expanded, the intent log of the file system is truncated and the blocks freed up are used during the expansion. Due to a bug, the block map of the replica intent log inode was not getting updated, thereby causing the block maps of the two inodes to differ. This caused some of the in-core structures of the intent log to go NULL. The machine panics while de-referencing one of these structures.

RESOLUTION:
The block map of the replica intent log inode is now updated correctly. A 100% full file system can now be expanded only if the last extent in the intent log contains more than 32 blocks; otherwise fsadm will fail. To expand such a file system, some of the files should be deleted manually and the resize retried.

Patch ID: PHKL_43107

* 2114163 (Tracking ID: 2091103)

SYMPTOM:
CFS hangs, with a thread that appears to be looping while allocating space for a file.

DESCRIPTION:
Currently, if the vx_searchau() function receives a VX_ERETRY error, it keeps on retrying indefinitely, resulting in a hang.

RESOLUTION:
The code is changed to limit the number of retries to 3.

* 2551549 (Tracking ID: 2428964)

SYMPTOM:
The value of the kernel tunable max_thread_proc gets incremented by 1 after every software maintenance activity (install, remove, etc.) on the VRTSvxfs package.

DESCRIPTION:
In the postinstall script for the VRTSvxfs package, the value of the kernel tunable max_thread_proc is wrongly incremented by 1.

RESOLUTION:
The increment operation on the max_thread_proc tunable is removed from the postinstall script.

* 2564150 (Tracking ID: 2383225)

SYMPTOM:
During internal testing of write operations using direct I/O, the system panics with the panic string "pfd_unlock: bad lock state!"
and the following stack is displayed:
vx_dio_rdwri+0xdc
vx_write_direct+0x2ec
vx_write1+0x13a8
vx_rdwr+0xa88
vno_rw+0x64
rwuio+0x11c
aio_rw_child_thread+0x178
aio_exec_req_thread+0x258

DESCRIPTION:
The routine used to lock the user buffer while performing a Direct I/O does not handle the ENOSPC error correctly and passes an incorrect return value. This leads to the retrial of the I/O and an invalid User I/O structure, resulting in the panic.

RESOLUTION:
The code is modified so that the routine is updated to handle the ENOSPC errors.

* 2587020 (Tracking ID: 2251223)

SYMPTOM:
The df(1M) command with the -h option takes 10 seconds to execute and reports an inaccurate free block count shortly after a large number of files are removed.

DESCRIPTION:
When removing files, some of the file data blocks are released and counted in the total free block count instantly. However, all the blocks may not be freed immediately, as Veritas File System (VxFS) can sometimes delay the releasing of blocks. Therefore, the displayed free block count, at any given time, is the total of the free blocks and the 'delayed' free blocks. Once a 'file remove' transaction is done, its 'delayed' free blocks are eliminated and the free block count increases accordingly. However, some functions which process certain transactions, for example a metadata update, can also alter the free block count, but ignore the current 'delayed' free blocks. As a result, if the 'file remove' transactions have not finished updating their free blocks and their 'delayed' free blocks information, the free space count can occasionally show more than the real disk space. Therefore, to obtain an up-to-date and valid free block count for a file system, a delay and retry loop delays 1 second before each retry and loops 10 times before giving up. Thus, the df(1M) command with the -h option sometimes takes 10 seconds to execute.
But even if the file system waits for 10 seconds, there is no guarantee that the output displayed will be accurate or valid.

RESOLUTION:
The code is modified so that the delayed free block count is recalculated accurately when transactions are created and metadata is flushed to the disk.

* 2587026 (Tracking ID: 2561739)

SYMPTOM:
When a file is created and its parent directory has a default ACL entry, that entry is not taken into account when calculating the class entry of the file. When a separate dummy entry is added, the default entry from the parent is taken into account as well. For example:
$ getacl .
# file: .
# owner: root
# group: sys
user::rwx
group::rwx
class:rwx
other:rwx
default:user:user_one:r-x
$ touch file1
$ getacl file1
# file: try1
# owner: root
# group: sys
user::rw-
user:user_one:r-x
group::rw-
class:rw- <------
other:rw-
The class entry here should be rwx.

DESCRIPTION:
The default entry of the parent was not taken into account. A newly created file shares the attribute inode with its parent, and no new attribute inode is created for it. But when an ACL entry is explicitly made, a separate attribute inode is created, so the default entry also gets copied into the new inode and is taken into consideration while returning the class entry of the file.

RESOLUTION:
The class entry is now recalculated, considering all the entries, before the ACL entry buffer is returned.

* 2587032 (Tracking ID: 2492304)

SYMPTOM:
The "find" command displays duplicate directory entries.

DESCRIPTION:
Whenever the directory entries can fit in the inode's immediate area, VxFS does not allocate new directory blocks. As new directory entries get added to the directory, this immediate area gets filled and all the directory entries are then moved to a newly allocated directory block. The directory blocks have space reserved at the start of the block to hold the block hash information, which is used for fast lookup of entries in that block.
A directory entry that was at an offset of, say, x bytes in the inode's immediate area will be at (x + y) bytes when moved to the directory block, where y is the size of the block hash. During this transition phase from the immediate area to directory blocks, a readdir() can report a directory entry more than once.

RESOLUTION:
Directory entry offsets returned to the "readdir" call are adjusted so that when the entries move to a new block, they will be at the same offsets.

* 2800276 (Tracking ID: 2566875)

SYMPTOM:
The write(2) operation exceeding the quota limit fails with an EDQUOT error ("Disc quota exceeded") before the user quota limit is reached.

DESCRIPTION:
When a write request exceeds a quota limit, the EDQUOT error should be handled so that Veritas File System (VxFS) can allocate space up to the hard quota limit to proceed with a partial write. However, VxFS does not handle this error, and an error is returned without performing a partial write.

RESOLUTION:
The code is modified to handle the EDQUOT error from the extent allocation routine.

* 2800290 (Tracking ID: 2696067)

SYMPTOM:
When a getaccess() command is issued on a file which inherits the default Access Control List (ACL) entries from the parent, it shows incorrect group object permissions.

DESCRIPTION:
If a newly created file leverages the ACL entries of its parent directory, the vx_daccess() function does not fabricate any GROUP_OBJ entry, unlike the vx_do_getacl() function.

RESOLUTION:
The code is modified to fabricate a GROUP_OBJ entry.
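The partial-write behaviour described for incident 2800276 above can be sketched as a simple calculation. This is a hedged illustration, not VxFS code: bytes_to_write and its parameters are hypothetical names, and quotas are modelled in bytes for simplicity even though a real file system accounts in blocks.

```python
# Illustrative sketch only; names are hypothetical and quota accounting is
# simplified to bytes.

def bytes_to_write(requested, used, hard_limit):
    """Return how much of a write can proceed without passing the hard
    quota limit. A result smaller than `requested` is a partial write;
    zero means the limit is already reached."""
    remaining = max(hard_limit - used, 0)
    return min(requested, remaining)


# Request crosses the limit: only the remaining space is written.
partial = bytes_to_write(requested=4096, used=9000, hard_limit=10240)

# Request fits entirely below the limit.
full = bytes_to_write(requested=1024, used=0, hard_limit=10240)

# Limit already reached: nothing can be written.
none_left = bytes_to_write(requested=1024, used=10240, hard_limit=10240)
```

In this model only the last case, where nothing can be written, corresponds to failing the whole request with EDQUOT.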
* 2800329 (Tracking ID: 2680946)

SYMPTOM:
The ls(1), find(1), or other lookup operations can trigger a panic with the following stack trace:
spinlock+0x40
vx_itryhold+0x40
vx_dnlc_lookup+0x1b0
vx_cbdnlc_lookup+0x130
vx_fast_lookup+0x120
vx_lookup+0x3c0
lookuppnvp+0x2d0
lookuppn+0x90
lookupname+0x60
vn_open+0xa0
copen+0x170
open+0x80
syscall+0x920

DESCRIPTION:
The panic is triggered because of a NULL pointer de-reference while inserting an entry which has a NULL child pointer in the Directory Name Lookup Cache (DNLC).

RESOLUTION:
The code is modified to include a global variable which is incremented when a DNLC entry with a NULL child pointer gets inserted. Preventive code is added to avoid the panic, and such occurrences are tracked using the global variable.

* 2800330 (Tracking ID: 1092933)

SYMPTOM:
The system may panic in the vx_fsync_chains() function when it tries to sleep in the interrupt context. The following stack trace is displayed:
vx_event_wait
vx_delay2
vx_fsync_chains
vx_disable
vx_dataioerr
vx_pageio_done

DESCRIPTION:
While handling the external interrupt, the vx_pageio_done() function calls the vx_fsync_chains() function. The vx_fsync_chains() function may sleep during the execution. The vx_fsync_chains() function is required in the case of input/output (I/O) errors. It is called at a couple of places when the I/O strategy fails. But the error variable is overwritten improperly.

RESOLUTION:
The code is modified to save I/O errors so that the vx_fsync_chains() function can be called, and the call to vx_fsync_chains() is removed from vx_disable().

* 2834283 (Tracking ID: 2740939)

SYMPTOM:
The unmounting of a file system can cause a Transfer of Control (TOC) on an HP-UX 11.23 Serviceguard environment with many threads.
The following stack trace is displayed:
spinunlock()
vx_worklist_process()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
The TOC is caused by heavy spin-lock contention on a Veritas File System (VxFS) spin-lock. During the unmount of a VxFS file system, worker threads are created to scan the inode cache and remove the inodes related to the mount point. The number of worker threads is calculated based on the number of hash-lists in the inode cache. The number of hash-lists in the inode cache is calculated based on the system memory rather than the tuned vxfs_ninode value. If the system memory is huge and the tuned vxfs_ninode value is small, then the number of hash-lists in the inode cache is unnecessarily large, which can result in spin-lock contention during the unmount.

RESOLUTION:
The code is modified to calculate the number of hash-lists based on the tuned vxfs_ninode value.

* 2834289 (Tracking ID: 2830513)

SYMPTOM:
The Cluster File System (CFS) hangs while performing file removal operations and the following stack trace is displayed:
vxglm::vxg_grant_sleep+0x110 ()
vxglm::vxg_cmn_lock+0x5a0 ()
vxglm::vxg_api_lock+0x310 ()
vx_glm_lock+0x70 ()
vx_mdelelock+0x70 ()
vx_mdele_hold+0xe0 ()
vx_extfree+0x700 ()

DESCRIPTION:
The CFS hangs due to a missing unlock call for the file removal operations.

RESOLUTION:
The code is modified to unlock the file removal operations.


INSTALLING THE PATCH
--------------------
To install the VxFS 5.0-MP2RP11 patch:

a) To install this patch on a CVM cluster, install it one system at a time so that all the nodes are not brought down simultaneously.

b) The VxFS 5.0 (GA) must be installed before applying these patches.

c) To verify the VERITAS file system level, execute:

# swlist -l product | egrep -i 'VRTSvxfs'
VRTSvxfs 5.0.01.04 VERITAS File System

Note: VRTSfsman is a corequisite for VRTSvxfs. So, VRTSfsman also needs to be installed with VRTSvxfs.

# swlist -l product | egrep -i 'VRTS'
VRTSvxfs 5.0.01.04 Veritas File System
VRTSfsman 5.0.01.02 Veritas File System Manuals

d) All prerequisite/corequisite patches must be installed. The Kernel patch requires a system reboot for both installation and removal.

e) To install the patch, execute the following command:

# swinstall -x autoreboot=true -s PHKL_44213

If the patch is not registered, you can register it using the following command:

# swreg -l depot

The is the absolute path where the patch resides.


REMOVING THE PATCH
------------------
To remove the VxFS 5.0-MP2RP11 patch:

a) Execute the following command:

# swremove -x autoreboot=true PHKL_44213


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE