* * * READ ME * * * * * * Veritas File System 5.0.1RP3P8 * * * * * * P-patch * * * Patch Date: 2013-03-13 This document provides the following information: * PATCH NAME * OPERATING SYSTEMS SUPPORTED BY THE PATCH * PACKAGES AFFECTED BY THE PATCH * BASE PRODUCT VERSIONS FOR THE PATCH * SUMMARY OF INCIDENTS FIXED BY THE PATCH * DETAILS OF INCIDENTS FIXED BY THE PATCH * INSTALLATION PRE-REQUISITES * INSTALLING THE PATCH * REMOVING THE PATCH PATCH NAME ---------- Veritas File System 5.0.1RP3P8 P-patch OPERATING SYSTEMS SUPPORTED BY THE PATCH ---------------------------------------- HP-UX 11i v3 (11.31) PACKAGES AFFECTED BY THE PATCH ------------------------------ VRTSvxfs VRTSvxfs BASE PRODUCT VERSIONS FOR THE PATCH ----------------------------------- * Veritas File System 5.0.1 SUMMARY OF INCIDENTS FIXED BY THE PATCH --------------------------------------- Patch ID: PHKL_43475, PHCO_43476 * 2984718 (2970219) Panic in fcache_as_map+0x70 due to null v_vmdata. * 3023946 (2616622) The performance of the mmap() function is slow when the file system block size is 8KB and the page size is 4KB. * 3023953 (2750860) Performance issue due to CFS fragmentation in CFS cluster * 3024008 (2858683) Reserve extent attributes changed after vxrestore, only for files greater than 8192bytes * 3039824 (3031226) Panic in vx_dnlc_getpathname during SRP testing * 3046923 (3046920) "Panic string : wait_for_lock: Already owns lock: vn_h_sl_pool" * 3069179 (2966277) VX_IPUNLOCK panic at vx_inactive_remove() as inode already freed * 3069181 (3010444) On a NFS filesystem cksum(1m) fails with cksum: read error on : Bad address * 3069189 (3049408) vx_bcrecycle_timelag_factor causes perf problem. * 3069236 (2874172) Infinite looping in vx_exh_hashinit() * 3069242 (2964018) VxFS 11.31/5.0 : statfsdev causes heavy spinlock contention on snode_table_lock * 3069265 (2830513) ls hangs in vxglm:vxg_grant_sleep. 
* 3092230 (2439261) When vx_fiostats_tunable is changed from zero to non-zero, the system panics Patch ID: PHKL_43260, PHCO_43261 * 1935624 (1903977) On a Cluster File System (CFS) the write operation may fail * 1935635 (1742707) switchout fsck needs to be invoked for CFS with 2 separate args: "-o" and "mounted" * 1954685 (1934537) Panic in vx_free() due to NULL pointer * 2798208 (2767579) The system hangs during a lookup operation * 2822988 (2822984) extendfs(1m) command fails for volumes greater than 2Tb * 2852520 (2850738) Allocating memory with NOWAIT in callback routine during low memory condition. Patch ID: PHKL_43062 * 2036217 (2019793) panic in vx_set_tunefs when starting local zone * 2043627 (2028782) Q_SETQUOTA does not set current usage using quotactl API * 2410793 (1466351) mount hang looping in vx_bc_binval_cookie() * 2429335 (2337470) In the process of shrink fs, the fs out of inodes, fs version is 5.0MP4HF* * 2730965 (2730759) poor sequential read performance * 2796940 (2599590) Expanding or shrinking a DLV5 file system using the fsadm(1M)command causes a system panic. * 2798203 (2316793) After removing files df command takes 10 seconds to complete * 2806468 (2806466) "fsadm -R" resulting in panic at LVM layer due to vx_ts.ts_length set to 2GB. * 2831287 (1703223) run_fsck : vxupgrade First full fsck failed, exiting. * 2832560 (2829708) memory leak in worklist code path * 2847808 (2845175) vx_do_getacl() panic. Patch ID: PHKL_42892 * 1946124 (1797955) In some cases we miss calling vx_metaioerr() when fm_badwrite is set. 
* 1995390 (1985626) [VxFS][411-295-064][AGF Asset Management] Panic in VxFS due to NULL vx_inode i_fsext pointer * 2036843 (2026799) [VxFS][411-720-615][Fujitsu] fsppadm enforce hang on DST with FCL * 2084004 (2040647) [P2][412-023-546][CITI GB GBP OPERATING ORG] CFS does not enforce vxquota hard-limit * 2092072 (2029556) [VxFS][411-885-855][Barclays Capital] Two panics in mutex_exit: not owner * 2194629 (2161379) [VxFS][412-413-096][Allstate] repeated hangs in vx_event_wait() * 2222508 (2192895) VxFS 5.0MP3RP4 Panic while set/get acls - possible race condition * 2370061 (2370046) readahead with read nstream misses early blocks * 2720002 (1396859) global lock for DIO buffer header list has contention on very large systems * 2722869 (1244756) VxFS 5.0MP1 panic in vx_dnlc_purge_ip() due to deref a NULL d_flhead pointer * 2722958 (2696067) VxFS 11.31/5.0 : vx_daccess() does not observe GROUP_OBJ permissions * 2726001 (2371710) user quota information corrupts * 2730957 (2651922) Performance degradation of 'll' and high SYS% CPU in vx_ireuse() * 2733704 (2421901) LM "stress.bl1" test hit an assert "f:vx_getinoquota:1a" DETAILS OF INCIDENTS FIXED BY THE PATCH --------------------------------------- This patch fixes the following Symantec incidents: Patch ID: PHKL_43475, PHCO_43476 * 2984718 (Tracking ID: 2970219) SYMPTOM: When CPUs are added to the system, the system may panic with the following stack trace: fcache_as_map+0x70 () vx_fcache_map+0x1d0 () vx_write_default+0x340 () vx_write1+0xea0 () vx_rdwr+0x1130 () rfs3_write+0x5b0 () common_dispatch+0xc10 () rfs_dispatch+0x40 () svc_getreq+0x250 () svc_run+0x310 () svc_do_run+0xd0 () nfssys+0x7c0 () hpnfs_nfssys+0x60 () coerce_scall_args+0x130 () syscall+0x590 () DESCRIPTION: The issue occurs because of the race condition between the vnode-map initialization and deinitialization. 
RESOLUTION: The code is modified to add debug messages that will confirm whether a race condition exists between the vnode-map initialization and deinitialization. The debug messages will help gather information if the problem occurs again. * 3023946 (Tracking ID: 2616622) SYMPTOM: The performance of the mmap() function is slow when the file system block size is 8 KB and the page size is 4 KB. DESCRIPTION: When the file system block size is 8 KB, the page size is 4 KB, and the mmap() function is performed on an 8 KB file, the file gets represented in memory as two pages (0 and 1). When the memory at offset 0 in the mapping is modified, a page fault occurs for page 0 in the file. When that disk block is allocated and marked as valid, the page mentioned in the fault request is expected to get flushed out to the disk and therefore, it is left uninitialized on the disk by default. Only that particular page is cleaned in memory and left modified so that it is known that the data in memory is more recent than the data on disk. However, the other half of the block (which could eventually be mapped to page 1) gets cleared with a synchronous write because such a fault may not occur. This synchronous clearing of the other half of the 8 KB block causes performance degradation. RESOLUTION: The code is modified to expand the range of the fault to cover the entire 8 KB block. The message from the OS asking for only one page is ignored and two pages are given to cover the entire file system block, which saves the separate synchronous clearing of the other half of the 8 KB block.
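The range expansion described for incident 3023946 can be illustrated with a short sketch. This is not the VxFS code: the function name expand_fault_range and its parameters are hypothetical, and the sketch only shows the arithmetic of widening a one-page fault to the whole file system block that contains it.

```python
def expand_fault_range(fault_offset, page_size=4096, block_size=8192):
    """Widen a page fault to the whole file system block that contains it.

    Hypothetical helper for illustration only. Returns a tuple of
    (block start offset, length in bytes, list of page indices covered).
    """
    if block_size % page_size != 0:
        raise ValueError("block size must be a multiple of the page size")
    # Round the faulting offset down to the start of its file system block.
    start = (fault_offset // block_size) * block_size
    # Every page inside that block is supplied, not just the faulting one.
    first_page = start // page_size
    pages = list(range(first_page, first_page + block_size // page_size))
    return start, block_size, pages


# A fault on page 0 or page 1 of an 8 KB file covers both pages.
print(expand_fault_range(0))
print(expand_fault_range(4096))
```

With an 8 KB block and a 4 KB page, either fault yields pages 0 and 1 together, so no separate synchronous clearing of the other half of the block is needed.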
* 3023953 (Tracking ID: 2750860) SYMPTOM: On a large file system (4TB or greater), the performance of the write(2) operation with many small request sizes may degrade, and many threads may be found sleeping with the following stack trace: real_sleep sleep_one vx_sleep_lock vx_lockmap vx_getemap vx_extfind vx_searchau_downlevel vx_searchau_downlevel vx_searchau_downlevel vx_searchau_downlevel vx_searchau_uplevel vx_searchau vx_extentalloc_device vx_extentalloc vx_te_bmap_alloc vx_bmap_alloc_typed vx_bmap_alloc vx_write_alloc3 vx_recv_prealloc vx_recv_rpc vx_msg_recvreq vx_msg_process_thread kthread_daemon_startup DESCRIPTION: For a cluster-mounted file system, the free-extent-search algorithm is not optimized for a large file system (4TB or greater), or for instances where the number of free Allocation Units (AUs) available can be very large. RESOLUTION: The code is modified to optimize the free-extent-search algorithm by skipping certain AUs. This reduces the overall search time. * 3024008 (Tracking ID: 2858683) SYMPTOM: The reserve-extent attributes are changed after the vxrestore(1M) operation, for files that are greater than 8192 bytes. DESCRIPTION: A local variable that contains the number of reserve bytes is reused during the vxrestore(1M) operation, for a further VX_SETEXT ioctl call for files that are greater than 8192 bytes. As a result, the attribute information is changed. RESOLUTION: The code is modified to preserve the original variable value until the end of the function.
* 3039824 (Tracking ID: 3031226) SYMPTOM: During internal SRP testing, the system panics with the following stack trace: vx_dnlc_getpathname+0xa10 pfs_vop_dnlc_getpathname+0x68 secfs_dnlc_getpathname+0x10 vfs_stack_vop_dnlc_getpathname+0x6c audit_get_pathname_from_dnlc+0x1a0 audit_clean_path+0x114 audit_build_full_dir_name+0x90 change_p_cdir+0x2c ncf_srp_chdir+0x7c chdir+0x94 syscall+0x318 $syscallrtn+0x0 DESCRIPTION: The panic occurs due to dereferencing the DNLC entry which is set as NULL in the vx_dnlc_getpathname() function. RESOLUTION: The code is modified in the vx_dnlc_getpathname() function to check the validity of the DNLC entry before dereferencing it. * 3046923 (Tracking ID: 3046920) SYMPTOM: The system may panic with the following panic string "wait_for_lock: Already owns lock: vn_h_sl_pool". The panic stack observed is as follows: panic+0x410 wait_for_lock+0xa60 vfs_stack_lock_vp+0x90 vfs_teardown_stack+0x30 vx_inode.c:8133 vx_inactive(inlined) vx_vn_inactive+0xd70 pfs_vop_inactive+0xc0 sec_file_rules:secfs_inactive+0x30 vfs_stack_vop_inactive+0xb0 $cold_vn_rele_inactive+0x10 unmapvnode+0x450 dispreg+0x5c0 exit_post_single_threaded_notify+0x2d0 exit+0x4e0 pm_issig.c:1797 psig_core(inlined) $cold_psig+0x2a0 hl_ivt.c:597 post_hndlr(inlined) vm_hndlr+0x840 bubbleup+0x880 DESCRIPTION: An error occurs when the lock is held at the VxFS layer while vfs_teardown_stack is called to destroy the vnode. Hence, the system panics. RESOLUTION: The code is modified to release the lock held at the VxFS layer before vfs_teardown_stack is called.
* 3069179 (Tracking ID: 2966277) SYMPTOM: Systems with high file-system activity like read/write/open/lookup may panic with the following stack trace due to a rare race condition: spinlock+0x21 ( ) -> vx_rwsleep_unlock() vx_ipunlock+0x40() vx_inactive_remove+0x530() vx_inactive_tran+0x450() vx_local_inactive_list+0x30() vx_inactive_list+0x420() -> vx_workitem_process() -> vx_worklist_process() vx_worklist_thread+0x2f0() kthread_daemon_startup+0x90() DESCRIPTION: The ILOCK is released before doing an IPUNLOCK, which causes a race condition. This results in a panic when an inode that has been set free is accessed. RESOLUTION: The code is modified so that the ILOCK is used to protect the inode's memory from being set free while the memory is being accessed. * 3069181 (Tracking ID: 3010444) SYMPTOM: On a Network File System (NFS) mounted file system, the operations which read the file via the cksum(1m) command may fail with the following error message: cksum: read error on : Bad address The following error message would also be seen in the syslog: vmunix: WARNING: Synchronous Page I/O error DESCRIPTION: When the read-vnode operation (VOP_RDWR) is performed, certain requests are converted to direct I/O for optimization. However, the NFS buffers passed during the read requests are not user buffers. As a result, there is an error. RESOLUTION: The code is modified to convert the I/O requests to direct I/O only if the buffer passed during the I/O is a user buffer. * 3069189 (Tracking ID: 3049408) SYMPTOM: When the system is under file-cache pressure, the find(1) command takes a long time to complete. DESCRIPTION: The Veritas File System (VxFS) does not grow the metadata-buffer cache under system or file-cache memory pressure. When the vx_bcrecycle_timelag_factor drops to zero, the metadata buffers are reused immediately after they are accessed. As a result, a large-directory scan takes many physical I/Os to scan the directory.
The end result is that VxFS performs excessive re-reads of the same data into the metadata-buffer cache. However, the file-cache memory pressure is normal. There is no need to shrink the metadata-buffer cache just because there is file-cache memory pressure. RESOLUTION: The code is modified to unlink the metadata-buffer cache behaviour from the file-cache memory pressure. * 3069236 (Tracking ID: 2874172) SYMPTOM: A Network File System (NFS) file creation thread might loop continuously with the following stack trace: vx_getblk_cmn(inlined) vx_getblk+0x3a0 vx_exh_allocblk+0x3c0 vx_exh_hashinit+0xa50 vx_dexh_create+0x240 vx_dexh_init+0x8b0 vx_do_create+0x1e0 vx_create1+0x1d0 vx_create0+0x270 vx_create+0x40 rfs3_create+0x420 common_dispatch+0xb40 rfs_dispatch+0x40 svc_getreq+0x250 svc_run+0x310 svc_do_run+0xd0 nfssys+0x6a0 hpnfs_nfssys+0x60 coerce_scall_args+0x130 syscall+0x590 DESCRIPTION: The Veritas File System (VxFS) file creation vnode operation (VOP) routine expects the parent vnode to be a directory vnode pointer. But the NFS layer passes a stale file vnode pointer by default. This might cause unexpected results such as a hang during VOP handling. RESOLUTION: The code is modified to check the vnode type of the parent vnode pointer at the beginning of the create VOP call and return an error if it is not a directory vnode pointer. * 3069242 (Tracking ID: 2964018) SYMPTOM: On a high-end machine with about 125 CPUs, operations that use the lstat64(2) function may appear to hang, and the following stack trace is observed: spinlock+0xe0 rwspin_wrlock+0x30 specvp+0x510 vx_lookup+0x8a0 -> lookuppnvp(inlined) -> lookuppn(inlined) DESCRIPTION: The statvfsdev search calls the devnm() function to search the whole /dev/ directory for the reverse pathname. RESOLUTION: The code is modified such that a new fs_load() function is implemented to make use of the incoming file descriptor, if it is already a character device.
However, the devnm() function is still needed if the incoming file descriptor is a block device. * 3069265 (Tracking ID: 2830513) SYMPTOM: The Cluster File System (CFS) hangs while performing file removal operations and the following stack trace is displayed: vxglm::vxg_grant_sleep+0x110 () vxglm::vxg_cmn_lock+0x5a0 () vxglm::vxg_api_lock+0x310 () vx_glm_lock+0x70 () vx_mdelelock+0x70 () vx_mdele_hold+0xe0 () vx_extfree+0x700 () DESCRIPTION: The CFS hangs due to a missing unlock call for the file removal operations. RESOLUTION: The code is modified to unlock the file removal operations. * 3092230 (Tracking ID: 2439261) SYMPTOM: When vx_fiostats_tunable is changed from zero to non-zero, the system panics with the following stack trace: panic_save_regs_switchstack+0x110 () panic+0x410 () bad_kern_reference+0xa0 () $cold_pfault+0x5c0 () vm_hndlr+0x370 () bubbleup+0x880 () vx_fiostats_do_update+0x140 () vx_fiostats_update+0x170 () vx_read1+0x10e0 () vx_rdwr+0x790 () vno_rw+0xd0 () rwuio+0x32f () pread+0x121 () syscall+0x590 () in ?? () DESCRIPTION: When vx_fiostats_tunable is changed from zero to non-zero, all the incore-inode fiostats attributes are set to NULL. When these attributes are accessed, the system panics due to the NULL pointer dereference. RESOLUTION: The code has been modified such that when vx_fiostats_tunable is changed from zero to non-zero, it is verified if the fiostats attributes of inode are NULL or not. This will prevent the panic. Patch ID: PHKL_43260, PHCO_43261 * 1935624 (Tracking ID: 1903977) SYMPTOM: On a Cluster File System (CFS) the write operation may fail and the system panics with the following stack trace: vx_active_common_flush+0x74() vx_rwlock+0x14() DESCRIPTION: When a check is done to verify if the VX_DLSYNCFREE flag is set inside the function vx_getblk_clust(), there is no lock to protect the check. As a result, there could be a race condition leading to inconsistent data, this causes the system to panic. 
RESOLUTION: The code is modified to synchronize the VX_DLSYNCFREE flag updates using the VX_FSQ_LOCK. * 1935635 (Tracking ID: 1742707) SYMPTOM: Mounting a Cluster File System (CFS) fails with the following usage message: UX:vxfs fsck_logv: INFO: V-3-20896: Usage: fsck [-V] [-F vxfs] [-mnNyY] [-o full, nolog, mounted, p] special [...] DESCRIPTION: The "switchout fsck" command needs to be invoked for CFS with two separate arguments: "-o" and "mounted". However, the "switchout fsck" command gets invoked with the single argument "-o mounted". As a result, the error occurs. RESOLUTION: The code is modified so that the "switchout fsck" command gets invoked with two separate arguments, "-o" and "mounted", instead of the single argument "-o mounted". * 1954685 (Tracking ID: 1934537) SYMPTOM: The reverse-name-lookup operation on an inode may panic the machine with the following stack trace: vx_free vx_traverse_tree+0x4a0 vx_dir_lookup+0x1e4 vx_rev_namelookup+0x294 vx_aioctl_common+0xac4 vx_aioctl+0x12c vx_ioctl+0xe0 DESCRIPTION: During the lookup operation, if the memory allocation fails, the code still counts it as used memory and later tries to free that memory. This results in the panic. RESOLUTION: The code is modified to update the usage counters correctly, and skip updating the count during the error condition. An additional check is added to free only the non-null buffers. * 2798208 (Tracking ID: 2767579) SYMPTOM: The system hangs during a lookup operation with the following stack trace: vx_dnlc_pathname_realloc+0x80 () vx_dnlc_getpathname+0xcf0 () audit_get_pathname_from_dnlc+0x370 () audit_dnlc_path_name+0x70 ftruncate+0x480 () syscall+0x590 () DESCRIPTION: The system hangs because of an infinite loop that gets triggered when an inode with a negative DNLC entry is encountered during a reverse name DNLC lookup. RESOLUTION: The code is modified to add an avoidance fix that prevents the infinite loop from occurring.
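The difference behind incident 1935635 is how the option reaches the command's argument vector. The sketch below is only an illustration: the helper build_fsck_argv and the device path are hypothetical, and it builds argument lists without running any command.

```python
def build_fsck_argv(special, split_option):
    """Build an argument vector for an fsck invocation (illustration only).

    The device path passed as 'special' is a made-up example. When
    split_option is False, "-o mounted" is passed as one argument, which
    is the failing form described in the incident.
    """
    argv = ["fsck", "-F", "vxfs"]
    if split_option:
        argv += ["-o", "mounted"]   # two separate arguments
    else:
        argv += ["-o mounted"]      # one argument containing a space
    argv.append(special)
    return argv


print(build_fsck_argv("/dev/vx/rdsk/dg/vol", split_option=False))
print(build_fsck_argv("/dev/vx/rdsk/dg/vol", split_option=True))
```

An option parser that expects "-o" followed by a separate value sees "-o mounted" as a single unknown token, which is consistent with the usage message reported in the symptom.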
* 2822988 (Tracking ID: 2822984) SYMPTOM: When the extendfs(1M) command is used to extend a file system that is greater than 2TB, the command fails and the following error message is displayed: "UX:vxfs fsck: ERROR: V-3-25315: could not seek to block offset" DESCRIPTION: This is a typecasting problem. When the extendfs(1M) command tries to extend the file system, the bs_bseek() function is invoked. The return type of the bs_bseek() function is a 32-bit integer. This value becomes negative for offsets greater than 2TB and results in the failure. RESOLUTION: The code is modified to resolve the typecasting problem. * 2852520 (Tracking ID: 2850738) SYMPTOM: The system may hang with the following stack trace during a low memory condition: swtch_to_thread(inlined) slpq_swtch_core+0x520 real_sleep(inlined) sleep+0x400 mrg_reserve_swapmem(inlined) $cold_steal_swap+0x460 $cold_kalloc_nolgpg+0x4b0 kalloc_internal(inlined) $cold_kmem_arena_refill+0x650 kmem_arena_varalloc+0x280 vx_alloc(inlined)vx_worklist_enqueue+0x40 vx_buffer_kmcache_callback+0x160 kmem_gc_arena(inlined) foreach_arena_ingroup+0x840 kmem_garbage_collect_group(inlined) kmem_garbage_collect+0x390 kmem_arena_gc+0x240 kthread_daemon_startup+0x90 DESCRIPTION: The VxFS kernel memory callback() routine allocates memory with the M_WAITOK flag. This results in a system hang during a low memory condition because the callback() routine waits for the memory allocation. RESOLUTION: The code is modified to allocate memory without waiting in the VxFS kernel memory callback() routine.
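The typecasting problem in incident 2822984 can be reproduced in miniature. The sketch below assumes that the offset is counted in 1 KB blocks, which the README does not state; under that assumption a 2 TB offset is block number 2^31, the first value that no longer fits in a signed 32-bit integer. The function name is hypothetical.

```python
import ctypes


def block_offset_as_int32(byte_offset, block_size=1024):
    """Return the block offset as a signed 32-bit integer would hold it.

    Illustration only: the 1 KB block unit is an assumption.
    ctypes.c_int32 keeps the low 32 bits without overflow checking,
    which mimics a C cast to a 32-bit int.
    """
    block = byte_offset // block_size
    return ctypes.c_int32(block).value


TWO_TB = 2 ** 41
print(block_offset_as_int32(TWO_TB - 1024))  # last block below 2 TB, still positive
print(block_offset_as_int32(TWO_TB))         # wraps to a negative value
```

Holding the value in a 64-bit type avoids the wrap for any offset a file system of this size can produce.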
Patch ID: PHKL_43062 * 2036217 (Tracking ID: 2019793) SYMPTOM: While unmounting the file system, the system may panic and the following stack trace is displayed: vx_set_tunefs+0x264() vx_aioctl_full+0xc7c() vx_aioctl_common+0x738() vx_aioctl+0x13c() vx_ioctl+0xe4() syscall+0xcc() DESCRIPTION: Due to a race condition during the unmount operation while updating the fse_fs and fse_zombie pointers, there may be a small window where the two pointers are out of sync, which results in the panic. RESOLUTION: The code is modified to set the fsext->fse_zombie pointer before the fsext->fse_fs pointer to keep them consistent. * 2043627 (Tracking ID: 2028782) SYMPTOM: For a file managed by Hierarchical Storage Management (HSM), the application file quota gets doubled after an HSM migrate or recall process. DESCRIPTION: A quota is set for a file when it is created. For efficient storage, the data is moved from the high speed disk to the low speed tape using HSM. This process is called HSM migrate. When the data is recalled from the tape to the disk, the HSM quota for the file gets updated and doubled. RESOLUTION: The code is modified to handle quota updates for files managed by HSM correctly. * 2410793 (Tracking ID: 1466351) SYMPTOM: The mount operation hangs in vx_bc_binval_cookie with the following stack trace: delay vx_bc_binval_cookie vx_blkinval_cookie vx_freeze_flush_cookie vx_freeze_all vx_freeze vx_set_tunefs1 vx_set_tunefs vx_aioctl_full vx_aioctl_common vx_aioctl vx_ioctl genunix:ioctl unix:syscall_trap32 DESCRIPTION: The hanging process is waiting for a buffer to be unlocked. That buffer can only be released if its associated cloned map writes get flushed, but a necessary flush is missed. RESOLUTION: The code is modified to synchronize cloned map writes so that all the cloned maps are cleared and the buffers associated with them are released.
* 2429335 (Tracking ID: 2337470) SYMPTOM: The Cluster File System (CFS) can unexpectedly and prematurely report a 'file system out of inodes' error when attempting to create a new file. The following error message is displayed: vxfs: msgcnt 1 mesg 011: V-2-11: vx_noinode - /dev/vx/dsk/dg/vol file system out of inodes. DESCRIPTION: While allocating new index nodes (inodes) in a CFS, Veritas File System (VxFS) searches for an available free inode in the Inode Allocation Units (IAUs) that are delegated to the local node. If none are available, it searches the IAUs that are not delegated to any node, or revokes an IAU delegated to another node. Gaps may be created in the IAU structures as a side effect of the CFS delegation processing. While searching for an available free inode, VxFS ignores any gaps. However, new IAUs cannot be created once the maximum size of the metadata structures reaches 2^31, so at that point one of the gaps must be populated and used for the allocation of the new inode. If the gaps are ignored, VxFS may prematurely report the "file system out of inodes" error message even though there is enough free space in the VxFS file system to create new inodes. RESOLUTION: The code is modified to allocate new inodes from the gaps in the IAU structures created as a part of the CFS delegation processing. * 2730965 (Tracking ID: 2730759) SYMPTOM: The sequential read performance is poor because of read-ahead issues. DESCRIPTION: The read-ahead on sequential reads is performed incorrectly because wrong read-advisory and read-ahead pattern offsets are used to detect and perform the read-ahead. Also, more synchronous reads are performed, which can affect the performance. RESOLUTION: The code is modified and the read-ahead pattern offsets are updated correctly to detect and perform the read-ahead at the required offsets. The read-ahead detection is also modified to reduce the synchronous reads.
* 2796940 (Tracking ID: 2599590) SYMPTOM: Expansion of a 100% full file system may panic the machine with the following stack trace: bad_kern_reference() $cold_vfault() vm_hndlr() bubbledown() vx_logflush() vx_log_sync1() vx_log_sync() vx_worklist_thread() kthread_daemon_startup() DESCRIPTION: When a 100% full file system is expanded, the intent log of the file system is truncated and the blocks that are freed up are used during the expansion. Due to a bug, the block map of the replica intent log inode was not updated, which caused the block maps of the two inodes to differ. This caused some of the in-core structures of the intent log to become NULL. The machine panics while de-referencing one of these structures. RESOLUTION: The code is modified to update the block map of the replica intent log inode correctly. A 100% full file system can now be expanded only if the last extent in the intent log contains more than 32 blocks; otherwise fsadm fails. To expand such a file system, some of the files should be deleted manually and the resize retried. * 2798203 (Tracking ID: 2316793) SYMPTOM: After removing the files in a file system, the df(1M) command, which uses the statfs(2) function, may take 10 seconds to complete. DESCRIPTION: To obtain an up-to-date and valid free block count in a file system, a delay and retry loop delays for one second and retries 10 times. This excessive retrying causes a 10 second delay per file system while executing the df(1M) command. RESOLUTION: The code is modified to reduce the original 10 retries with one second delay each, to one retry after a 20 millisecond delay.
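The effect of the retry change in incident 2798203 can be seen with a small model of a delay-and-retry loop. This is a sketch only: read_with_retry, the probe callback, and the injected sleep function are hypothetical names, and the loop models only the worst case in which the free block count never becomes valid.

```python
def read_with_retry(probe, retries, delay, sleep):
    """Call probe() until it returns a value, sleeping between attempts.

    Illustration only. 'probe' returns None while no valid count is
    available; 'sleep' is injected so that the delay can be measured
    without actually waiting.
    """
    value = probe()
    attempts = 0
    while value is None and attempts < retries:
        sleep(delay)
        attempts += 1
        value = probe()
    return value


def worst_case_delay(retries, delay):
    """Total time slept when the probe never succeeds."""
    slept = []
    read_with_retry(lambda: None, retries, delay, slept.append)
    return sum(slept)


print(worst_case_delay(10, 1.0))    # original policy: 10 retries, 1 second each
print(worst_case_delay(1, 0.020))   # modified policy: 1 retry after 20 milliseconds
```

The worst case drops from 10 seconds to 20 milliseconds per file system, which matches the delay reported for df(1M).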
* 2806468 (Tracking ID: 2806466) SYMPTOM: A reclaim operation on a file system mounted on a Logical Volume Manager (LVM) volume using the fsadm(1M) command with the '-R' option may panic the system and the following stack trace is displayed: vx_dev_strategy+0xc0() vx_dummy_fsvm_strategy+0x30() vx_ts_reclaim+0x2c0() vx_aioctl_common+0xfd0() vx_aioctl+0x2d0() vx_ioctl+0x180() DESCRIPTION: Thin reclamation is supported only on the file systems mounted on a Veritas Volume Manager (VxVM) volume. RESOLUTION: The code is modified to error out gracefully if the underlying volume is LVM. * 2831287 (Tracking ID: 1703223) SYMPTOM: The internal local mount test exits due to a full fsck operation failure. DESCRIPTION: While testing, if a directory inode is set with the extended VX_IEREMOVE operation, the files in that directory must also be marked with the extended operation. If all the files inside that directory are not set with the VX_IEREMOVE operation, it results in a number of unreferenced files and the test fails. RESOLUTION: The code is modified to remove the file entries if the directory inode has the VX_IEREMOVE operation set. * 2832560 (Tracking ID: 2829708) SYMPTOM: On a machine with a locally mounted Veritas File System (VxFS), the system hangs due to low memory during an internal test. DESCRIPTION: The system hangs because the memory allocated to the structure used to enqueue or dequeue work items in batches is not freed. RESOLUTION: The code is modified to free the memory. * 2847808 (Tracking ID: 2845175) SYMPTOM: When the Access Control List (ACL) feature is enabled, the system may panic with the "Data Key Miss Fault in KERNEL mode" error message in the vx_do_getacl() function and the following stack trace is displayed: vx_do_getacl+0x840 () vx_getacl+0x70 () acl+0x480 () DESCRIPTION: In the vx_do_getacl() function, a local variable is accessed without being initialized, which leads to a panic.
RESOLUTION: The code is modified to initialize the local variable to NULL before using it. Patch ID: PHKL_42892 * 1946124 (Tracking ID: 1797955) SYMPTOM: In a setup that involves files with large extents (greater than 32 MB), which have encountered state map corruptions previously, the file system is disabled and the following message is displayed in the syslog: WARNING: msgcnt 3 mesg 037: V-2-37: vx_metaioerr - vx_tflush_1 - /dev/vx/dsk// file system meta data write error in dev/block ?/???? DESCRIPTION: The file system is disabled after an existing corruption is discovered. According to the VxFS I/O error policy, the file system must be disabled only when a read/write operation fails with an I/O error to prevent further corruption and not when an existing corruption is discovered. RESOLUTION: The code has been modified such that the file system is not disabled in case of a discovered corruption. * 1995390 (Tracking ID: 1985626) SYMPTOM: The system panics during multiple parallel unmount operations. The following stack trace is displayed: vx_freeze_idone_list+24c vx_workitem_process+10 vx_worklist_process+344 vx_worklist_thread+94 DESCRIPTION: During simultaneous unmount operations, a race condition occurs between the INODE_DEINIT() and FREEZE_IDONE_LIST() functions. While the INODE_DEINIT() function uses locks, the FREEZE_IDONE_LIST() function does not use locks while updating a pointer value. Hence, there is a possibility of a pointer having a NULL value, which gets de-referenced. This results in a panic. RESOLUTION: The code is modified to serialize the inode de-initialization operation. * 2036843 (Tracking ID: 2026799) SYMPTOM: A hang occurs while enforcing the policy on File Change Log (FCL) inode and the following stack trace is displayed: lwp_park() cond_wait_queue () cond_wait () pthread_cond_wait() ts_drain_masters() do_walk_ilist() do_process() do_enforce() main() _start() DESCRIPTION: While enforcing the policy on FCL, the file system is frozen. 
After enforcing the policy on FCL, a function is called which checks whether the file system is frozen. If it is frozen, the function returns the EACTIVERETRY error. This error gets percolated to further functions. The AIOCTL_COMMON() function performs the retry operation endlessly, resulting in a hang. RESOLUTION: The code is modified to return the EACTIVERETRY error only if the file system is not in the frozen state. * 2084004 (Tracking ID: 2040647) SYMPTOM: Quota on cluster mounted system does not enforce hard limit. DESCRIPTION: The information on the hard limit is stored in an unsigned integer. When a larger value is subtracted from it, it wraps around and stores an incorrect value. RESOLUTION: The code is modified to handle the wrap around and special code is added to identify the quota soft limit in the Cluster File System (CFS) environment. * 2092072 (Tracking ID: 2029556) SYMPTOM: When the file system is full, it is not possible to remove a file which has more than 13 hard links. This may lead to a panic with the following stack trace: panicsys() vpanic() panic() mutex_panic() vx_iunlock() vx_remove_tran() vx_do_remove() vx_remove1() vx_remove() vn_remove() unlink() syscall_trap32() DESCRIPTION: While removing or updating an inode, a new attribute inode is allocated which is modified. When the file system is full, no new attribute inodes can be allocated and the above operation may panic the system. RESOLUTION: The code is modified such that the file system can modify the attribute inode rather than allocating new attribute inode. 
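The wrap-around described for incident 2040647 is ordinary unsigned arithmetic. The sketch below models it with a 32-bit mask; the 32-bit width and both function names are assumptions for illustration, since the README only says the hard limit is stored in an unsigned integer.

```python
MASK32 = 0xFFFFFFFF  # assumed width; the README does not state it


def unsigned_sub(a, b):
    """Subtract as a 32-bit unsigned integer would, wrapping on underflow."""
    return (a - b) & MASK32


def remaining_quota(hard_limit, usage):
    """Guarded form: never subtract a larger value from a smaller one."""
    return hard_limit - usage if usage <= hard_limit else 0


# Subtracting 150 from a limit of 100 wraps to a huge positive value,
# so a check against the result never sees the limit as exceeded.
print(unsigned_sub(100, 150))
print(remaining_quota(100, 150))
```

Comparing before subtracting is one common way to handle the wrap-around; the README does not say which approach the actual fix uses.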
* 2194629 (Tracking ID: 2161379) SYMPTOM: In a Cluster File System (CFS) environment, various file system operations hang with the following stack trace: T1: vx_event_wait() vx_async_waitmsg() vx_msg_send() vx_iread_msg() vx_rwlock_getdata() vx_glm_cbfunc() vx_glmlist_thread() T2: vx_ilock() vx_assume_iowner() vx_hlock_getdata() vx_glm_cbfunc() vx_glmlist_thread() DESCRIPTION: Due to improper handling of the ENOTOWNER error in the ireadreceive() function, the operation is retried repeatedly while holding an inode lock. All the other threads are blocked, thus causing a deadlock. RESOLUTION: The code is modified to release the inode lock on the ENOTOWNER error and acquire it again, thus resolving the deadlock. * 2222508 (Tracking ID: 2192895) SYMPTOM: A system panic occurs when executing File Change Log (FCL) commands and the following stack trace is displayed: panicsys() panic_common() panic() vmem_xalloc() vmem_alloc() segkmem_xalloc() segkmem_alloc_vn() vmem_xalloc() vmem_alloc() kmem_alloc() vx_getacl() vx_getsecattr() fop_getsecattr() cacl() acl() syscall_trap32() DESCRIPTION: The Access Control List (ACL) count in the inode can be corrupted due to a race condition. For example, the setacl() function can change the ACL count when the getacl() function is processing the same inode. This results in an incorrect ACL count. RESOLUTION: The code is modified to add protection to the vulnerable ACL count to avoid corruption. * 2370061 (Tracking ID: 2370046) SYMPTOM: Read ahead operations fail to read early blocks of data when the value of the "read_nstream" tunable is not set to 1. DESCRIPTION: The read ahead operation reads the file on demand and does not read the portion of the file which is to be read in advance. This occurs because the parameter that determines the next read ahead offset is incorrectly reset. RESOLUTION: The code is modified so that the read ahead length is set correctly.
* 2720002 (Tracking ID: 1396859)

SYMPTOM:
The internal spin watcher tools show heavy contention on the buffer freelist lock, bc_freelist_lock, even when the I/O loads are predominantly direct I/Os.

DESCRIPTION:
Direct I/O data need not be cached in file system buffers. But Veritas File System (VxFS) maintains empty buffers to track these requests. Hence, contention is seen even when I/O is direct in nature.

RESOLUTION:
The code is modified to have a separate arena to track direct I/O buffers so that the contention on the buffer freelist lock is reduced.

* 2722869 (Tracking ID: 1244756)

SYMPTOM:
A lookup operation on a VxFS file system may fail with the following stack trace:
vx_cbdnlc_purge_iplist+0x64
vx_inode_free_list+0x160
vx_ifree_scan_list+0xf8
vx_workitem_process+0x10
vx_worklist_process+0x17c
vx_worklist_thread+0x94

DESCRIPTION:
Due to a race condition between the Directory Name Lookup Cache (DNLC) lookup and the DNLC get functions, there is an attempt to move a DNLC entry to the tail of the freelist in the lookup function when it has already been removed from the freelist by the DNLC get function. This leads to a null pointer dereference.

RESOLUTION:
The code is modified to verify that the DNLC entry is present on the freelist before it is moved to the tail by the DNLC get function.

* 2722958 (Tracking ID: 2696067)

SYMPTOM:
When a getaccess() command is issued on a file which inherits the default Access Control List (ACL) entries from the parent, it shows incorrect group object permissions.

DESCRIPTION:
If a newly created file leverages the ACL entries of its parent directory, the vx_daccess() function does not fabricate any GROUP_OBJ entry unlike the vx_do_getacl() function.

RESOLUTION:
The code is modified to fabricate a GROUP_OBJ entry.

* 2726001 (Tracking ID: 2371710)

SYMPTOM:
User quota file gets corrupted when DELICACHE feature is enabled and the current usage of inodes of a user becomes negative after frequent file creations and deletions.
If the quota information is checked using the vxquota command with the '-vu username' option, the number of files is "-1". For example:
# vxquota -vu testuser2
Disk quotas for testuser2 (uid 500):
Filesystem     usage   quota   limit  timeleft  files  quota  limit  timeleft
/vol01       1127809 8239104 8239104               -1      0      0

DESCRIPTION:
This issue is introduced by the inode DELICACHE feature which is a performance enhancement to optimize the updates done to the inode map during file creation and deletion operations. The feature is enabled by default, and can be changed by the vxtunefs(1M) command. When DELICACHE is enabled and the quota is set for Veritas File System (VxFS), there is an extra quota update for the inodes on the inactive list during the removal process. Since this quota has been updated already before being put on the DELICACHE list, the current number of user files gets decremented twice.

RESOLUTION:
The code is modified to add a flag to identify the inodes which have been moved to the inactive list from the DELICACHE list. This flag is used to prevent decrementing the quota again during the removal process.

* 2730957 (Tracking ID: 2651922)

SYMPTOM:
On a local VxFS file system, the ls(1M) command with the '-l' option runs slowly and high CPU usage is observed.

DESCRIPTION:
Currently, Cluster File System (CFS) inodes are not allowed to be reused as local inodes, to avoid a Global Lock Manager (GLM) deadlock issue when Veritas File System (VxFS) reconfiguration is in progress. Hence, if a VxFS local inode is needed and the free lists are almost filled up with CFS inodes, all the inode free lists need to be traversed to find a local inode.

RESOLUTION:
The code is modified to add a global variable, 'vxi_icache_cfsinodes', to count the CFS inodes in the inode cache. The condition is relaxed for converting a cluster inode to a local inode when the number of in-core CFS inodes is greater than the 'vx_clreuse_threshold' threshold and reconfiguration is not in progress.
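
The double decrement described for incident 2726001 (Tracking ID: 2371710) can be modelled in a few lines. The Python sketch below is a toy model, not VxFS code. The class names, the method names, and the 'from_delicache' flag name are all hypothetical, and the list handling is reduced to direct function calls. It only shows why a second quota update during removal produces a file count of -1, and how a flag set when the inode moves from the DELICACHE list to the inactive list prevents it.

```python
# Toy model of the quota accounting; names are hypothetical, not VxFS symbols.
class Inode:
    def __init__(self):
        # Set when the inode is moved from the DELICACHE list to the inactive list.
        self.from_delicache = False


class UserQuota:
    def __init__(self, use_flag):
        self.files = 0            # current number of files charged to the user
        self.use_flag = use_flag  # True models the fixed behaviour

    def create_file(self):
        self.files += 1
        return Inode()

    def delete_file(self, inode):
        # The quota is updated before the inode is put on the DELICACHE list.
        self.files -= 1
        # Later the inode moves from the DELICACHE list to the inactive list.
        inode.from_delicache = True
        self._remove_from_inactive_list(inode)

    def _remove_from_inactive_list(self, inode):
        # Fixed behaviour: skip the update for inodes that came from DELICACHE.
        if self.use_flag and inode.from_delicache:
            return
        # Old behaviour: the quota is decremented a second time.
        self.files -= 1


def files_after_create_and_delete(use_flag):
    quota = UserQuota(use_flag)
    quota.delete_file(quota.create_file())
    return quota.files


if __name__ == "__main__":
    print(files_after_create_and_delete(use_flag=False))  # -1, as in the vxquota output
    print(files_after_create_and_delete(use_flag=True))   # 0
```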
* 2733704 (Tracking ID: 2421901)

SYMPTOM:
Internal stress test on a locally mounted VxFS file system resulted in the following assert:
f:vx_getinoquota:1a

DESCRIPTION:
While reusing the inode from the inactive inode list, the inode field that contains quota information is expected to be NULL. While moving the inode from the delicache list to the inactive inode list, the quota information field is not set to NULL. This results in the assert.

RESOLUTION:
The code is modified to reset the quota information field of the inode while moving it from the delicache list to the inactive inode list.


INSTALLING THE PATCH
--------------------
To install the VxFS 5.0.1-11.31 patch:

a) To install this patch on a CVM cluster, install it one system at a time so that all the nodes are not brought down simultaneously.

b) The VxFS 11.31 pkg with revision 5.0.31.5 must be installed before applying the patch.

c) To verify the VERITAS file system level, execute:

     # swlist -l product | egrep -i 'VRTSvxfs'
     VRTSvxfs     5.0.31.5     VERITAS File System

   Note: VRTSfsman is a corequisite for VRTSvxfs. So, VRTSfsman also needs to be installed with VRTSvxfs.

     # swlist -l product | egrep -i 'VRTS'
     VRTSvxfs     5.0.31.5     Veritas File System
     VRTSfsman    5.0.31.5     Veritas File System Manuals

d) All prerequisite/corequisite patches have to be installed. The Kernel patch requires a system reboot for both installation and removal.

e) To install the patch, execute the following command:

     # swinstall -x autoreboot=true -s PHKL_43475 PHCO_43476

   If the patch is not registered, you can register it using the following command:

     # swreg -l depot

   The argument to this command is the absolute path where the patch resides.


REMOVING THE PATCH
------------------
To remove the VxFS 5.0.1-11.31 patch:

a) Execute the following command:

     # swremove -x autoreboot=true PHKL_43475 PHCO_43476


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE