                          * * * READ ME * * *
             * * * Veritas File System 5.0.1 RP3 * * *
                         * * * P-patch 13 * * *
                         Patch Date: 2015-07-09


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas File System 5.0.1 RP3 P-patch 13


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
HP-UX 11i v3 (11.31)


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSfsman
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas File System 5.0.1
   * Veritas Storage Foundation 5.0.1
   * Veritas Storage Foundation Cluster File System 5.0.1
   * Veritas Storage Foundation for Oracle 5.0.1
   * Veritas Storage Foundation for Oracle RAC 5.0.1
   * Veritas Storage Foundation HA 5.0.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: PHKL_44288, PHCO_44287, PHCO_44286
* 3706699 (3703176) The tunables to enable or disable VxFS inactive-thread
  throttling and VxFS inactive-thread process throttling were not available.
* 3706718 (3703176) The tunables to enable or disable VxFS inactive-thread
  throttling and VxFS inactive-thread process throttling were not available.
* 3716683 (3673599) The effective user permission is incorrectly displayed in
  the getacl(1M) command output.
* 3751300 (2439108) System crashes when the read_preferred_io tunable is set
  to a non-page aligned size.
* 3770412 (3469644) The system panics in the vx_logbuf_clean() function.
* 3781186 (3781085) The extendfs(1M) command cannot extend a file system
  beyond 4 TB.
* 3794221 (1482790) System panics when VxFS DMAPI is used.
* 3796746 (3784126) The mmap pages are invalidated during the file system
  freeze.

Patch ID: PHKL_44201
* 2958890 (2038736) In case of a clone, the Bmap (block map) operation that
  allocates or truncates the file change log (FCL) file may panic the system.
* 3243391 (2907816) The internal kernel conformance test hit an assert on the
  cluster file system.
* 3597566 (3597482) The pwrite(2) function fails with the EOPNOTSUPP error.
* 3606381 (894230) In case of clones and Cluster File System (CFS), the clone
  removal operation may panic the system.
* 3614866 (3604750) The kernel loops during the extent re-org.
* 3647852 (3650820) System panic occurs, due to a NULL pointer dereference,
  when a checkpoint is mounted.

Patch ID: PHKL_44166
* 3563002 (3560187) The kernel may panic when the buffer is freed in the
  vx_dexh_preadd_space() function with the message "Data Key Miss Fault in
  kernel mode".
* 3583272 (3520349) When there is a huge number of dirty pages in the memory,
  and a sparse write is performed at a large offset of 4 TB or above, on an
  existing file that is not null, the file system hangs.
* 3589463 (3066116) When the adb tunables vx_inactive_throttling and
  vx_inactive_process_throttling are set to 1, the system panics due to a
  NULL pointer dereference.

Patch ID: PHKL_43852
* 2800296 (2715028) The fsadm(1M) command with the '-d' option may hang when
  compacting a directory if it is run on the Cluster File System (CFS)
  secondary node while the find(1) command is running on any other node.
* 2857489 (2106036) On a VxFS file system the mount(1M) operation may hang.
* 3110574 (2332314) The internal noise.fullfsck test with Oracle Disk Manager
  (ODM) enabled hits the "fdd_odm_aiodone:3" assert.
* 3276103 (3226462) On a cluster mounted file system with unequal CPUs, a node
  may panic while doing a lookup operation.
* 3276111 (2552095) While the file system is re-organized by using the
  fsadm(1M) command, the system may panic.
* 3404249 (2972183) The fsppadm(1M) enforce command takes a long time on the
  secondary nodes compared to the primary nodes.
* 3404252 (3072036) Read operations from the secondary node in CFS can
  sometimes fail with the ENXIO error code.
* 3404256 (3153919) The fsadm(1M) command may hang when the structural file
  set re-organization is in progress.
* 3404257 (3252983) On a high-end system with 48 or more CPUs, some file
  system operations may hang.
* 3404258 (3259634) A Cluster File System (CFS) with blocks larger than 4GB
  may become corrupt.
* 3404506 (2745357) Performance enhancements are made for the read/write
  operation on Veritas File System (VxFS) structural files.
* 3433778 (3433777) A single CPU machine panics due to the safety-timer check
  when the inodes are re-tuned.

Patch ID: PHKL_43600
* 2043644 (1991446) mount(1M) fails with the error: ERROR: V-3-21272: mount
  option(s) incompatible with file system.
* 2196876 (2184114) The CFSMount and the CVMVolDg agents time out and take the
  resources offline.
* 2414966 (2387609) The quota usage is set to ZERO while performing a mount or
  umount of the file system though files owned by users exist.
* 2437800 (2403126) On the CFS, the primary node does not recover in time
  after a slave leaves the cluster.
* 2555203 (2555198) On VxFS, sendfile() does not create DMAPI events for HSM.
* 2564419 (2515459) On a VxFS file system with clones mounted, tuning some
  file system parameters may hang.
* 3099686 (2565400) The read performance is poor with DSMC (TSM) backup on CFS
  file systems with memory more than or equal to 80 GB.
* 3131949 (3099638) The vxfs_ifree_timelag(5) tunable, when tuned, displays an
  incorrect minimum value.
* 3131962 (2899907) On CFS, some file system operations like the vxcompress
  utility and de-duplication fail to respond.
* 3150372 (3150368) The vx_writesuper() function causes the system to panic in
  evfsevol_strategy().
* 3220528 (2486589) Multiple threads may hang on systems with heavy file
  system activities (for example create, delete, and lookup).
* 3254315 (3244613) The fsadm(1M) command hangs while there is an I/O load on
  a regular VxFS file system and its checkpoint.

Patch ID: PHKL_43475
* 2984718 (2970219) Panic in fcache_as_map+0x70 due to null v_vmdata.
* 3023946 (2616622) The performance of the mmap() function is slow when the
  file system block size is 8KB and the page size is 4KB.
* 3023953 (2750860) Performance issue due to CFS fragmentation in CFS cluster
* 3039824 (3031226) Panic in vx_dnlc_getpathname during SRP testing
* 3046923 (3046920) "Panic string : wait_for_lock: Already owns lock:
  vn_h_sl_pool"
* 3069179 (2966277) VX_IPUNLOCK panic at vx_inactive_remove() as inode already
  freed
* 3069181 (3010444) On an NFS file system, cksum(1) fails with cksum: read
  error on : Bad address
* 3069189 (3049408) vx_bcrecycle_timelag_factor causes perf problem.
* 3069236 (2874172) Infinite looping in vx_exh_hashinit()
* 3069265 (2830513) ls hangs in vxglm:vxg_grant_sleep.
* 3092230 (2439261) When vx_fiostats_tunable is changed from zero to non-zero,
  the system panics

Patch ID: PHKL_43260
* 1935624 (1903977) On a Cluster File System (CFS) the write operation may
  fail
* 1954685 (1934537) Panic in vx_free() due to NULL pointer
* 2798208 (2767579) The system hangs during a lookup operation
* 2852520 (2850738) Allocating memory with NOWAIT in callback routine during
  low memory condition.

Patch ID: PHKL_43062
* 2036217 (2019793) panic in vx_set_tunefs when starting local zone
* 2043627 (2028782) Q_SETQUOTA does not set current usage using quotactl API
* 2410793 (1466351) mount hang looping in vx_bc_binval_cookie()
* 2429335 (2337470) While shrinking the file system, the file system runs out
  of inodes; the file system version is 5.0MP4HF*
* 2730965 (2730759) poor sequential read performance
* 2796940 (2599590) Expanding or shrinking a DLV5 file system using the
  fsadm(1M) command causes a system panic.
* 2798203 (2316793) After removing files df command takes 10 seconds to
  complete
* 2806468 (2806466) "fsadm -R" resulting in panic at LVM layer due to
  vx_ts.ts_length set to 2GB.
* 2831287 (1703223) run_fsck : vxupgrade First full fsck failed, exiting.
* 2832560 (2829708) memory leak in worklist code path
* 2847808 (2845175) vx_do_getacl() panic.

Patch ID: PHKL_42892
* 1946124 (1797955) In some cases we miss calling vx_metaioerr() when
  fm_badwrite is set.
* 1995390 (1985626) Panic in VxFS due to NULL vx_inode i_fsext pointer
* 2036843 (2026799) fsppadm enforce hang on DST with FCL
* 2084004 (2040647) CFS does not enforce vxquota hard-limit
* 2092072 (2029556) Two panics in mutex_exit: not owner
* 2194629 (2161379) repeated hangs in vx_event_wait()
* 2222508 (2192895) VxFS 5.0MP3RP4 Panic while set/get acls - possible race
  condition
* 2370061 (2370046) readahead with read nstream misses early blocks
* 2720002 (1396859) global lock for DIO buffer header list has contention on
  very large systems
* 2722869 (1244756) VxFS 5.0MP1 panic in vx_dnlc_purge_ip() due to deref a
  NULL d_flhead pointer
* 2722958 (2696067) VxFS 11.31/5.0 : vx_daccess() does not observe GROUP_OBJ
  permissions
* 2726001 (2371710) user quota information corrupts
* 2730957 (2651922) Performance degradation of 'll' and high SYS% CPU in
  vx_ireuse()
* 2733704 (2421901) LM "stress.bl1" test hit an assert "f:vx_getinoquota:1a"


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: PHKL_44288, PHCO_44287, PHCO_44286

* 3706699 (Tracking ID: 3703176)

SYMPTOM:
The tunables to enable or disable VxFS inactive-thread throttling and VxFS
inactive-thread process throttling were not available.

DESCRIPTION:
The tunables to enable or disable VxFS inactive-thread throttling and VxFS
inactive-thread process throttling were not available through the kctune(1M)
interface.

RESOLUTION:
The code is modified so that the tunables to enable or disable VxFS
inactive-thread throttling and VxFS inactive-thread process throttling are
available through the kctune(1M) interface, with the relevant man page
information.

* 3706718 (Tracking ID: 3703176)

SYMPTOM:
The tunables to enable or disable VxFS inactive-thread throttling and VxFS
inactive-thread process throttling were not available.

DESCRIPTION:
The tunables to enable or disable VxFS inactive-thread throttling and VxFS
inactive-thread process throttling were not available through the kctune(1M)
interface.

RESOLUTION:
The code is modified so that the tunables to enable or disable VxFS
inactive-thread throttling and VxFS inactive-thread process throttling are
available through the kctune(1M) interface, with the relevant man page
information.

* 3716683 (Tracking ID: 3673599)

SYMPTOM:
For VxFS files that inherit an ACL entry from a parent directory with a
default ACL entry, the initial class permission is not set correctly to align
with the file mode creation mask and the umask setting at file creation time.

DESCRIPTION:
When a file is created, the ACL inheritance needs to take place before
applying the file mode creation mask and the umask setting, so that the latter
is honored.

RESOLUTION:
The code is modified to honor the file mode creation mask and the umask
setting when creating files with ACL inheritance.

* 3751300 (Tracking ID: 2439108)

SYMPTOM:
Due to page alignment issues in the VxFS code, the system panics when the
read_preferred_io tunable is set to a non-page aligned size.
The following stack trace is observed:
fcache_buf_create()
vx_fcache_buf_create()
vx_io_setup()
vx_io_ext()
vx_alloc_getpage()
vx_do_getpage()
vx_getpage1()
vx_getpage()
preg_vn_fault()
fcache_as_fault()
vx_fcache_as_fault()
vx_do_read_ahead()
vx_read_ahead()
vx_fcache_read()
vx_read1()
vx_rdwr()

DESCRIPTION:
VxFS ends up consuming an extra page when the preferred read I/O size is not a
multiple of the page size, and runs out of the allocated pages before the
getpage() function call can finish. This results in the panic.

RESOLUTION:
The code is modified to use the read_preferred_io tunable size only after
rounding it to the page size.

* 3770412 (Tracking ID: 3469644)

SYMPTOM:
The system panics in the vx_logbuf_clean() function when it traverses the
chain of transactions off the intent-log buffer. The stack trace is as
follows:
vx_logbuf_clean()
vx_logadd()
vx_log()
vx_trancommit()
vx_exh_hashinit()
vx_dexh_create()
vx_dexh_init()
vx_pd_rename()
vx_rename1_pd()
vx_do_rename()
vx_rename1()
vx_rename()
vx_rename_skey()

DESCRIPTION:
The system panics in the vx_logbuf_clean() function when it tries to access an
already freed transaction from the transaction chain in order to flush it to
the log.

RESOLUTION:
The code is modified to ensure that the transaction gets flushed to the log
before it is freed.

* 3781186 (Tracking ID: 3781085)

SYMPTOM:
When you work on a file system just short of 4 TB and attempt to extend the
file system beyond 4 TB, the command fails with the following error:
UX:vxfs extendfs: ERROR: V-3-23773: With existing block size 1K, cannot extend
fs beyond 4296539903 sectors.
The above error should not occur on disk layout versions (DLVs) higher than
DLV 5.

DESCRIPTION:
For DLV 5, VxFS has a check for whether the file system size is greater than
the limits set for DLV 5. However, the extendfs(1M) command applies the same
check to all the DLVs, instead of only to DLV 5. This results in an error
message.

RESOLUTION:
The code is modified to determine the maximum file system size according to
the corresponding DLV.

* 3794221 (Tracking ID: 1482790)

SYMPTOM:
The system may panic when the mknod() operation is performed on the file
system while the VxFS data management API (DMAPI) is used. The following stack
trace is observed:
vx_hsm_createandmkdir()
vx_create()
vns_create()
vn_create()
mknod()
mknod()
syscall()

DESCRIPTION:
The DMAPI feature is always enabled in VxFS for hierarchical storage
management. As part of the VxFS DMAPI code, snodes for the device are handled
as VxFS inodes for the mknod() operation. This results in inappropriate memory
access and a panic.

RESOLUTION:
The code is modified so that the inode type is checked, and further processing
is done only if it is a VxFS inode.

* 3796746 (Tracking ID: 3784126)

SYMPTOM:
Applications experience delays and show high vfault counters, because process
text pages in memory are invalidated and need to be reloaded. For a
Serviceguard cluster, the heartbeat communication is slowed down considerably
because cmclds TEXT pages need to be paged in again. This results in an SG
INIT failure.

DESCRIPTION:
When a freeze operation is handled, VxFS flushes and invalidates dirty pages
of a file system. Due to a bug in the code, even read-only mmap pages, which
are typically process TEXT pages, get invalidated unnecessarily. When a freeze
is performed on /usr or another file system that hosts program executables,
such page invalidation can cause delays to applications, because TEXT pages
need to be faulted in again.

RESOLUTION:
The code is modified to skip the invalidation of read-only mmap pages during
the file system freeze.

Patch ID: PHKL_44201

* 2958890 (Tracking ID: 2038736)

SYMPTOM:
In case of a clone, the Bmap operation that allocates or truncates the file
change log (FCL) file may panic the system.

DESCRIPTION:
The extent size can be truncated during the Bmap operation on a clone FCL
file.
This truncated size is used to allocate a buffer that does not match the real
size of the extent. Use of this truncated buffer may result in a panic.

RESOLUTION:
The code is modified to use the real extent size instead of the truncated
extent size during the buffer allocation of the FCL file.

* 3243391 (Tracking ID: 2907816)

SYMPTOM:
The internal kernel conformance test hit an assert on the cluster file system.

DESCRIPTION:
While reading pages into memory in the cluster file system, the unlock of the
cache-coherency lock was missing in one of the corner cases, resulting in an
assert for the internal test.

RESOLUTION:
The code is modified so that the missing unlock for the cache-coherency lock
is added for the corner case.

* 3597566 (Tracking ID: 3597482)

SYMPTOM:
The pwrite(2) function fails with the EOPNOTSUPP error when the write range is
in two indirect extents.

DESCRIPTION:
The write fails with the EOPNOTSUPP error when its range covers a ZFOD extent
that belongs to a DB2 pre-allocated file and a DATA extent that belongs to the
adjacent indirect (INDIR) extent, that is, when the range of the pwrite()
function falls across two indirect extents. VxFS tries to coalesce the
extents, which belong to different indirect-address extents, as a part of the
transaction. This kind of metadata change consumes a lot of transaction
resources. However, the VxFS transaction engine is unable to support the
current implementation and fails with an error message.

RESOLUTION:
The code is modified to retry the write transaction without combining the
extents.

* 3606381 (Tracking ID: 894230)

SYMPTOM:
In case of clones and Cluster File System (CFS), the clone removal operation
may panic the system.

DESCRIPTION:
During the clone removal process, the removal flag on the file-set is set
without holding the proper locks. When the operation is performed on clones,
this may result in a race condition on the file-set removal flag.
Subsequently, this may panic the system if the structure of the removed
file-set is accessed.

RESOLUTION:
The code is modified to use the appropriate locks to synchronize the setting
of the removal flag on the file-set.

* 3614866 (Tracking ID: 3604750)

SYMPTOM:
The kernel loops during the extent re-org with the following stack trace:
vx_bmap_enter()
vx_reorg_enter_zfod()
vx_reorg_emap()
vx_extmap_reorg()
vx_reorg()
vx_aioctl_full()
$cold_vx_aioctl_common()
vx_aioctl()
vx_ioctl()
vno_ioctl()
ioctl()
syscall()

DESCRIPTION:
The extent re-org minimizes the file system fragmentation. When the re-org
request is issued for an inode with a lot of ZFOD extents, it reallocates the
extents of the original inode to the re-org inode. During this, the ZFOD
extents are preserved and entered into the re-org inode in a transaction. If
the extent allocated is big, the transaction that enters the ZFOD extents
becomes big and returns an error. Even when the transaction is retried, the
same issue occurs. As a result, the kernel loops during the extent re-org.

RESOLUTION:
The code is modified to enter the Bmap (block map) of the allocated extent and
then perform the ZFOD processing. If a committable error is returned during
the ZFOD enter, the transaction is committed and the ZFOD enter continues.

* 3647852 (Tracking ID: 3650820)

SYMPTOM:
If a file system has empty named stream directories, then when a checkpoint is
mounted, the system panics with the following stack trace:
panicsys()
vpanicunix:panic()
die()
trap()
ktl0()
vx_getblk_cmn()
vx_getblk()
vx_iupdat_local()
vx_iupdatvxfs:vx_ireclaim_flush()
vx_ireuse_cleanvxfs:vx_ilist_chunkclea()
vx_inode_free_list()
vx_ifree_scan_list()
vx_workitem_processvxfs:vx_worklist_process()
thread_start()

DESCRIPTION:
The problem occurs when several empty named-data stream directories are
created. These empty named attribute directories are deleted when "ls -@" is
called on the parent directory. However, if there are checkpoints on the file
system, no provisions are made to push these changes down to them.
As a result, the panic occurs when the checkpoint is mounted.

RESOLUTION:
The code is modified to prevent the empty directories from being deleted when
"ls -@" is called on the parent directory.

Patch ID: PHKL_44166

* 3563002 (Tracking ID: 3560187)

SYMPTOM:
The kernel may panic when the buffer is freed in the vx_dexh_preadd_space()
function with the message "Data Key Miss Fault in kernel mode". The following
stack trace is observed:
kmem_arena_free()
vx_free()
vx_dexh_preadd_space()
vx_dopreamble()
vx_dircreate_tran()
vx_do_create()
vx_create1()
vx_create0()
vx_create()
vn_open()

DESCRIPTION:
The buffers in the extended-hash structure are allocated, zeroed, and freed
outside the transaction retry loop. For some error scenarios, the transaction
is re-executed from the beginning. Since the buffers are zeroed outside of the
transaction retry loop, during the transaction retry the extended-hash
structure may have some stale buffers from the last try. As a result, some
stale parts of the structure are freed incorrectly. This results in a panic.

RESOLUTION:
The code is modified to zero out the extended-hash structure within the retry
loop, so that the stale values are not used during retry.

* 3583272 (Tracking ID: 3520349)

SYMPTOM:
When there is a huge number of dirty pages in the memory, and a sparse write
is performed at a large offset of 4 TB or above, on an existing file that is
not null, the file system hangs in the thread. The following stack trace is
observed:
fcache_buf_iowait()
vx_fcache_buf_iowait()
vx_io_wait()
vx_alloc_getpage()
vx_do_getpage()
vx_getpage1()
vx_getpage()
preg_vn_fault()
fcache_as_uiomove_rd()
fcache_as_uiomove()
vx_fcache_as_uiomove()
vx_fcache_read()
vx_read1()
vx_rdwr()
vn_rdwr()

DESCRIPTION:
When a sparse write is performed at an offset of 4 TB or above, on a file that
has ext4 extent orgtype with some blocks that are already allocated, this can
result in a file system hang.
This is caused by a type-casting bug in the offset calculation in the VxFS
extent allocation code path. A sparse write should create a HOLE between the
last allocated offset and the current offset on which the write is requested.
Due to the type-casting bug, in certain scenarios VxFS may allocate the space
between the last offset and the new offset instead of creating a HOLE. This
generates a huge number of dirty pages, and fills up the file system space
incorrectly. The memory pressure due to the huge number of dirty pages causes
the hang. The sparse write offset at which the problem occurs depends on the
file system block size. For a file system with a block size of 1 KB, the
problem can occur at a sparse write offset of 4 TB.

RESOLUTION:
The code is modified so that the VxFS extent allocation code calculates the
offset correctly, and does not allocate space for a sparse write. This
resolves the type-casting bug.

* 3589463 (Tracking ID: 3066116)

SYMPTOM:
When the adb tunables vx_inactive_throttling and
vx_inactive_process_throttling are set to 1, the system panics due to a NULL
pointer dereference with the following stack trace:
...
bubbleup
vx_worklist_process
vx_worklist_thread
...

IMPORTANT: These adb tunables are for internal use and should remain at 0.
Only under very special circumstances will HP/Symantec advise a customer to
set these tunables.

DESCRIPTION:
To prevent too many inactive threads from running, the two adb tunables
"vx_inactive_throttling" and "vx_inactive_process_throttling" were introduced
to fix the issue of vxfsd taking a lot of CPU time after deleting some large
directories. A bug in the code increments a local counter from 0 to 1. This in
turn affects inactive work item dispatch. As a result, empty work items are
added to the local batch of work items. The system panics while processing
such an empty work item.

RESOLUTION:
The code is modified so that the counter is not incremented.

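The relationship between block size and the 4 TB offset described for incident
3583272 above can be illustrated with a little arithmetic. The sketch below is
not VxFS code, and it rests on an assumption that this README does not state:
that the type cast narrows a block number to 32 bits. The helper names
block_number(), truncated_block_number(), and first_affected_offset() are
hypothetical and exist only for this illustration.

```python
# Illustrative arithmetic only; this is NOT VxFS source code.
# ASSUMPTION (not stated in the README): the type-casting bug narrows a
# block number to an unsigned 32-bit value. All names are hypothetical.

KB = 1024
TB = 1024 ** 4


def block_number(offset_bytes: int, block_size: int) -> int:
    """Block index of a byte offset, computed at full width."""
    return offset_bytes // block_size


def truncated_block_number(offset_bytes: int, block_size: int) -> int:
    """The same index after a hypothetical narrowing cast to 32 bits."""
    return block_number(offset_bytes, block_size) & 0xFFFFFFFF


def first_affected_offset(block_size: int) -> int:
    """Smallest byte offset whose block index no longer fits in 32 bits."""
    return (1 << 32) * block_size


if __name__ == "__main__":
    # With 1 KB blocks, byte offset 4 TB is block 2**32.
    print(block_number(4 * TB, 1 * KB))
    # Under the assumed 32-bit cast, that block index wraps around.
    print(truncated_block_number(4 * TB, 1 * KB))
    # The threshold scales with the block size.
    print(first_affected_offset(1 * KB) // TB, "TB")
```

Under this assumption, a 1 KB block size puts the threshold at exactly 4 TB,
which matches the offset quoted in the description, and a larger block size
moves the threshold proportionally higher.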
Patch ID: PHKL_43852

* 2800296 (Tracking ID: 2715028)

SYMPTOM:
The fsadm(1M) command with the '-d' option may hang when compacting a
directory if it is run on the Cluster File System (CFS) secondary node while
the find(1) command is running on any other node.

DESCRIPTION:
During the compacting of a directory, the CFS secondary node has ownership of
the inode of the directory. To complete the compacting of the directory, the
truncation message needs to be processed on the CFS primary node. For this to
occur, the CFS primary node needs to have ownership of the inode of the
directory. This causes a deadlock.

RESOLUTION:
The code is modified to force the processing of the truncation message on the
CFS secondary node that initiated the compacting of the directory.

* 2857489 (Tracking ID: 2106036)

SYMPTOM:
On a VxFS file system the mount(1M) operation may hang with the following
stack trace:
vx_bc_binval_cookie()
vx_blkinval_cookie()
vx_freeze_flush_cookie()
vx_freeze_all()
vx_freeze()
vx_set_tunefs1()
vx_set_tunefs()
vx_aioctl_full()
vx_aioctl_common()
vx_aioctl()
vx_ioctl()

DESCRIPTION:
The hung process for a local mount waits for a buffer to be unlocked. However,
the buffer is released only if the associated cloned map write gets flushed,
and the necessary flush does not occur when the file system is in the frozen
state.

RESOLUTION:
The code is modified to flush the cloned maps, and not to create new ones,
when the file system is being frozen or is already frozen.

* 3110574 (Tracking ID: 2332314)

SYMPTOM:
The internal noise.fullfsck test with Oracle Disk Manager (ODM) enabled hits
the "fdd_odm_aiodone:3" assert.

DESCRIPTION:
In case of failed I/Os in the fdd_write_clone_end() function, the error is not
set on the buffer. This causes the assert to be triggered.

RESOLUTION:
The code is modified to set the error on the buffer in case of I/O failures in
the fdd_write_clone_end() function.

* 3276103 (Tracking ID: 3226462)

SYMPTOM:
On a cluster mounted file system with unequal CPUs, while doing a lookup
operation, a node may panic with the stack trace:
vx_dnlc_recent_cookie
vx_dnlc_getpathname
audit_get_pathname_from_dnlc
audit_clean_path
$cold_audit_build_full_dir_name
inline change_p_cdir

DESCRIPTION:
The panic is caused by an out-of-bounds access in the counters[] array, whose
size is defined by the vx_max_cpu variable. The value of vx_max_cpu can differ
between the CFS nodes if the nodes have different numbers of processors.
However, the code assumes this value is the same across the cluster. When
propagating inode cookies across the cluster, the counter[] array is allocated
based on the vx_max_cpu of the current CFS node. If the cookie is populated
via vx_cbdnlc_populate_cookie() with a CPU ID from another CFS node that
exceeds the local vx_max_cpu, the function vx_dnlc_recent_cookie() accesses
locations beyond the allocated counter[] array.

RESOLUTION:
The code is modified to detect the out-of-bounds access in
vx_dnlc_recent_cookie() and return the ENOENT error.

* 3276111 (Tracking ID: 2552095)

SYMPTOM:
While the file system is re-organized by using the fsadm(1M) command, the
system may panic with the following stack trace:
vx_iget()
vx_aioctl_full()
vx_aioctl_common()
vx_aioctl()
vx_ioctl()
vx_ioctl_skey()

DESCRIPTION:
Due to a race condition between the vx_inactive() and vx_iget() functions, an
inode that is on the free list is handed out erroneously. This results in a
panic.

RESOLUTION:
The code is modified to take the necessary measures when the inode pointer is
updated, so that the race condition is avoided.

* 3404249 (Tracking ID: 2972183)

SYMPTOM:
The "fsppadm enforce" command takes longer to force update the secondary nodes
than it takes to force update the primary nodes.

DESCRIPTION:
The ilist is force updated on the secondary node. As a result, the performance
on the secondary node becomes low.

RESOLUTION:
The code is modified to force update the ilist file on secondary nodes only on
an error condition.

* 3404252 (Tracking ID: 3072036)

SYMPTOM:
Reads from the secondary node in CFS can sometimes fail with ENXIO (No such
device or address).

DESCRIPTION:
The in-core attribute ilist on the secondary node is out of sync with that of
the primary.

RESOLUTION:
The code is modified so that the in-core attribute ilist on the secondary node
is force updated with data from the primary node.

* 3404256 (Tracking ID: 3153919)

SYMPTOM:
The fsadm(1M) command may hang when the structural file set re-organization is
in progress. The following stack trace is observed:
vx_event_wait
vx_icache_process
vx_switch_ilocks_list
vx_cfs_icache_process
vx_switch_ilocks
vx_fs_reinit
vx_reorg_dostruct
vx_extmap_reorg
vx_struct_reorg
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
vx_compat_ioctl
compat_sys_ioctl

DESCRIPTION:
During the structural file set re-organization, due to a race condition, the
VX_CFS_IOWN_TRANSIT flag is set on the inode. At the final stage of the
structural file set re-organization, all the inodes are re-initialized.
Because the VX_CFS_IOWN_TRANSIT flag is set improperly, the re-initialization
fails to proceed. This causes the hang.

RESOLUTION:
The code is modified so that the VX_CFS_IOWN_TRANSIT flag is cleared.

* 3404257 (Tracking ID: 3252983)

SYMPTOM:
On a high-end system with 48 or more CPUs, some file system operations may
hang with the following stack trace:
vx_ilock()
vx_tflush_inode()
vx_fsq_flush()
vx_tranflush()
vx_traninit()
vx_tran_iupdat()
vx_idelxwri_done()
vx_idelxwri_flush()
vx_delxwri_flush()
vx_workitem_process()
vx_worklist_process()
vx_worklist_thread()

DESCRIPTION:
The function that gets an inode returns an incorrect error value if no free
inodes are available in-core. This error value causes an inode to be allocated
on disk instead of in-core. As a result, the same function is called again,
resulting in a continuous loop.

RESOLUTION:
The code is modified to return the correct error code.

* 3404258 (Tracking ID: 3259634)

SYMPTOM:
In CFS, each node that has the cluster file system mounted has its own intent
log in the file system. A CFS with more than 4,294,967,296 file system blocks
can zero out an incorrect location as a result of an incorrect typecast. For
example, such a CFS can incorrectly zero out 65536 file system blocks at the
block offset of 1,537,474,560 (file system blocks) with an 8 KB file system
block size and an intent log with the size of 65536 file system blocks. This
issue can only occur if an intent log is located above an offset of
4,294,967,296 file system blocks. This situation can occur when you add a new
node to the cluster and mount an additional CFS secondary for the first time,
which needs to create and zero a new intent log. This situation can also
happen if you resize a file system or intent log and clear an intent log.

The problem occurs only with the following combinations of file system size
and file system block size:
1 KB block size and FS size > 4 TB
2 KB block size and FS size > 8 TB
4 KB block size and FS size > 16 TB
8 KB block size and FS size > 32 TB

The full fsck flag is set on the file system, and the message log can contain
messages of the following type:
2013 Apr 17 14:52:22 sfsys kernel: vxfs: msgcnt 5 mesg 096: V-2-96:
  vx_setfsflags - /dev/vx/dsk/sfsdg/vol1 file system fullfsck flag set -
  vx_ierror
2013 Apr 17 14:52:22 sfsys kernel: vxfs: msgcnt 6 mesg 017: V-2-17:
  vx_attr_iget - /dev/vx/dsk/sfsdg/vol1 file system inode 13675215 marked bad
  incore
2013 Jul 17 07:41:22 sfsys kernel: vxfs: msgcnt 47 mesg 096: V-2-96:
  vx_setfsflags - /dev/vx/dsk/sfsdg/vol1 file system fullfsck flag set -
  vx_ierror
2013 Jul 17 07:41:22 sfsys kernel: vxfs: msgcnt 48 mesg 017: V-2-17:
  vx_dirbread - /dev/vx/dsk/sfsdg/vol1 file system inode 55010476 marked bad
  incore

DESCRIPTION:
In CFS, each node that has the cluster file system mounted has its
own intent log in the file system. When an additional node mounts the file
system as a CFS secondary, the CFS creates an intent log. Note that intent
logs are never removed; they are reused. When you clear an intent log, Veritas
File System (VxFS) passes an incorrect block number to the log clearing
routine, which zeros out an incorrect location. The incorrect location might
point to file data or file system metadata, or it might be part of the file
system's available free space. This is silent corruption. If the file system
metadata is corrupted, VxFS can detect the corruption when it subsequently
accesses the corrupt metadata, and it marks the file system for full fsck.

RESOLUTION:
The code is modified so that VxFS passes the correct block number to the log
clearing routine.

* 3404506 (Tracking ID: 2745357)

SYMPTOM:
Performance enhancements are made for the read/write operation on Veritas File
System (VxFS) structural files.

DESCRIPTION:
The read/write performance of VxFS structural files is affected when the piggy
back data in the vx_populate_bpdata() function is ignored. This occurs if the
buffer type is not specified properly, which then requires another disk I/O to
get the same data.

RESOLUTION:
The code is modified so that the piggy back data is not ignored if it is of
type VX_IOT_ATTR in the vx_populate_bpdata() function, which improves the
performance of reads and writes to the VxFS structural files.

* 3433778 (Tracking ID: 3433777)

SYMPTOM:
A single CPU machine panics due to the safety-timer check when the inodes are
re-tuned. The following stack trace is observed:
spinunlock()
vx_ilist_chunkclean()
vx_inode_free_list()
vx_retune_ninode()
vx_do_inode_kmcache_callback()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
When the inode cache list is traversed, the vxfsd daemon schedules a
"vx_do_inode_kmcache_callback" which does not free the CPU between the
iterations.
As a result, the other threads cannot get access to the CPU, which results in a panic.

RESOLUTION:
The code is modified to use the sched_yield() function for every iteration in "vx_inode_free_list" to free the CPU, so that the other threads get a chance to be scheduled.

Patch ID: PHKL_43600

* 2043644 (Tracking ID: 1991446)

SYMPTOM:
While attempting to mount a cluster file system, the following error is displayed:
Mount Error : UX:vxfs mount: ERROR: V-3-21272: mount option(s) incompatible with file system

DESCRIPTION:
In a cluster file system (CFS), when a secondary node attempts to mount a cluster file system, the primary node, which has already mounted the file system, checks the secondary node's mount options against its own. If they are not the same, it rejects the secondary node's mount attempt. By default, the file system is mounted with the largefiles option. When the option is changed to nolargefiles by using the file system administration command, the mount option flag stored in the volume reservation table is not kept in sync with the primary.

RESOLUTION:
The mount option flag is updated properly when the file system is converted to largefiles or nolargefiles through fsadm(1M).

* 2196876 (Tracking ID: 2184114)

SYMPTOM:
The CFSMount and the CVMVolDg agents time out and take the resources offline.

DESCRIPTION:
In case of a large file system, the stat operation on a cluster mount takes a long time to respond. The fsckptadm(1M) utility calls the vx_local_statfset() function, which reports the statistics of a fileset. It freezes the file system while it reads and recalculates the requested information. This can take a very long time for a large file system and cause its callers to time out.

RESOLUTION:
The code is modified so that, instead of the vx_fsetstat() function, it calls another function that returns similar information without freezing the file system.

* 2414966 (Tracking ID: 2387609)

SYMPTOM:
The quota usage is set to ZERO while performing a mount or umount of the file system even though files owned by users exist.
This issue may occur after some file creations and deletions. When the quota usage is checked by using the "vxrepquota" command, the output is as follows:
# vxrepquota -uv /vx/sofs1/
/dev/vx/dsk/sfsdg/sofs1 (/vx/sofs1):
                     Block limits                     File limits
User           used     soft     hard  timeleft   used  soft  hard  timeleft
testuser1 --      0  3670016  4194304                0     0     0
testuser2 --      0  3670016  4194304                0     0     0
testuser3 --      0  3670016  4194304                0     0     0
Additionally, the quota usage may not be updated after the inode or block usage reaches ZERO.

DESCRIPTION:
The issue occurs when Veritas File System (VxFS) merges the external per-node quota files with the internal quota file. The block offset within the external quota file can be calculated wrongly in some cases. When a hole is found in the per-node quota file, the file offset is advanced so that it points to the next non-hole offset, but the block offset, which points to the next available quota record in a block, is not changed accordingly.

VxFS updates the per-node quota records only when the global internal quota file shows some byte or inode usage. Otherwise, it does not copy the usage from the global quota file to the per-node quota file. But when the quota usage in the external quota files becomes zero and both the byte and inode usage in the global file become zero, the per-node quota records are not updated and are left with incorrect usage. VxFS must also check the byte or inode usage in the per-node quota record. It must skip copying records only when the byte and inode usage in both the global quota file and the per-node quota file is zero.

RESOLUTION:
The code is modified to correctly calculate the block offset when any hole is found in the per-node quota file. Also, the block or inode usage in the per-node quota record is checked while updating the user quota usage.

* 2437800 (Tracking ID: 2403126)

SYMPTOM:
On a cluster-mounted file system, certain file system activities may appear to make no progress when one of the nodes in the cluster leaves or reboots.
A stack trace similar to the following is displayed:
e_sleep_thread()
vx_event_wait()
vx_async_waitmsg()
vx_msg_send()
vx_send_rbdele_resp()
vx_recv_rbdele+00029C ()
vx_recvdele+000100 ()
vx_msg_recvreq+000158 ()
vx_msg_process_thread+0001AC ()
vx_thread_base+00002C ()
threadentry+000014 (??, ??, ??, ??)

DESCRIPTION:
When a node in the cluster leaves, a reconfiguration happens and all the resources that are held by the leaving node are consolidated. This is done on one node of the cluster called the primary node. Each node sends a message to the primary node about the resources it is currently holding. During the reconfiguration, Veritas File System (VxFS) may sometimes incorrectly calculate a message length that is larger than what the Veritas Group Membership and Atomic Broadcast (GAB) layer can handle. As a result, the message is lost. The sender assumes that the message is sent and waits for the acknowledgement. Because the message is dropped at the sender, the primary node waits for this message forever. As a result, the reconfiguration at the primary node never completes, causing the cluster to hang.

RESOLUTION:
The code is modified so that the message length is calculated properly and GAB can handle the messages.

* 2555203 (Tracking ID: 2555198)

SYMPTOM:
On HP-UX 11.31, a binary-mode File Transfer Protocol (FTP) transfer uses the sendfile() interface, which does not create the DMAPI events for Hierarchical Storage Management (HSM).

DESCRIPTION:
The sendfile() interface does not call the Veritas File System (VxFS) read() function that creates the DMAPI events. It uses the HP Unified File Cache (UFC) interface. The UFC interface is not aware of the HSM application. As a result, the DMAPI events are not generated.

RESOLUTION:
The code is modified to set a flag in the vfs structure at mount time to indicate whether the file system is configured under HSM. The UFC interface uses this flag to generate the DMAPI events.
* 2564419 (Tracking ID: 2515459)

SYMPTOM:
On a VxFS file system with clones mounted, tuning some file system parameters may hang with the following stack trace:
vx_bc_binval_cookie
vx_blkinval_cookie
vx_freeze_flush_cookie
vx_freeze_all
vx_freeze
vx_set_tunefs1
vx_set_tunefs
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl

DESCRIPTION:
On a local mount, the hanging process waits for a buffer to be unlocked. That buffer can be released only if its associated cloned map writes are flushed, but a necessary flush is missed.

RESOLUTION:
The code is modified to synchronize cloned map writes so that all the cloned maps are cleared and the buffers associated with them are released.

* 3099686 (Tracking ID: 2565400)

SYMPTOM:
On systems with physical memory equal to or more than 80 GB, the performance of sequential buffered I/O reads can degrade by up to 100 times.

DESCRIPTION:
The read-ahead of sequential buffered I/O reads does not occur because the read-ahead size of the file system is calculated incorrectly.

RESOLUTION:
The code is modified to fix the incorrect typecast of the read-ahead size of the file system.

* 3131949 (Tracking ID: 3099638)

SYMPTOM:
When the vxfs_ifree_timelag(5) tunable is tuned, the following error message is displayed:
# kctune vxfs_ifree_timelag=400
ERROR: mesg 095: V-2-95: Setting vxfs_ifree_timelag to 450 since the specified value for vxfs_ifree_timelag is less than the recommended minimum value of 1035

DESCRIPTION:
In the vxfs_ifree_timelag(5) tunable man page, the minimum value is set to "None". The error message is displayed when the vxfs_ifree_timelag(5) tunable is set to a value that is less than 450. In the error message, a garbage value is displayed as the recommended minimum value. The error occurs because a single argument is passed for an error message that has two format specifiers.

RESOLUTION:
The code is modified to set the correct minimum value of the vxfs_ifree_timelag(5) tunable, and to display the correct error message.
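
As an illustration of the two-specifier problem described for incident 3131949, the following minimal Python sketch clamps a requested value and formats the warning with one argument per specifier. The function name, the constant, and the return shape are hypothetical and are not part of VxFS or kctune; the minimum of 450 is taken from the text above. In C, omitting an argument for a format specifier is undefined behavior and typically prints an arbitrary value, whereas Python raises a TypeError.

```python
from typing import Optional, Tuple

# Hypothetical constant: 450 is the minimum implied by the incident text.
MIN_IFREE_TIMELAG = 450

# The message has two %d specifiers, so it needs exactly two arguments.
WARNING_TEMPLATE = (
    "Setting vxfs_ifree_timelag to %d since the specified value for "
    "vxfs_ifree_timelag is less than the recommended minimum value of %d"
)


def set_ifree_timelag(requested: int) -> Tuple[int, Optional[str]]:
    """Return the effective value and a warning (or None if no clamping)."""
    if requested >= MIN_IFREE_TIMELAG:
        return requested, None
    # First argument: the value actually applied.
    # Second argument: the recommended minimum. Passing only one argument
    # here is the class of bug the incident describes.
    message = WARNING_TEMPLATE % (MIN_IFREE_TIMELAG, MIN_IFREE_TIMELAG)
    return MIN_IFREE_TIMELAG, message


effective, warning = set_ifree_timelag(400)
print(effective)
print(warning)
```

With both arguments supplied, the applied value and the reported minimum agree.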
* 3131962 (Tracking ID: 2899907)

SYMPTOM:
Some file system operations on a Cluster File System (CFS) may hang with the following stack trace:
vxg_svar_sleep_unlock
vxg_grant_sleep
vxg_cmn_lock
vxg_api_lock
vx_glm_lock
vx_mdele_hold
vx_extfree1
vx_exttrunc
vx_trunc_ext4
vx_trunc_tran2
vx_trunc_tran
vx_cfs_trunc
vx_trunc
vx_inactive_remove
vx_inactive_tran
vx_cinactive_list
vx_workitem_process
vx_worklist_process
vx_worklist_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
In CFS, a node can lock an mdelelock for an extent map while it holds an mdelelock for a different extent map. This can result in a deadlock between different nodes in the cluster.

RESOLUTION:
The code is modified to prevent the deadlock between different nodes in the cluster.

* 3150372 (Tracking ID: 3150368)

SYMPTOM:
A periodic sync operation on an Encrypted Volume and File System (EVFS) configuration may cause the system to panic with the following stack trace:
evfsevol_strategy()
io_invoke_devsw()
vx_writesuper()
vx_fsetupdate()
vx_sync1()
vx_sync0()
$cold_vx_do_fsext()
vx_workitem_process()
vx_worklist_process()
vx_walk_fslist_threaded()
vx_walk_fslist()
vx_sync_thread()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
In the EVFS environment, EVFS may get a stale or garbage value of b_filevp, which Veritas File System (VxFS) does not initialize. This causes the system to panic.

RESOLUTION:
The code is modified to initialize the b_filevp.

* 3220528 (Tracking ID: 2486589)

SYMPTOM:
On systems with heavy file system activity (for example, create, delete, and lookup operations), multiple threads may appear to hang with the following stack trace:
vx_ireuse_steal()
vx_ireuse()
vx_iget()

DESCRIPTION:
Under heavy file system activity, Veritas File System (VxFS) may run out of inodes because it has already reached the maximum inode limit, which is set based on the system's memory configuration. Currently, VxFS retries the inode allocation endlessly, which causes the system to hang.
RESOLUTION:
The code is modified to return ENOINODE instead of retrying to get the inodes continuously.

* 3254315 (Tracking ID: 3244613)

SYMPTOM:
A file system extent operation by using the fsadm(1M) command may hang with the following stack trace:
vx_event_wait(inlined)
vx_delay2+0x2a0
cold_vx_active_common_flush+0x80
vx_close+0x70
vn_close(inlined)
vno_close+0xe0
closef(inlined)

DESCRIPTION:
During a resize operation, the fsadm(1M) command freezes the file system. In an error case, the fsadm(1M) command exits without thawing the file system. This results in a hang.

RESOLUTION:
The code is modified to thaw the file system before the fsadm(1M) command exits in the error case.

Patch ID: PHKL_43475

* 2984718 (Tracking ID: 2970219)

SYMPTOM:
When CPUs are added to the system, the system may panic with the following stack trace:
fcache_as_map+0x70 ()
vx_fcache_map+0x1d0 ()
vx_write_default+0x340 ()
vx_write1+0xea0 ()
vx_rdwr+0x1130 ()
rfs3_write+0x5b0 ()
common_dispatch+0xc10 ()
rfs_dispatch+0x40 ()
svc_getreq+0x250 ()
svc_run+0x310 ()
svc_do_run+0xd0 ()
nfssys+0x7c0 ()
hpnfs_nfssys+0x60 ()
coerce_scall_args+0x130 ()
syscall+0x590 ()

DESCRIPTION:
The issue occurs because of a race condition between the vnode-map initialization and deinitialization.

RESOLUTION:
The code is modified to add debug messages that confirm whether a race condition exists between the vnode-map initialization and deinitialization. The debug messages help gather information if the problem occurs again.

* 3023946 (Tracking ID: 2616622)

SYMPTOM:
The performance of the mmap() function is slow when the file system block size is 8 KB and the page size is 4 KB.

DESCRIPTION:
When the file system block size is 8 KB, the page size is 4 KB, and the mmap() function is performed on an 8 KB file, the file gets represented in memory as two pages (0 and 1). When the memory at offset 0 in the mapping is modified, a page fault occurs for page 0 in the file.
When that disk block is allocated and marked as valid, the page mentioned in the fault request is expected to get flushed out to the disk and therefore, it is left uninitialized on the disk by default. Only that particular page is cleaned in memory and left modified so that it is known that the data in memory is more recent than the data on disk. However, the other half of the block (which could eventually be mapped to page 1) gets cleared with a synchronous write because such a fault may not occur. This synchronous clearing of the other half of the 8 KB block causes performance degradation.

RESOLUTION:
The code is modified to expand the range of the fault to cover the entire 8 KB block. The request from the OS for only one page is ignored and two pages are given to cover the entire file system block, which saves the separate synchronous clearing of the other half of the 8 KB block.

* 3023953 (Tracking ID: 2750860)

SYMPTOM:
On a large file system (4 TB or greater), the performance of the write(2) operation with many small request sizes may degrade, and many threads may be found sleeping with the following stack trace:
real_sleep
sleep_one
vx_sleep_lock
vx_lockmap
vx_getemap
vx_extfind
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_downlevel
vx_searchau_uplevel
vx_searchau
vx_extentalloc_device
vx_extentalloc
vx_te_bmap_alloc
vx_bmap_alloc_typed
vx_bmap_alloc
vx_write_alloc3
vx_recv_prealloc
vx_recv_rpc
vx_msg_recvreq
vx_msg_process_thread
kthread_daemon_startup

DESCRIPTION:
For a cluster-mounted file system, the free-extent-search algorithm is not optimized for a large file system (4 TB or greater), or for instances where the number of free Allocation Units (AUs) available can be very large.

RESOLUTION:
The code is modified to optimize the free-extent-search algorithm by skipping certain AUs. This reduces the overall search time.
* 3039824 (Tracking ID: 3031226)

SYMPTOM:
During internal SRP testing, the system panics with the following stack trace:
vx_dnlc_getpathname+0xa10
pfs_vop_dnlc_getpathname+0x68
secfs_dnlc_getpathname+0x10
vfs_stack_vop_dnlc_getpathname+0x6c
audit_get_pathname_from_dnlc+0x1a0
audit_clean_path+0x114
audit_build_full_dir_name+0x90
change_p_cdir+0x2c
ncf_srp_chdir+0x7c
chdir+0x94
syscall+0x318
$syscallrtn+0x0

DESCRIPTION:
The panic occurs because the vx_dnlc_getpathname() function dereferences a DNLC entry that is set to NULL.

RESOLUTION:
The code is modified in the vx_dnlc_getpathname() function to check the validity of the DNLC entry before dereferencing it.

* 3046923 (Tracking ID: 3046920)

SYMPTOM:
The system may panic with the following panic string:
"wait_for_lock: Already owns lock: vn_h_sl_pool"
The panic stack observed is as follows:
panic+0x410
wait_for_lock+0xa60
vfs_stack_lock_vp+0x90
vfs_teardown_stack+0x30
vx_inode.c:8133 vx_inactive(inlined)
vx_vn_inactive+0xd70
pfs_vop_inactive+0xc0
sec_file_rules:secfs_inactive+0x30
vfs_stack_vop_inactive+0xb0
$cold_vn_rele_inactive+0x10
unmapvnode+0x450
dispreg+0x5c0
exit_post_single_threaded_notify+0x2d0
exit+0x4e0
pm_issig.c:1797 psig_core(inlined)
$cold_psig+0x2a0
hl_ivt.c:597 post_hndlr(inlined)
vm_hndlr+0x840
bubbleup+0x880

DESCRIPTION:
The lock is still held at the VxFS layer when vfs_teardown_stack is called to destroy the vnode. As a result, the system panics.

RESOLUTION:
The code is modified to release the lock held at the VxFS layer before the vfs_teardown_stack is called.
* 3069179 (Tracking ID: 2966277)

SYMPTOM:
Systems with high file system activity, such as read, write, open, and lookup operations, may panic with the following stack trace due to a rare race condition:
spinlock+0x21 ( )
-> vx_rwsleep_unlock()
vx_ipunlock+0x40()
vx_inactive_remove+0x530()
vx_inactive_tran+0x450()
vx_local_inactive_list+0x30()
vx_inactive_list+0x420()
-> vx_workitem_process()
-> vx_worklist_process()
vx_worklist_thread+0x2f0()
kthread_daemon_startup+0x90()

DESCRIPTION:
ILOCK is released before an IPUNLOCK is done, which causes a race condition. This results in a panic when an inode that has been freed is accessed.

RESOLUTION:
The code is modified so that the ILOCK is used to protect the inode's memory from being freed while the memory is being accessed.

* 3069181 (Tracking ID: 3010444)

SYMPTOM:
On a Network File System (NFS) mounted file system, operations that read a file by using the cksum(1) command may fail with the following error message:
cksum: read error on : Bad address
The following error message is also seen in the syslog:
vmunix: WARNING: Synchronous Page I/O error

DESCRIPTION:
When the read-vnode operation (VOP_RDWR) is performed, certain requests are converted to direct I/O for optimization. However, the NFS buffers passed during the read requests are not user buffers. As a result, there is an error.

RESOLUTION:
The code is modified to convert the I/O requests to direct I/O only if the buffer passed during the I/O is a user buffer.

* 3069189 (Tracking ID: 3049408)

SYMPTOM:
When the system is under file-cache pressure, the find(1) command takes a long time to complete.

DESCRIPTION:
The Veritas File System (VxFS) does not grow the metadata-buffer cache under system or file-cache memory pressure. When the vx_bcrecycle_timelag factor drops to zero, the metadata buffers are reused immediately after they are accessed. As a result, a large-directory scan takes many physical I/Os to scan the directory.
The end result is that VxFS performs excessive re-reads of the same data into the metadata-buffer cache. However, file-cache memory pressure is normal, and it is no reason on its own to shrink the metadata-buffer cache.

RESOLUTION:
The code is modified to unlink the metadata-buffer cache behavior from the file-cache memory pressure.

* 3069236 (Tracking ID: 2874172)

SYMPTOM:
A Network File System (NFS) file creation thread might loop continuously with the following stack trace:
vx_getblk_cmn(inlined)
vx_getblk+0x3a0
vx_exh_allocblk+0x3c0
vx_exh_hashinit+0xa50
vx_dexh_create+0x240
vx_dexh_init+0x8b0
vx_do_create+0x1e0
vx_create1+0x1d0
vx_create0+0x270
vx_create+0x40
rfs3_create+0x420
common_dispatch+0xb40
rfs_dispatch+0x40
svc_getreq+0x250
svc_run+0x310
svc_do_run+0xd0
nfssys+0x6a0
hpnfs_nfssys+0x60
coerce_scall_args+0x130
syscall+0x590

DESCRIPTION:
The Veritas File System (VxFS) file creation vnode operation (VOP) routine expects the parent vnode to be a directory vnode pointer. But, the NFS layer passes a stale file vnode pointer by default. This might cause unexpected results, such as a hang during VOP handling.

RESOLUTION:
The code is modified to check the vnode type of the parent vnode pointer at the beginning of the create VOP call and return an error if it is not a directory vnode pointer.

* 3069265 (Tracking ID: 2830513)

SYMPTOM:
The Cluster File System (CFS) hangs while performing file removal operations and the following stack trace is displayed:
vxglm::vxg_grant_sleep+0x110 ()
vxglm::vxg_cmn_lock+0x5a0 ()
vxglm::vxg_api_lock+0x310 ()
vx_glm_lock+0x70 ()
vx_mdelelock+0x70 ()
vx_mdele_hold+0xe0 ()
vx_extfree+0x700 ()

DESCRIPTION:
The CFS hangs due to a missing unlock call in the file removal operations.

RESOLUTION:
The code is modified to add the missing unlock call in the file removal operations.
* 3092230 (Tracking ID: 2439261)

SYMPTOM:
When vx_fiostats_tunable is changed from zero to non-zero, the system panics with the following stack trace:
panic_save_regs_switchstack+0x110 ()
panic+0x410 ()
bad_kern_reference+0xa0 ()
$cold_pfault+0x5c0 ()
vm_hndlr+0x370 ()
bubbleup+0x880 ()
vx_fiostats_do_update+0x140 ()
vx_fiostats_update+0x170 ()
vx_read1+0x10e0 ()
vx_rdwr+0x790 ()
vno_rw+0xd0 ()
rwuio+0x32f ()
pread+0x121 ()
syscall+0x590 ()
in ?? ()

DESCRIPTION:
When vx_fiostats_tunable is changed from zero to non-zero, all the incore-inode fiostats attributes are set to NULL. When these attributes are accessed, the system panics due to the NULL pointer dereference.

RESOLUTION:
The code is modified so that, when vx_fiostats_tunable is changed from zero to non-zero, the fiostats attributes of the inode are checked for NULL before they are accessed. This prevents the panic.

Patch ID: PHKL_43260

* 1935624 (Tracking ID: 1903977)

SYMPTOM:
On a Cluster File System (CFS), the write operation may fail and the system panics with the following stack trace:
vx_active_common_flush+0x74()
vx_rwlock+0x14()

DESCRIPTION:
When a check is done to verify whether the VX_DLSYNCFREE flag is set inside the vx_getblk_clust() function, there is no lock to protect the check. As a result, a race condition can lead to inconsistent data, which causes the system to panic.

RESOLUTION:
The code is modified to synchronize the VX_DLSYNCFREE flag updates using the VX_FSQ_LOCK.

* 1954685 (Tracking ID: 1934537)

SYMPTOM:
The reverse-name-lookup operation on an inode may panic the machine with the following stack trace:
vx_free
vx_traverse_tree+0x4a0
vx_dir_lookup+0x1e4
vx_rev_namelookup+0x294
vx_aioctl_common+0xac4
vx_aioctl+0x12c
vx_ioctl+0xe0

DESCRIPTION:
During the lookup operation, if the memory allocation fails, the failed allocation is still added to the used memory count, and the code later tries to free that memory. This results in the panic.
RESOLUTION:
The code is modified to update the usage counters correctly, and to skip updating the count during the error condition. An additional check is added to free only the non-null buffers.

* 2798208 (Tracking ID: 2767579)

SYMPTOM:
The system hangs during a lookup operation with the following stack trace:
vx_dnlc_pathname_realloc+0x80 ()
vx_dnlc_getpathname+0xcf0 ()
audit_get_pathname_from_dnlc+0x370 ()
audit_dnlc_path_name+0x70
ftruncate+0x480 ()
syscall+0x590 ()

DESCRIPTION:
The system hangs because of an infinite loop that gets triggered when an inode with a negative DNLC entry is encountered during a reverse name DNLC lookup.

RESOLUTION:
The code is modified to add an avoidance fix that prevents the infinite loop from occurring.

* 2852520 (Tracking ID: 2850738)

SYMPTOM:
The system may hang with the following stack trace during a low memory condition:
swtch_to_thread(inlined)
slpq_swtch_core+0x520
real_sleep(inlined)
sleep+0x400
mrg_reserve_swapmem(inlined)
$cold_steal_swap+0x460
$cold_kalloc_nolgpg+0x4b0
kalloc_internal(inlined)
$cold_kmem_arena_refill+0x650
kmem_arena_varalloc+0x280
vx_alloc(inlined)
vx_worklist_enqueue+0x40
vx_buffer_kmcache_callback+0x160
kmem_gc_arena(inlined)
foreach_arena_ingroup+0x840
kmem_garbage_collect_group(inlined)
kmem_garbage_collect+0x390
kmem_arena_gc+0x240
kthread_daemon_startup+0x90

DESCRIPTION:
The VxFS kernel memory callback() routine allocates memory with the M_WAITOK flag. This results in a system hang during a low memory condition because the callback() routine waits for the memory allocation.

RESOLUTION:
The code is modified to allocate memory without waiting in the VxFS kernel memory callback() routine.
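
The difference between a waiting and a non-waiting allocation inside a memory-reclaim callback, as described for incident 2852520, can be sketched as follows. This is a hypothetical Python model, not VxFS code: the Pool class, its slot count, and gc_callback() are invented for illustration, and a semaphore stands in for free kernel memory.

```python
import threading


class Pool:
    """Toy stand-in for a memory arena with a fixed number of free slots."""

    def __init__(self, slots: int) -> None:
        self._free = threading.Semaphore(slots)

    def alloc(self, wait: bool) -> bool:
        # wait=True models a waiting (M_WAITOK-style) allocation: the caller
        # sleeps until a slot is free. wait=False returns False immediately
        # when nothing is available.
        return self._free.acquire(blocking=wait)

    def free(self) -> None:
        self._free.release()


def gc_callback(pool: Pool, worklist: list) -> bool:
    """Reclaim callback that must never sleep waiting for memory.

    It runs because memory is already short, so a waiting allocation here
    could wait on the very condition the callback is meant to relieve.
    """
    if not pool.alloc(wait=False):
        return False  # Give up for this pass instead of sleeping.
    worklist.append("shrink-buffer-cache")
    return True


pool = Pool(slots=1)
work = []
print(gc_callback(pool, work))  # A slot is available.
print(gc_callback(pool, work))  # Pool exhausted: returns without blocking.
```

The second call returns immediately rather than sleeping, which is the behavior the resolution describes.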
Patch ID: PHKL_43062

* 2036217 (Tracking ID: 2019793)

SYMPTOM:
While unmounting the file system, the system may panic and the following stack trace is displayed:
vx_set_tunefs+0x264()
vx_aioctl_full+0xc7c()
vx_aioctl_common+0x738()
vx_aioctl+0x13c()
vx_ioctl+0xe4()
syscall+0xcc()

DESCRIPTION:
Due to a race condition in updating the fse_fs and fse_zombie pointers during the umount operation, there may be a small window where the two pointers are out of sync, which results in the panic.

RESOLUTION:
The code is modified to set the fsext->fse_zombie pointer before the fsext->fse_fs pointer to keep them consistent.

* 2043627 (Tracking ID: 2028782)

SYMPTOM:
For a file managed by Hierarchical Storage Management (HSM), the application file quota gets doubled after an HSM migrate or recall process.

DESCRIPTION:
A file has a set quota when it is created. For efficient storage, the data is moved from the high speed disk to the low speed tape using HSM. This process is called HSM migrate. When the data is recalled from the tape to the disk, the HSM quota for the file gets updated and doubled.

RESOLUTION:
The code is modified to handle quota updates for files managed by HSM correctly.

* 2410793 (Tracking ID: 1466351)

SYMPTOM:
The mount operation hangs in vx_bc_binval_cookie with the following stack trace:
delay
vx_bc_binval_cookie
vx_blkinval_cookie
vx_freeze_flush_cookie
vx_freeze_all
vx_freeze
vx_set_tunefs1
vx_set_tunefs
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
genunix:ioctl
unix:syscall_trap32

DESCRIPTION:
The hanging process waits for a buffer to be unlocked. That buffer can be released only if its associated cloned map writes are flushed, but a necessary flush is missed.

RESOLUTION:
The code is modified to synchronize cloned map writes so that all the cloned maps are cleared and the buffers associated with them are released.
* 2429335 (Tracking ID: 2337470)

SYMPTOM:
The Cluster File System (CFS) can unexpectedly and prematurely report a 'file system out of inodes' error when attempting to create a new file. The following error message is displayed:
vxfs: msgcnt 1 mesg 011: V-2-11: vx_noinode - /dev/vx/dsk/dg/vol file system out of inodes.

DESCRIPTION:
While allocating new index nodes (inodes) in a CFS, Veritas File System (VxFS) searches for an available free inode in the Inode Allocation Units (IAUs) that are delegated to the local node. If none are available, it searches the IAUs that are not delegated to any node, or revokes an IAU delegated to another node. Gaps may be created in the IAU structures as a side effect of the CFS delegation processing. However, if VxFS ignores these gaps while searching for an available free inode, new IAUs cannot be created once the metadata structures reach their maximum size (2^31). Therefore, one of the gaps must be populated and used for the allocation of the new inode. If the gaps are ignored, VxFS may prematurely report the "file system out of inodes" error message even though there is enough free space in the VxFS file system to create new inodes.

RESOLUTION:
The code is modified to allocate new inodes from the gaps in the IAU structures created as a part of the CFS delegation processing.

* 2730965 (Tracking ID: 2730759)

SYMPTOM:
The sequential read performance is poor because of read-ahead issues.

DESCRIPTION:
The read-ahead on sequential reads is performed incorrectly because the wrong read-advisory and read-ahead pattern offsets are used to detect and perform the read-ahead. Also, more sync reads are performed, which can affect the performance.

RESOLUTION:
The code is modified so that the read-ahead pattern offsets are updated correctly to detect and perform the read-ahead at the required offsets. The read-ahead detection is also modified to reduce the sync reads.
* 2796940 (Tracking ID: 2599590)

SYMPTOM:
Expansion of a 100% full file system may panic the machine with the following stack trace:
bad_kern_reference()
$cold_vfault()
vm_hndlr()
bubbledown()
vx_logflush()
vx_log_sync1()
vx_log_sync()
vx_worklist_thread()
kthread_daemon_startup()

DESCRIPTION:
When a 100% full file system is expanded, the intent log of the file system is truncated and the blocks that are freed up are used during the expansion. Due to a bug, the block map of the replica intent log inode was not updated, which caused the block maps of the two inodes to differ. This caused some of the in-core structures of the intent log to become NULL. The machine panics while dereferencing one of these structures.

RESOLUTION:
The code is modified to update the block map of the replica intent log inode correctly. A 100% full file system can now be expanded only if the last extent in the intent log contains more than 32 blocks; otherwise, the fsadm(1M) command fails. To expand such a file system, delete some of the files manually and retry the resize operation.

* 2798203 (Tracking ID: 2316793)

SYMPTOM:
After removing the files in a file system, the df(1M) command, which uses the statfs(2) function, may take 10 seconds to complete.

DESCRIPTION:
To obtain an up-to-date and valid free block count in a file system, a delay and retry loop delays for one second and retries 10 times. This excessive retrying causes a 10 second delay per file system while executing the df(1M) command.

RESOLUTION:
The code is modified to reduce the original 10 retries with a one second delay each to one retry after a 20 millisecond delay.
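
The effect of the change in incident 2798203 on the worst-case wait can be worked out with a small sketch. The helper names below are hypothetical and this is not the VxFS implementation; only the retry counts and delays (10 retries of one second each, reduced to one retry after 20 milliseconds) come from the text above.

```python
from typing import Callable, Optional


def worst_case_delay_ms(retries: int, delay_ms: int) -> int:
    """Total time spent sleeping if every retry is needed."""
    return retries * delay_ms


def read_with_retry(
    read_count: Callable[[], Optional[int]],
    retries: int,
    delay_ms: int,
    sleep_ms: Callable[[int], None],
) -> Optional[int]:
    """Read a free block count, retrying while it is not yet valid.

    read_count returns None while the count is still being updated;
    sleep_ms is injected so the loop can be exercised without waiting.
    """
    count = read_count()
    attempts = 0
    while count is None and attempts < retries:
        sleep_ms(delay_ms)
        count = read_count()
        attempts += 1
    return count


OLD_WORST_CASE_MS = worst_case_delay_ms(10, 1000)  # 10 retries, 1 s each
NEW_WORST_CASE_MS = worst_case_delay_ms(1, 20)     # 1 retry after 20 ms
print(OLD_WORST_CASE_MS, NEW_WORST_CASE_MS)
```

Under these figures the worst-case sleep per file system drops from 10000 ms to 20 ms.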
* 2806468 (Tracking ID: 2806466)

SYMPTOM:
A reclaim operation on a file system mounted on a Logical Volume Manager (LVM) volume, by using the fsadm(1M) command with the '-R' option, may panic the system and the following stack trace is displayed:
vx_dev_strategy+0xc0()
vx_dummy_fsvm_strategy+0x30()
vx_ts_reclaim+0x2c0()
vx_aioctl_common+0xfd0()
vx_aioctl+0x2d0()
vx_ioctl+0x180()

DESCRIPTION:
Thin reclamation is supported only on file systems mounted on a Veritas Volume Manager (VxVM) volume.

RESOLUTION:
The code is modified to error out gracefully if the underlying volume is LVM.

* 2831287 (Tracking ID: 1703223)

SYMPTOM:
The internal local mount test exits due to a full fsck operation failure.

DESCRIPTION:
While testing, if a directory inode is set with the extended VX_IEREMOVE operation, the files in that directory must also be marked with the extended operation. If not all the files inside that directory are set with the VX_IEREMOVE operation, the result is a number of unreferenced files and the test fails.

RESOLUTION:
The code is modified to remove the file entries if the directory inode has the VX_IEREMOVE operation set.

* 2832560 (Tracking ID: 2829708)

SYMPTOM:
On a machine with a locally mounted Veritas File System (VxFS), the system hangs due to low memory during an internal test.

DESCRIPTION:
The system hangs because the memory allocated to the structure used to enqueue or dequeue work items in batches is not freed.

RESOLUTION:
The code is modified to free the memory.

* 2847808 (Tracking ID: 2845175)

SYMPTOM:
When the Access Control List (ACL) feature is enabled, the system may panic with the "Data Key Miss Fault in KERNEL mode" error message in the vx_do_getacl() function and the following stack trace is displayed:
vx_do_getacl+0x840 ()
vx_getacl+0x70 ()
acl+0x480 ()

DESCRIPTION:
In the vx_do_getacl() function, a local variable is accessed without being initialized, which leads to a panic.
RESOLUTION:
The code is modified to initialize the local variable to NULL before using it.

Patch ID: PHKL_42892

* 1946124 (Tracking ID: 1797955)

SYMPTOM:
In a setup that involves files with large extents (greater than 32 MB), which have encountered state map corruptions previously, the file system is disabled and the following message is displayed in the syslog:
WARNING: msgcnt 3 mesg 037: V-2-37: vx_metaioerr - vx_tflush_1 - /dev/vx/dsk// file system meta data write error in dev/block ?/????

DESCRIPTION:
The file system is disabled after an existing corruption is discovered. According to the VxFS I/O error policy, the file system must be disabled only when a read/write operation fails with an I/O error to prevent further corruption and not when an existing corruption is discovered.

RESOLUTION:
The code is modified so that the file system is not disabled in case of a discovered corruption.

* 1995390 (Tracking ID: 1985626)

SYMPTOM:
The system panics during multiple parallel unmount operations. The following stack trace is displayed:
vx_freeze_idone_list+24c
vx_workitem_process+10
vx_worklist_process+344
vx_worklist_thread+94

DESCRIPTION:
During simultaneous unmount operations, a race condition occurs between the INODE_DEINIT() and FREEZE_IDONE_LIST() functions. While the INODE_DEINIT() function uses locks, the FREEZE_IDONE_LIST() function does not use locks while updating a pointer value. Hence, there is a possibility of a pointer having a NULL value, which gets de-referenced. This results in a panic.

RESOLUTION:
The code is modified to serialize the inode de-initialization operation.

* 2036843 (Tracking ID: 2026799)

SYMPTOM:
A hang occurs while enforcing the policy on the File Change Log (FCL) inode and the following stack trace is displayed:
lwp_park()
cond_wait_queue()
cond_wait()
pthread_cond_wait()
ts_drain_masters()
do_walk_ilist()
do_process()
do_enforce()
main()
_start()

DESCRIPTION:
While enforcing the policy on FCL, the file system is frozen.
After enforcing the policy on FCL, a function is called which checks whether the file system is frozen. If it is frozen, the function returns the EACTIVERETRY error. This error is propagated to the calling functions. The AIOCTL_COMMON() function performs the retry operation endlessly, resulting in a hang.
RESOLUTION:
The code is modified to return the EACTIVERETRY error only if the file system is not in the frozen state.

* 2084004 (Tracking ID: 2040647)
SYMPTOM:
Quota on a cluster-mounted file system does not enforce the hard limit.
DESCRIPTION:
The information on the hard limit is stored in an unsigned integer. When a larger value is subtracted from it, it wraps around and stores an incorrect value.
RESOLUTION:
The code is modified to handle the wrap-around, and special code is added to identify the quota soft limit in the Cluster File System (CFS) environment.

* 2092072 (Tracking ID: 2029556)
SYMPTOM:
When the file system is full, it is not possible to remove a file which has more than 13 hard links. This may lead to a panic with the following stack trace:
  panicsys()
  vpanic()
  panic()
  mutex_panic()
  vx_iunlock()
  vx_remove_tran()
  vx_do_remove()
  vx_remove1()
  vx_remove()
  vn_remove()
  unlink()
  syscall_trap32()
DESCRIPTION:
While removing or updating an inode, a new attribute inode is allocated and then modified. When the file system is full, no new attribute inodes can be allocated and the above operation may panic the system.
RESOLUTION:
The code is modified such that the file system can modify the existing attribute inode rather than allocating a new attribute inode.
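Note on incident 2084004: the wrap-around it describes can be modeled in a few lines. The following Python sketch is illustrative only and is not the VxFS code. The function names and the 32-bit width are assumptions chosen to mimic C unsigned arithmetic.

```python
# Illustrative model only; names and the 32-bit width are assumptions,
# not taken from the VxFS source.
UINT32_MASK = 0xFFFFFFFF


def remaining_unsigned(hard_limit, usage):
    """Subtract as a C 'unsigned int' would: the result wraps modulo 2**32."""
    return (hard_limit - usage) & UINT32_MASK


def remaining_checked(hard_limit, usage):
    """Guard the subtraction so that usage at or above the limit yields 0."""
    if usage >= hard_limit:
        return 0
    return hard_limit - usage


if __name__ == "__main__":
    # Usage exceeds the hard limit by one block.
    print(remaining_unsigned(100, 101))  # very large value: limit looks far away
    print(remaining_checked(100, 101))   # zero: the limit has been reached
```

With the unguarded subtraction, a caller that tests "remaining > 0" would keep allowing allocations, which matches the symptom of a hard limit that is not enforced.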
* 2194629 (Tracking ID: 2161379)
SYMPTOM:
In a Cluster File System (CFS) environment, various file system operations hang with the following stack traces:
  T1:
  vx_event_wait()
  vx_async_waitmsg()
  vx_msg_send()
  vx_iread_msg()
  vx_rwlock_getdata()
  vx_glm_cbfunc()
  vx_glmlist_thread()
  T2:
  vx_ilock()
  vx_assume_iowner()
  vx_hlock_getdata()
  vx_glm_cbfunc()
  vx_glmlist_thread()
DESCRIPTION:
Due to improper handling of the ENOTOWNER error in the ireadreceive() function, the operation is retried repeatedly while holding an inode lock. All the other threads are blocked, thus causing a deadlock.
RESOLUTION:
The code is modified to release the inode lock on the ENOTOWNER error and acquire it again, thus resolving the deadlock.

* 2222508 (Tracking ID: 2192895)
SYMPTOM:
A system panic occurs when executing File Change Log (FCL) commands and the following stack trace is displayed:
  panicsys()
  panic_common()
  panic()
  vmem_xalloc()
  vmem_alloc()
  segkmem_xalloc()
  segkmem_alloc_vn()
  vmem_xalloc()
  vmem_alloc()
  kmem_alloc()
  vx_getacl()
  vx_getsecattr()
  fop_getsecattr()
  cacl()
  acl()
  syscall_trap32()
DESCRIPTION:
The Access Control List (ACL) count in the inode can be corrupted due to a race condition. For example, the setacl() function can change the ACL count while the getacl() function is processing the same inode. This results in an incorrect ACL count.
RESOLUTION:
The code is modified to protect the vulnerable ACL count from corruption.

* 2370061 (Tracking ID: 2370046)
SYMPTOM:
Read-ahead operations fail to read the early blocks of data when the "read_nstream" tunable is set to a value other than 1.
DESCRIPTION:
The read-ahead operation reads the file on demand and does not read the portion of the file which is to be read in advance. This occurs because the parameter that determines the next read-ahead offset is incorrectly reset.
RESOLUTION:
The code is modified so that the read-ahead length is set correctly.
* 2720002 (Tracking ID: 1396859)
SYMPTOM:
The internal spin watcher tools show heavy contention on the buffer freelist lock, bc_freelist_lock, even when the I/O loads are predominantly direct I/Os.
DESCRIPTION:
Direct I/O data need not be cached in file system buffers. However, Veritas File System (VxFS) maintains empty buffers to track these requests. Hence, contention is seen even when I/O is direct in nature.
RESOLUTION:
The code is modified to have a separate arena to track direct I/O buffers so that the contention on the buffer freelist lock is reduced.

* 2722869 (Tracking ID: 1244756)
SYMPTOM:
A lookup operation on a VxFS file system may fail with the following stack trace:
  vx_cbdnlc_purge_iplist+0x64
  vx_inode_free_list+0x160
  vx_ifree_scan_list+0xf8
  vx_workitem_process+0x10
  vx_worklist_process+0x17c
  vx_worklist_thread+0x94
DESCRIPTION:
Due to a race condition between the Directory Name Lookup Cache (DNLC) lookup and the DNLC get functions, there is an attempt to move a DNLC entry to the tail of the freelist in the lookup function when it has already been removed from the freelist by the DNLC get function. This leads to a NULL pointer dereference.
RESOLUTION:
The code is modified to verify that the DNLC entry is present on the freelist before it is moved to the tail by the DNLC get function.

* 2722958 (Tracking ID: 2696067)
SYMPTOM:
When a getaccess() command is issued on a file which inherits the default Access Control List (ACL) entries from the parent, it shows incorrect group object permissions.
DESCRIPTION:
If a newly created file leverages the ACL entries of its parent directory, the vx_daccess() function does not fabricate any GROUP_OBJ entry, unlike the vx_do_getacl() function.
RESOLUTION:
The code is modified to fabricate a GROUP_OBJ entry.

* 2726001 (Tracking ID: 2371710)
SYMPTOM:
The user quota file gets corrupted when the DELICACHE feature is enabled, and the current inode usage of a user becomes negative after frequent file creations and deletions.
If the quota information is checked using the vxquota command with the '-vu username' option, the number of files is "-1". For example:
  # vxquota -vu testuser2
  Disk quotas for testuser2 (uid 500):
  Filesystem    usage    quota    limit  timeleft  files  quota  limit  timeleft
  /vol01      1127809  8239104  8239104               -1      0      0
DESCRIPTION:
This issue is introduced by the inode DELICACHE feature, which is a performance enhancement to optimize the updates done to the inode map during file creation and deletion operations. The feature is enabled by default, and can be changed by the vxtunefs(1M) command. When DELICACHE is enabled and the quota is set for Veritas File System (VxFS), there is an extra quota update for the inodes on the inactive list during the removal process. Since this quota has been updated already before being put on the DELICACHE list, the current number of user files gets decremented twice.
RESOLUTION:
The code is modified to add a flag to identify the inodes which have been moved to the inactive list from the DELICACHE list. This flag is used to prevent decrementing the quota again during the removal process.

* 2730957 (Tracking ID: 2651922)
SYMPTOM:
On a local VxFS file system, the ls(1) command with the '-l' option runs slowly and high CPU usage is observed.
DESCRIPTION:
Currently, Cluster File System (CFS) inodes are not allowed to be reused as local inodes, to avoid a Global Lock Manager (GLM) deadlock issue when Veritas File System (VxFS) reconfiguration is in process. Hence, if a VxFS local inode is needed and the free lists are almost filled up with CFS inodes, all the inode free lists need to be traversed to find a local inode.
RESOLUTION:
The code is modified to add a global variable, 'vxi_icache_cfsinodes', to count the CFS inodes in the inode cache. The condition is relaxed for converting a cluster inode to a local inode when the number of in-core CFS inodes is greater than the 'vx_clreuse_threshold' threshold and reconfiguration is not in progress.
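Note on incident 2726001: the double decrement and the flag-based fix can be illustrated with a small model. This Python sketch is an assumption-laden illustration, not the VxFS implementation. The class, method, and flag names are hypothetical, and the inode lifecycle is reduced to the steps named in the description above.

```python
# Illustrative model only; class, method, and flag names are hypothetical.
from dataclasses import dataclass


@dataclass
class Inode:
    uid: int
    from_delicache: bool = False  # hypothetical flag added by the fix


class QuotaModel:
    def __init__(self, use_flag):
        self.use_flag = use_flag
        self.files = {}  # uid -> current number of files charged to the user

    def create(self, uid):
        self.files[uid] = self.files.get(uid, 0) + 1
        return Inode(uid)

    def delete_to_delicache(self, inode):
        # The quota is updated before the inode is put on the DELICACHE list.
        self.files[inode.uid] -= 1

    def move_to_inactive(self, inode):
        # The fix marks inodes that arrive from the DELICACHE list.
        if self.use_flag:
            inode.from_delicache = True

    def remove_from_inactive(self, inode):
        # Without the flag check, the count is decremented a second time.
        if not inode.from_delicache:
            self.files[inode.uid] -= 1


def files_after_lifecycle(use_flag, uid=500):
    """Create one file, delete it, and return the user's file count."""
    model = QuotaModel(use_flag)
    inode = model.create(uid)
    model.delete_to_delicache(inode)
    model.move_to_inactive(inode)
    model.remove_from_inactive(inode)
    return model.files[uid]
```

Calling files_after_lifecycle(False) reproduces the "-1" file count shown in the vxquota output, while files_after_lifecycle(True) leaves the count at zero.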
* 2733704 (Tracking ID: 2421901)
SYMPTOM:
An internal stress test on a locally mounted VxFS file system resulted in the following assert:
  f:vx_getinoquota:1a
DESCRIPTION:
While reusing an inode from the inactive inode list, the inode field that contains quota information is expected to be NULL. While moving the inode from the delicache list to the inactive inode list, the quota information field is not set to NULL. This results in the assert.
RESOLUTION:
The code is modified to reset the quota information field of the inode while moving it from the delicache list to the inactive inode list.

INSTALLING THE PATCH
--------------------
To install the VxFS 5.0.1-11.31 patch:

a) To install this patch on a CVM cluster, install it one system at a time so that all the nodes are not brought down simultaneously.

b) The VxFS 11.31 pkg with revision 5.0.31.5 must be installed before applying the patch.

c) To verify the VERITAS file system level, execute:
     # swlist -l product | egrep -i 'VRTSvxfs'
     VRTSvxfs 5.0.31.5 VERITAS File System
   Note: VRTSfsman is a corequisite for VRTSvxfs. So, VRTSfsman also needs to be installed with VRTSvxfs.
     # swlist -l product | egrep -i 'VRTS'
     VRTSvxfs 5.0.31.5 Veritas File System
     VRTSfsman 5.0.31.5 Veritas File System Manuals

d) All prerequisite/corequisite patches have to be installed. The kernel patch requires a system reboot for both installation and removal.

e) To install the patch, execute the following command:
     # swinstall -x autoreboot=true -s PHKL_44288, PHCO_44287, PHCO_44286
   If the patch is not registered, you can register it using the following command:
     # swreg -l depot
   The is the absolute path where the patch resides.
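The prerequisite check in steps b) and c) can also be scripted. The following Python sketch is only a suggestion and is not shipped with the patch. It assumes that the text of "swlist -l product" has already been captured into a string, and that each product line starts with the product name followed by its revision, as in the sample output above. The function names are hypothetical.

```python
# Illustrative helper only; not part of the patch. It parses text that has
# already been captured from 'swlist -l product' and does not run swlist.
REQUIRED_REVISION = "5.0.31.5"
REQUIRED_PRODUCTS = ("VRTSvxfs", "VRTSfsman")


def installed_revisions(swlist_output):
    """Map each VRTS product name to the revision listed next to it."""
    revisions = {}
    for line in swlist_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("VRTS"):
            revisions[fields[0]] = fields[1]
    return revisions


def missing_prerequisites(swlist_output,
                          required=REQUIRED_PRODUCTS,
                          revision=REQUIRED_REVISION):
    """Return the required products that are absent or at another revision."""
    found = installed_revisions(swlist_output)
    return [name for name in required if found.get(name) != revision]


if __name__ == "__main__":
    sample = (
        "  VRTSvxfs   5.0.31.5  Veritas File System\n"
        "  VRTSfsman  5.0.31.5  Veritas File System Manuals\n"
    )
    print(missing_prerequisites(sample))
```

An empty list means that both VRTSvxfs and VRTSfsman are present at revision 5.0.31.5.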
REMOVING THE PATCH
------------------
a) To remove the patch, enter the following command:
     # swremove -x autoreboot=true PHCO_44287 PHKL_44288 PHCO_44286

SPECIAL INSTRUCTIONS
--------------------
NONE

OTHERS
------
PATCH ID:PHCO_44165

* INCIDENT NO:3554807 TRACKING ID:2134670
SYMPTOM:
The extendfs(1M) utility fails when the file system is marked for the full file-system check (fsck). The following error message is displayed:
  # extendfs
  extendfs: /etc/default/fs is used for determining the file system type
  UX:vxfs fsck: ERROR: V-3-20858: cannot bmap stateino
  UX:vxfs extendfs: ERROR: V-3-20141: Invocation of the fsck program terminated abnormally. The file system is marked bad. Run full fsck manually.
DESCRIPTION:
The extendfs(1M) utility increases the file system size through fsck. It calls fsck with the "extendfs" option; fsck in turn increases the size of the file system and modifies the relevant data structures by calculating the new size of the various metadata. The number of blocks to be allocated for the state-inode file is calculated incorrectly. As a result, one block too few is allocated and the fsck operation fails. An extra block is allocated only when there is an extra byte that does not fit in the existing blocks. However, for one allocation unit (AU) there may be only two, or four to six, extra bits, which is less than 1 byte, and these may still require an extra block. Because the extra space required in the state-inode file is less than 1 byte, it is ignored. As a result, the extendfs(1M) utility fails.
RESOLUTION:
The code is modified to allocate a block in the state-inode file, even when the extra space that is required is less than a byte.

* INCIDENT NO:3593545 TRACKING ID:3593546
SYMPTOM:
The fsadm(1M) command fails to shrink the file system, even when free space is available.
The following error message is displayed:
  UX:vxfs fsadm: INFO: V-3-23586: is currently 157155328 sectors - size will be reduced
  UX:vxfs fsadm: ERROR: V-3-20343: cannot shrink - blocks are currently in use.
DESCRIPTION:
The VxFS library inochk_typd() function iterates over the extents of an inode that has the typed-extent descriptors. The indirect-address extents in an inode are mapped to the inochk_typd() function incorrectly. As a result, the fsadm(1M) command is unable to read the information from the indirect-address extents.
RESOLUTION:
The code is modified so that the indirect-address extents in an inode are mapped to the inochk_typd() function correctly.

PATCH ID:PHCO_43853

* INCIDENT NO:2810115 TRACKING ID:2597347
SYMPTOM:
The fsck(1M) command dumps core with the following stack trace:
  _lwp_kill
  abort
  iget_i
  iget
  ag_ivalidate
  ag_validate_repino
  ag_olt_init
  ag_init
  process_device
  main
  _start
DESCRIPTION:
The fsck(1M) command dumps core when one of the device records gets corrupted, while the replica-device record is valid. The fsck(1M) command is unable to reconstruct the corrupted record from the valid record.
RESOLUTION:
The code is modified to reconstruct the corrupted-device record using the valid replica-device record.

* INCIDENT NO:3404271 TRACKING ID:3284698
SYMPTOM:
On a 32 TB file system, the fsck(1M) operation fails with the following error message:
  root@hostname> fsck -F vxfs -o full
  UX:vxfs fsck: ERROR: V-3-25289: could not seek to block offset devid/blknum failed
  UX:vxfs fsck: ERROR: V-3-20694: cannot initialize aggregate
  file system check failure, aborting ...
DESCRIPTION:
The problem is observed when a 64-bit value is typecast to a 32-bit value, which then evaluates to a negative number. As a result, the fsck(1M) operation fails with an error message.
RESOLUTION:
The code is modified to remove the incorrect typecast.
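Note on incident 3404271: the effect of casting a 64-bit value to a signed 32-bit value can be reproduced outside the kernel. The Python sketch below uses the standard ctypes integer types, which do not perform overflow checking, to mimic the C cast. The function names and the sample offset are assumptions for illustration and are not taken from the fsck source.

```python
# Illustrative model only; the sample offset is hypothetical.
import ctypes


def cast_to_int32(value):
    """Mimic a C cast to a signed 32-bit integer (the high bits are dropped)."""
    return ctypes.c_int32(value).value


def keep_as_int64(value):
    """Keep the value in a signed 64-bit integer, as the fix effectively does."""
    return ctypes.c_int64(value).value


if __name__ == "__main__":
    block_offset = 3 * 2**30  # needs more than 31 bits
    print(cast_to_int32(block_offset))  # negative after truncation
    print(keep_as_int64(block_offset))  # unchanged
```

Any offset at or above 2**31 becomes negative or otherwise wrong after the 32-bit cast, so a seek that uses the truncated value fails.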
PATCH ID:PHCO_43599

* INCIDENT NO:2340734 TRACKING ID:2282201
SYMPTOM:
On a Veritas File System (VxFS), the vxdump(1M) operation running in parallel with other file system operations, such as create and delete, can fail with the SIGSEGV signal, generating a core file with the following stack trace:
  _kill(), at 0xff14c0c0
  __sighndlr(), at 0xff148b4c
  chkdblk(), at 0x1c1ac
  dsrch(), at 0x1bf7c
  add(), at 0x1b030
  pass(), at 0x1ac58
  main(), at 0x17550
DESCRIPTION:
The vxdump(1M) command caches the inodes to be dumped in a bit map before starting the dump of a directory. The number of inodes can change if create or delete operations happen in the background. This can lead to an inconsistent bit map, which is responsible for generating the core file.
RESOLUTION:
The code is modified to refresh the inode bit map before actually starting the dump operation, thus avoiding the core file generation.

* INCIDENT NO:3232402 TRACKING ID:3107628
SYMPTOM:
The vxdump(1M) utility incorrectly estimates the number of tapes required to complete the backup and prematurely prompts for the next tape.
DESCRIPTION:
The vxdump(1M) utility prematurely prompts for the next tape.
RESOLUTION:
The code is modified to fix the initialization of the track and tape variables.

PATCH ID:PHCO_43476

* INCIDENT NO:3024008 TRACKING ID:2858683
SYMPTOM:
The reserve-extent attributes are changed after the vxrestore(1M) operation, for files that are greater than 8192 bytes.
DESCRIPTION:
A local variable that holds the number of reserve bytes is reused during the vxrestore(1M) operation, before the subsequent VX_SETEXT ioctl call, for files that are greater than 8 KB. As a result, the attribute information is changed.
RESOLUTION:
The code is modified to preserve the original variable value till the end of the function.
* INCIDENT NO:3069242 TRACKING ID:2964018
SYMPTOM:
On a high-end machine with about 125 CPUs, operations that use the lstat64(2) function may appear to hang, and the following stack trace is observed:
  spinlock+0xe0
  rwspin_wrlock+0x30
  specvp+0x510
  vx_lookup+0x8a0
  -> lookuppnvp(inlined)
  -> lookuppn(inlined)
DESCRIPTION:
The statvfsdev search calls the devnm() function to search the whole /dev/ directory for the reverse pathname.
RESOLUTION:
The code is modified such that a new fs_load() function is implemented to make use of the incoming file descriptor, if it is already a character device. However, the devnm() function is still needed if the incoming file descriptor is a block device.

PATCH ID:PHCO_43261

* INCIDENT NO:1935635 TRACKING ID:1742707
SYMPTOM:
Mounting a Cluster File System (CFS) fails with the following usage message:
  UX:vxfs fsck_logv : INFO: V-3-20896: Usage: fsck [-V] [-F vxfs] [-mnNyY] [-o full, nolog, mounted, p] special [...]
DESCRIPTION:
The "switchout fsck" command needs to be invoked for CFS with two separate options: "-o" and "mounted". However, the "switchout fsck" command gets invoked with the single "-o mounted" option. As a result, the error occurs.
RESOLUTION:
The code is modified so that the "switchout fsck" command gets invoked with two different options, "-o" and "mounted", instead of being invoked with the single "-o mounted" option.

* INCIDENT NO:2822988 TRACKING ID:2822984
SYMPTOM:
When the extendfs(1M) command extends a file system that is greater than 2 TB, the command fails and the following error message is displayed:
  "UX:vxfs fsck: ERROR: V-3-25315: could not seek to block offset"
DESCRIPTION:
This is a typecasting problem. When the extendfs(1M) command tries to extend the file system, the bs_bseek() function is invoked. The return type of the bs_bseek() function is a 32-bit integer. This value becomes negative for offsets greater than 2 TB and results in failure.
RESOLUTION:
The code is modified to resolve the typecasting problem.
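Note on incident 1935635: the difference between passing "-o mounted" as one argument and passing "-o" and "mounted" as two arguments can be seen with any getopt-style parser. The Python sketch below uses the standard getopt module as a stand-in. It does not reproduce the parser used by the fsck command, and the device path is hypothetical.

```python
# Illustrative only; Python's getopt stands in for a getopt-style parser.
import getopt


def o_option_values(argv):
    """Return the values that a getopt-style parser sees for the -o option."""
    opts, _remaining = getopt.getopt(argv, "o:")
    return [value for flag, value in opts if flag == "-o"]


if __name__ == "__main__":
    device = "/dev/vx/dsk/dg/vol"  # hypothetical device path
    # Two separate arguments: the option value is exactly "mounted".
    print(o_option_values(["-o", "mounted", device]))
    # One combined argument: the value keeps the embedded leading space.
    print(o_option_values(["-o mounted", device]))
```

In the combined form the option value is " mounted" with a leading space, which no longer matches the expected keyword. A parser that compares option values strictly would then reject the invocation and print its usage message.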
PATCH ID:PHCO_42893

* INCIDENT NO:1946136 TRACKING ID:1859532
SYMPTOM:
The umount(1M) operation on a file system checkpoint fails with a file system busy message after the File Change Log (FCL) Application Programming Interfaces (APIs) are used for FCL operations.
DESCRIPTION:
The FCL file is not closed while using FCL APIs for a read-only checkpoint. This leads to the failure of the umount(1M) operation, as the file remains open.
RESOLUTION:
The code is modified to close the FCL file even for a read-only checkpoint.

* INCIDENT NO:2036204 TRACKING ID:2028811
SYMPTOM:
The ncheck() function dumps core in the printname() function during heavy I/O and the following stack trace is displayed:
  printname+0x24c()
  do_dirblk+0x21c()
  nxtpass+0x3a4()
  check+0x488()
  checkallfilesets+0x620()
  main+0x78()
  _start+0x108()
DESCRIPTION:
When files are created under different directories in parallel, the ncheck() function may start printing the name of a newly created directory entry whose inode number is greater than the maximum inode number. In that case, inodes beyond the end of print_itable can get referenced. This results in a core dump.
RESOLUTION:
The code is modified to ensure that the inode number is less than the maximum number of inodes before calling the printname() function.
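Note on incident 2036204: the fix amounts to a bounds check before the table lookup. The Python sketch below is a minimal illustration, not the ncheck source. It assumes that print_itable can be modeled as a list indexed by inode number, and the function name is hypothetical.

```python
# Illustrative only; print_itable is modeled as a plain list.
def lookup_name(print_itable, ino):
    """Return the entry for inode number 'ino', or None if it is out of range."""
    if not 0 <= ino < len(print_itable):
        # Inode created after the table was sized: skip it instead of
        # reading past the end of the table.
        return None
    return print_itable[ino]
```

An inode number that was allocated after the table was built is skipped rather than used as an index.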
* INCIDENT NO:2069664 TRACKING ID:2069059
SYMPTOM:
A hang occurs in Cluster File System (CFS) when LIBPATH is set to a CFS directory, and the following threads are displayed:
  pvthread+0B5F00 STACK:
  e_block_thread+000278 ()
  e_sleep_thread+00005C ()
  vxg_ilock_wait+0000A0 ()
  vxg_range_lock_body+000384 ()
  vxg_api_range_lockwf+0000EC ()
  vx_glm_range_lock+000078 ()
  vx_glmrange_rangelock+000054 ()
  vx_irwlock2+000114 ()
  vx_irwlock+00003C ()
  vx_recover_fsethdr+000080 ()
  vx_assumption+000024 ()
  vx_recover+000304 ()
  vx_recover_evanesce+00000C ()
  vx_thread_base+00002C ()
  threadentry+000014 ()

  pvthread+0B5600 STACK:
  e_block_thread+000278 ()
  e_sleep_thread+00005C ()
  vxg_get_block+00013C ()
  vxg_api_initlock+0002B0 ()
  vx_glm_init_blocklock+00005C ()
  vx_cbuf_lookup+000280 ()
  vx_getblk_clust+0000B4 ()
  vx_getblk_cmn+0000BC ()
  vx_getblk+00003C ()
  vx_rdwrnomap+00032C ()
  vx_kernread+000084 ()
  vx_recover_fsethdr+000238 ()
  vx_assumption+000024 ()
  vx_recover+000304 ()
  vx_recover_evanesce+00000C ()
  vx_thread_base+00002C ()
  threadentry+000014 ()
DESCRIPTION:
This issue occurs because LIBPATH is set to a CFS directory. When LIBPATH is set, every run through the shell opens the CFS directory. When vxfsckd executes the fsck command, CFS files may need to be opened and closed if LIBPATH is set to a CFS directory. This may cause a deadlock if it is done as a part of the log replay during reconfiguration.
RESOLUTION:
The code is modified so that the environment of the vxfsckd process is not used.

* INCIDENT NO:2090261 TRACKING ID:2074281
SYMPTOM:
An attempt to dump the File Change Log (FCL) using the fcladm(1M) command with the 'dump' option results in a segmentation fault and the following stack trace is displayed:
  strlen()
  vfprintf()
  vx_viprintf()
  vx_iprintf ()
  fcl_dump ()
  main()
  _start()
DESCRIPTION:
The internal function, fcl_dump(), opens the savefile (a command line option of the fcladm(1M) command) and if it fails, an error message is displayed.
But if the savefile option is not specified with fcladm dump, the value of savefile is taken as NULL, which causes a segmentation fault in the strlen() function.
RESOLUTION:
The code is modified to add a check for a non-NULL savefile and handle the error appropriately.

* INCIDENT NO:2114116 TRACKING ID:2061177
SYMPTOM:
On systems running Veritas File System (VxFS) version 5.0MP3RP1, the fsadm(1M) command with the '-de' option displays a 'bad file number' error on file systems.
DESCRIPTION:
The fsadm(1M) command maintains an in-core copy of the inode information from the disk and may re-read it later for any other processing. The command fails with an error because the in-core data and the on-disk data are out of synchronization.
RESOLUTION:
The code is modified to add a sync operation in the fsadm(1M) command before it reads the layout from the raw disk, to ensure synchronization of the in-core data and the on-disk data.

PATCH ID:PHCO_43598

* INCIDENT NO:3220502 TRACKING ID:1960815
SYMPTOM:
The vxtunefs(1M) command manpage does not mention the performance impact of tuning any of the Veritas File System (VxFS) parameters on a system with heavy load.
DESCRIPTION:
The performance impact that the system faces on tuning any of the VxFS parameters needs to be mentioned in the vxtunefs(1M) command manpage.
RESOLUTION:
The manpage is updated to mention the performance impact of tuning any of the VxFS parameters on a system with heavy load.

* INCIDENT NO:3230688 TRACKING ID:3099638
SYMPTOM:
When the vxfs_ifree_timelag(5) tunable is tuned, the following error message is displayed:
  # kctune vxfs_ifree_timelag=400
  ERROR: mesg 095: V-2-95: Setting vxfs_ifree_timelag to 450 since the specified value for vxfs_ifree_timelag is less than the recommended minimum value of 1035
DESCRIPTION:
In the vxfs_ifree_timelag(5) tunable man page, the minimum value is set to "None".
The error message is displayed when the vxfs_ifree_timelag(5) tunable is set to a value which is less than 450. In the error message, a garbage value is displayed as the recommended minimum value. The error occurs because a single argument is passed for the error message that has two format specifiers.
RESOLUTION:
The code is modified to set the correct minimum value of the vxfs_ifree_timelag(5) tunable, and display the correct error message.
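Note on incident 3230688: a message with two format specifiers needs two arguments. The Python sketch below shows the corrected pattern only. In C, passing a single argument prints an indeterminate second value, which Python cannot reproduce (it raises an error instead). The minimum of 450 is inferred from the description above, and the function name is hypothetical.

```python
# Illustrative only; the minimum value is inferred from the incident text.
MIN_TIMELAG = 450
MESSAGE = ("Setting vxfs_ifree_timelag to %d since the specified value for "
           "vxfs_ifree_timelag is less than the recommended minimum value of %d")


def clamp_timelag(requested, minimum=MIN_TIMELAG):
    """Return (effective value, message); the message is None if unchanged."""
    if requested >= minimum:
        return requested, None
    # Both format specifiers receive an argument.
    return minimum, MESSAGE % (minimum, minimum)


if __name__ == "__main__":
    value, message = clamp_timelag(400)
    print(value)
    print(message)
```

This model ignores any special values that the tunable may accept; it only shows that both numbers in the message come from explicit arguments.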