* * * READ ME * * *
* * * Symantec File System 6.1.1 * * *
* * * Patch 6.1.1.400 * * *
Patch Date: 2016-07-25


This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH
   * KNOWN ISSUES


PATCH NAME
----------
Symantec File System 6.1.1 Patch 6.1.1.400


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
Solaris 11 SPARC


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Symantec File System 6.1
   * Symantec Storage Foundation 6.1
   * Symantec Storage Foundation Cluster File System HA 6.1
   * Symantec Storage Foundation for Oracle RAC 6.1
   * Symantec Storage Foundation HA 6.1


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: 6.1.1.400
* 3652109 (3553328) During internal testing, full fsck failed to clean the file system.
* 3690067 (3615043) Data loss when writing to a file while dalloc is on.
* 3729811 (3719523) 'vxupgrade' retains the superblock replica of old layout versions.
* 3852733 (3729158) Deadlock occurs due to incorrect locking order between write advise and dalloc flusher thread.
* 3859806 (3451730) Installation of VRTSodm, VRTSvxfs in a zone fails when running zoneadm -z Zone attach -U
* 3864007 (3558087) The ls -l and other commands which use the stat system call may take a long time to complete.
* 3864010 (3269553) VxFS returns inappropriate message for read of hole via Oracle Disk Manager (ODM).
* 3864013 (3811849) System panics while executing lookup() in a directory with large directory hash (LDH).
* 3864035 (3790721) High CPU usage caused by vx_send_bcastgetemapmsg_remaus
* 3864036 (3233276) With a large file system, primary to secondary migration takes longer duration.
* 3864037 (3616907) System is unresponsive causing the NMI watchdog service to stall.
* 3864040 (3633683) vxfs thread consumes high CPU while running an application that makes excessive sync() calls.
* 3864042 (3466020) File system is corrupted with an error message "vx_direrr: vx_dexh_keycheck_1".
* 3864141 (3647749) On Solaris, an obsolete v_path is created for the VxFS vnode.
* 3864148 (3695367) Unable to remove volume from multi-volume VxFS using "fsvoladm" command.
* 3864150 (3602322) System panics while flushing the dirty pages of the inode.
* 3864155 (3707662) Race between reorg processing and fsadm timer thread (alarm expiry) leads to panic in vx_reorg_emap.
* 3864156 (3662284) File Change Log (FCL) read may return ENXIO.
* 3864158 (2560032) System panics after SFHA is upgraded from 5.1SP1 to 5.1SP1RP2 or from 6.0.1 to 6.0.5
* 3864160 (3691633) Remove RCQ Full messages
* 3864161 (3708836) fallocate causes data corruption
* 3864164 (3762125) Directory size increases abnormally.
* 3864165 (3751049) The umountall operation fails on Solaris.
* 3864167 (3735697) vxrepquota reports error
* 3864170 (3743572) File system may hang when reaching 1 billion inode limit
* 3864173 (3779916) vxfsconvert fails to upgrade layout version for a vxfs file system with large number of inodes.
* 3864175 (3804400) /opt/VRTS/bin/cp does not return any error when quota hard limit is reached and partial write is encountered.
* 3864177 (3808033) When using 6.2.1 ODM on RHEL7, Oracle resource cannot be killed after forced umount via VCS.
* 3864178 (1428611) 'vxcompress' can spew many GLM block lock messages over the LLT network.
* 3864184 (3857444) The default permission of /etc/vx/vxfssystem file is incorrect.
* 3864185 (3859032) System panics in vx_tflush_map() due to NULL pointer dereference.
* 3864186 (3855726) Panic in vx_prot_unregister_all().
* 3864246 (3657482) Stress test on cluster file system fails due to data corruption
* 3864247 (3861713) High %sys CPU seen on Large CPU/Memory configurations.
* 3864250 (3833816) Read returns stale data on one node of the CFS.
* 3864255 (3827491) Data relocation is not executed correctly if the IOTEMP policy is set to AVERAGE.
* 3864256 (3830300) Degraded CPU performance during backup of Oracle archive logs on CFS vs local filesystem
* 3864257 (3844820) Removing/Adding vCPU on Solaris could trigger system panic
* 3864259 (3856363) Filesystem inodes have incorrect blocks.
* 3864260 (3846521) "cp -p" fails if modification time in nanoseconds has 10 digits.
* 3866968 (3866962) Data corruption seen when dalloc writes are going on the file and simultaneously fsync started on the same file.
* 3874662 (3871489) Performance issue observed when number of HBAs increased on high end servers.
* 3877070 (3880121) Internal assert failure when coalescing the extents on clone.
* 3877339 (3880113) Internal assert failure when pushing zfod extent on clone.

Patch ID: 6.1.1.100
* 3520113 (3451284) Internal testing hits an assert "vx_sum_upd_efree1"
* 3536233 (3457803) File System gets disabled intermittently with metadata IO error.
* 3583963 (3583930) When the external quota file is restored or over-written, the old quota records are preserved.
* 3617774 (3475194) Veritas File System (VxFS) fscdsconv(1M) command fails with metadata overflow.
* 3617788 (3604071) High CPU usage consumed by the vxfs thread process.
* 3617793 (3564076) The MongoDB noSQL db creation fails with an ENOTSUP error.
* 3620279 (3558087) The ls -l command hangs when the system takes backup.
* 3620288 (3469644) The system panics in the vx_logbuf_clean() function.
* 3645825 (3622326) Filesystem is marked with fullfsck flag as an inode is marked bad during checkpoint promote

Patch ID: 6.1.1.000
* 3383149 (3383147) The ACA operator precedence error may occur while turning off delayed allocation.
* 3422580 (1949445) System is unresponsive when files are created in a large directory.
* 3422584 (2059611) The system panics due to a NULL pointer dereference while flushing bitmaps to the disk.
* 3422586 (2439261) When the vx_fiostats_tunable value is changed from zero to non-zero, the system panics.
* 3422604 (3092114) The information output displayed by the "df -i" command may be inaccurate for cluster mounted file systems.
* 3422614 (3297840) A metadata corruption is found during the file removal process.
* 3422626 (3332902) While shutting down, the system running the fsclustadm(1M) command panics.
* 3422629 (3335272) The mkfs (make file system) command dumps core when the log size provided is not aligned.
* 3422636 (3340286) After a file system is resized, the tunable setting of dalloc_enable gets reset to a default value.
* 3422649 (3394803) A panic is observed in the VxFS vx_upgrade7() function while running the vxupgrade(1M) command.
* 3436431 (3434811) The vxfsconvert(1M) in VxFS 6.1 hangs.
* 3448503 (3448492) In Solaris SPARC, introduced Vnode Page Mapping (VPM) interface.
* 3501832 (3413926) Internal testing hangs due to high memory consumption resulting in fork failure.
* 3504362 (3472551) The attribute validation (pass 1d) of full fsck takes too much time to complete.
* 3507608 (3478017) Internal test hits assert in voprwunlock.
* 3512292 (3348520) In a Cluster File System (CFS) cluster having multi volume file system of a smaller size, execution of the fsadm command causes system hang if the free space in the file system is low.
* 3518943 (3534779) Internal stress testing on Cluster File System (CFS) hits a debug assert.
* 3519809 (3463464) Internal kernel functionality conformance test hits a kernel panic due to null pointer dereference.
* 3528770 (3449152) Failed to set 'thin_friendly_alloc' tunable in case of cluster file system (CFS).
* 3529852 (3463717) Information regarding Cluster File System (CFS) that does not support the 'thin_friendly_alloc' tunable is not updated in the vxtunefs(1M) command man page.
* 3529862 (3529860) The package verification using the 'pkg verify' command fails for VRTSglm, VRTSgms, and VRTSvxfs packages on Solaris 11.
* 3530038 (3417321) The vxtunefs(1M) tunable man page gives an incorrect
* 3541125 (3541083) The vxupgrade(1M) command for layout version 10 creates 64-bit quota files with inappropriate permission configurations.


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following Symantec incidents:

Patch ID: 6.1.1.400

* 3652109 (Tracking ID: 3553328)

SYMPTOM:
During internal testing it was found that the per-node LCT file was corrupted, due to which attribute inode reference counts were mismatching, resulting in fsck failure.

DESCRIPTION:
During clone creation, the LCT from the 0th pindex is copied to the new clone's LCT. Any update to this LCT file from a non-zeroth pindex can cause a count mismatch in the new fileset.

RESOLUTION:
The code is modified to handle this issue.

* 3690067 (Tracking ID: 3615043)

SYMPTOM:
At times, while writing to a file, data could be missed.

DESCRIPTION:
While writing to a file when delayed allocation is on, Solaris could dishonor the NON_CLUSTERING flag and cluster pages beyond the range for which we have issued the flushing, leading to data loss.

RESOLUTION:
Make sure we clear the flag and flush the exact range, in case of dalloc.

* 3729811 (Tracking ID: 3719523)

SYMPTOM:
'vxupgrade' does not clear the superblock replica of old layout versions.

DESCRIPTION:
While upgrading the file system to a new layout version, a new superblock inode is allocated and an extent is allocated for the replica superblock. After writing the new superblock (primary + replica), VxFS frees the extent of the old superblock replica. Now, if the primary superblock is corrupted, the full fsck searches for a replica to repair the file system. If it finds the replica of the old superblock, it restores the file system to the old layout, instead of creating a new one. This behavior is wrong. In order to take the file system to a new version, we should clear the replica of the old superblock as part of vxupgrade, so that full fsck won't detect it later.

RESOLUTION:
Clear the replica of the old superblock as part of vxupgrade.

* 3852733 (Tracking ID: 3729158)

SYMPTOM:
The fuser and other commands hang on VxFS file systems.

DESCRIPTION:
The hang is seen while 2 threads contend for 2 locks, ILOCK and PLOCK. The writeadvise thread owns the ILOCK but is waiting for the PLOCK, while the dalloc thread owns the PLOCK and is waiting for the ILOCK.

RESOLUTION:
The code is modified to correct the order of locking. Now PLOCK is followed by ILOCK.

* 3859806 (Tracking ID: 3451730)

SYMPTOM:
Installation of VRTSodm, VRTSvxfs in a zone fails when running zoneadm -z Zone attach -U

DESCRIPTION:
When you upgrade a zone using the attach -U option, the checkinstall script is executed. There were certain zone-irrelevant commands (which should not be executed during attach) in the checkinstall script which failed the installation of VRTSodm, VRTSvxfs.

RESOLUTION:
Code is added in the postinstall script to fix the checkinstall script.

* 3864007 (Tracking ID: 3558087)

SYMPTOM:
When the stat system call is executed on a VxFS file system with the delayed allocation feature enabled, it may take a long time or it may cause high CPU consumption.

DESCRIPTION:
When the delayed allocation (dalloc) feature is turned on, the flushing process takes much time.
The process keeps the get page lock held, and needs writers to keep the inode reader writer lock held. The stat system call may keep waiting for the inode reader writer lock.

RESOLUTION:
Delayed allocation code is redesigned to keep the get page lock unlocked while flushing.

* 3864010 (Tracking ID: 3269553)

SYMPTOM:
VxFS returns inappropriate message for read of hole via ODM.

DESCRIPTION:
Sometimes sparse files containing temp or backup/restore files are created outside the Oracle database. And, Oracle can read these files only using the ODM. As a result, ODM fails with an ENOTSUP error.

RESOLUTION:
The code is modified to return zeros instead of an error.

* 3864013 (Tracking ID: 3811849)

SYMPTOM:
System panics due to a size mismatch in the cluster-wide buffers containing hash bucket data. The offending stack looks like the following:

$cold_vm_hndlr
bubbledown
as_ubcopy
vx_populate_bpdata
vx_getblk_clust
$cold_vx_getblk
vx_exh_getblk
vx_exh_get_bucket
vx_exh_lookup
vx_dexh_lookup
vx_dirscan
vx_dirlook
vx_pd_lookup
vx_lookup_pd
vx_lookup
lookupname
lstat
syscall

On some platforms, instead of a panic, LDH corruption can be reported. Full fsck can report some metadata inconsistencies, which look like the following sample messages:

fileset 999 primary-ilist inode 263 has invalid alternate directory index (fileset 999 attribute-ilist inode 8193), clear index? (ynq)y
fileset 999 primary-ilist inode 29879 has invalid alternate directory index (fileset 999 attribute-ilist inode 8194), clear index? (ynq)y
fileset 999 primary-ilist inode 1070691 has invalid alternate directory index (fileset 999 attribute-ilist inode 24582), clear index? (ynq)y
fileset 999 primary-ilist inode 1262102 has invalid alternate directory index (fileset 999 attribute-ilist inode 8198), clear index? (ynq)y

DESCRIPTION:
On a very fragmented file system with FS block sizes 1K, 2K or 4K, any segment of the hash inode (i.e. buckets/CDF/directory segment with fixed size: 8K) can spread across multiple extents.
Instead of initializing the buffers on the final bmap after all allocations are finished, LDH code allocates the buffer-cache buffers as the allocations come along. As a result, small allocations can be merged in the final bmap, e.g. two CFS nodes can end up having buffers representing the same metadata, with different sizes. This leads to panics because the buffers are passed around the cluster, or the corruption reaches LDH portions on the disk.

RESOLUTION:
The code is modified to separate the allocation and buffer initialization in LDH code paths.

* 3864035 (Tracking ID: 3790721)

SYMPTOM:
High CPU usage on the vxfs thread process. The backtrace of such threads usually looks like this:

schedule
schedule_timeout
__down
down
vx_send_bcastgetemapmsg_remaus
vx_send_bcastgetemapmsg
vx_recv_getemapmsg
vx_recvdele
vx_msg_recvreq
vx_msg_process_thread
vx_kthread_init
kernel_thread

DESCRIPTION:
The locking mechanism in vx_send_bcastgetemapmsg_process() is inefficient. Every time vx_send_bcastgetemapmsg_process() is called, it performs a series of down-up operations on a certain semaphore. This can result in a huge CPU cost when multiple threads contend for this semaphore.

RESOLUTION:
Optimize the locking mechanism in vx_send_bcastgetemapmsg_process(), so that it performs the down-up operation on the semaphore only once.

* 3864036 (Tracking ID: 3233276)

SYMPTOM:
On a 40 TB file system, the fsclustadm setprimary command consumes more than 2 minutes for execution. And, the unmount operation consumes more time causing a primary migration.

DESCRIPTION:
The old primary needs to process the delegated allocation units while migrating from primary to secondary. The inefficient implementation of the allocation unit list is consuming more time while removing the element from the list. As the file system size increases, the allocation unit list also increases, which results in additional migration time.

RESOLUTION:
The code is modified to process the allocation unit list efficiently. With this modification, the primary migration is completed in 1 second on the 40 TB file system.

* 3864037 (Tracking ID: 3616907)

SYMPTOM:
While performing the garbage collection operation, VxFS causes the non-maskable interrupt (NMI) service to stall.

DESCRIPTION:
With a highly fragmented Reference Count Table (RCT), when a garbage collection operation is performed, the CPU could be used for a longer duration. The CPU could be busy if a potential entry that could be freed is not identified.

RESOLUTION:
The code is modified such that the CPU is released after a specified time interval.

* 3864040 (Tracking ID: 3633683)

SYMPTOM:
"top" command output shows vxfs thread consuming high CPU while running an application that makes excessive sync() calls.

DESCRIPTION:
To process the sync() system call, vxfs scans through the inode cache, which is a costly operation. If a user application is issuing excessive sync() calls and there are vxfs file systems mounted, this can make the vxfs sync processing thread consume high CPU.

RESOLUTION:
Combine all the sync() requests issued in the last 60 seconds into a single request.
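
The resolution for incident 3864040 describes batching many sync() requests into one. The following is only a minimal sketch of that general idea, not the VxFS implementation: the class name SyncCoalescer, the request_sync() and tick() methods, and the injected clock values are all hypothetical and chosen for illustration. Requests arriving within one window are counted, and a single costly scan serves all of them once the window has elapsed.

```python
class SyncCoalescer:
    """Hypothetical illustration: serve many sync requests with one scan."""

    def __init__(self, scan, window_seconds=60.0):
        self._scan = scan              # the costly operation (e.g. a cache scan)
        self._window = window_seconds  # 60 seconds, as stated in the resolution
        self._pending = 0              # requests gathered in the current window
        self._window_start = None      # time of the first pending request

    def request_sync(self, now):
        """Record a sync request; the scan itself is deferred."""
        if self._pending == 0:
            self._window_start = now
        self._pending += 1

    def tick(self, now):
        """Run one scan for all pending requests once the window has elapsed.

        Returns the number of requests served by that single scan.
        """
        if self._pending and now - self._window_start >= self._window:
            served = self._pending
            self._pending = 0
            self._scan()
            return served
        return 0


scans = []
coalescer = SyncCoalescer(lambda: scans.append("scan"))
for t in (0, 10, 59):           # three requests inside one 60-second window
    coalescer.request_sync(t)
print(coalescer.tick(59))       # window not yet elapsed, nothing runs
print(coalescer.tick(60))       # one scan serves all three requests
print(len(scans))
```

Passing the time in explicitly keeps the sketch deterministic; a real implementation would read a clock and would also have to decide how callers wait for the deferred scan to finish.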

* 3864042 (Tracking ID: 3466020)

SYMPTOM:
File system is corrupted with the following error message in the log:

WARNING: msgcnt 28 mesg 008: V-2-8: vx_direrr: vx_dexh_keycheck_1 - /TraceFile file system dir inode 3277090 dev/block 0/0 diren
WARNING: msgcnt 27 mesg 008: V-2-8: vx_direrr: vx_dexh_keycheck_1 - /TraceFile file system dir inode 3277090 dev/block 0/0 diren
WARNING: msgcnt 26 mesg 008: V-2-8: vx_direrr: vx_dexh_keycheck_1 - /TraceFile file system dir inode 3277090 dev/block 0/0 diren
WARNING: msgcnt 25 mesg 096: V-2-96: vx_setfsflags - /dev/vx/dsk/a2fdc_cfs01/trace_lv01 file system fullfsck flag set - vx_direr
WARNING: msgcnt 24 mesg 008: V-2-8: vx_direrr: vx_dexh_keycheck_1 - /TraceFile file system dir inode 3277090 dev/block 0/0 diren

DESCRIPTION:
In case an error is returned from the vx_dirbread() function via the vx_dexh_keycheck1() function, the FULLFSCK flag is set on the file system unconditionally. A corrupted Large Directory Hash (LDH) can lead to the incorrect block being read, which results in the FULLFSCK flag being set. The system does not verify whether it reads the incorrect value due to a corrupted LDH. Subsequently, the FULLFSCK flag is set unnecessarily, because a corrupted LDH is fixed online by recreating the hash.

RESOLUTION:
The code is modified such that when an LDH corruption is detected, the system removes the LDH, instead of setting FULLFSCK. The LDH is recreated the next time the directory is modified.

* 3864141 (Tracking ID: 3647749)

SYMPTOM:
An obsolete v_path is created for the VxFS vnode when the following steps are performed:

1) Create a file (file1).
2) Delete the file (file1).
3) Create a new file (file2, which has the same inode number as file1).
4) The vnode of file2 has an obsolete v_path: it still shows file1.

DESCRIPTION:
When VxFS reuses an inode, it performs some clear or reset operations to clean the obsolete information.
However, the corresponding Solaris vnode may not be properly handled, which leads to the obsolete v_path.

RESOLUTION:
The code is modified to call the vn_recycle() function in the VxFS inode clear routine to reset the corresponding Solaris vnode.

* 3864148 (Tracking ID: 3695367)

SYMPTOM:
Unable to remove volume from multi-volume VxFS using "fsvoladm" command. It fails with "Invalid argument" error.

DESCRIPTION:
Volumes are not being added in the in-core volume list structure correctly. Therefore, removing a volume from a multi-volume VxFS using the "fsvoladm" command fails.

RESOLUTION:
The code is modified to add volumes in the in-core volume list structure correctly.

* 3864150 (Tracking ID: 3602322)

SYMPTOM:
System may panic while flushing the dirty pages of the inode.

DESCRIPTION:
Panic may occur due to the synchronization problem between one thread that flushes the inode, and the other thread that frees the chunks that contain the inodes on the freelist. The thread that frees the chunks of inodes on the freelist grabs an inode, and clears/dereferences the inode pointer while deinitializing the inode. This may result in a NULL pointer dereference if the flusher thread is working on the same inode.

RESOLUTION:
The code is modified to resolve the race condition by taking proper locks on the inode and freelist, whenever a pointer in the inode is dereferenced. If the inode pointer is already de-initialized to NULL, then the flushing is attempted on the next inode.

* 3864155 (Tracking ID: 3707662)

SYMPTOM:
Race between reorg processing and fsadm timer thread (alarm expiry) leads to panic in vx_reorg_emap with the following stack:

vx_iunlock
vx_reorg_iunlock_rct_reorg
vx_reorg_emap
vx_extmap_reorg
vx_reorg
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
fop_ioctl
ioctl

DESCRIPTION:
When the timer expires (fsadm with -t option), vx_do_close() calls vx_reorg_clear() on local mount, which performs cleanup on the reorg rct inode.
Another thread currently active in vx_reorg_emap() will panic due to a null pointer dereference.

RESOLUTION:
When fop_close is called in alarm handler context, we defer the cleanup until the kernel thread performing the reorg completes its operation.

* 3864156 (Tracking ID: 3662284)

SYMPTOM:
File Change Log (FCL) read may return ENXIO as follows:

# file changelog
changelog: ERROR: cannot read `changelog' (No such device or address)

DESCRIPTION:
VxFS reads the FCL file and returns ENXIO when there is a HOLE in the file.

RESOLUTION:
The code is modified to zero out the user buffer when hitting a hole if the FCL read is from user space.

* 3864158 (Tracking ID: 2560032)

SYMPTOM:
System may panic while upgrading VRTSvxfs in the presence of a zone mounted on VxFS.

DESCRIPTION:
When the upgrade happens from the base version to the target version, the post install script unloads the base level fdd module and loads the target level fdd modules while the VxFS module is still at the "base version" level. This leads to an inconsistency between the file device driver (fdd) and VxFS modules.

RESOLUTION:
The post install script is modified to avoid the inconsistency.

* 3864160 (Tracking ID: 3691633)

SYMPTOM:
Remove RCQ Full messages

DESCRIPTION:
Too many unnecessary RCQ Full messages were logged in the system log.

RESOLUTION:
The RCQ Full messages are removed from the code.

* 3864161 (Tracking ID: 3708836)

SYMPTOM:
When using fallocate together with delayed extending write, data corruption may happen.

DESCRIPTION:
When doing fallocate after EOF, vxfs grows the file by splitting the last extent of the file into two parts, then converts the part after EOF to a ZFOD extent. During this procedure, a stale file size is used to calculate the start offset of the newly zeroed extent. This may overwrite the blocks which contain the unflushed data generated by the extending write and cause data corruption.

RESOLUTION:
The code is modified to use the up-to-date file size instead of the stale file size, to make sure the new ZFOD extent is created correctly.

* 3864164 (Tracking ID: 3762125)

SYMPTOM:
Directory size sometimes keeps increasing even though the number of files inside it doesn't increase.

DESCRIPTION:
This only happens on CFS. A variable in the directory inode structure marks the start of directory free space. But when the directory ownership changes, the variable may become stale, which could cause this issue.

RESOLUTION:
The code is modified to reset this free space marking variable when there is an ownership change. Now the space search goes from the beginning of the directory inode.

* 3864165 (Tracking ID: 3751049)

SYMPTOM:
The umountall operation fails on Solaris with error "V-3-20358: cannot open mnttab"

DESCRIPTION:
On Solaris, normally, fopen() returns an EMFILE error for 32-bit applications if it attempts to associate a stream with a file accessed by a file descriptor with a value greater than 255. When using umountall to unmount more than 256 file systems, the command forks child processes and opens more than 256 file descriptors at the same time. This crosses the 256 file descriptor maximum limit and causes the operation to fail.

RESOLUTION:
Use "F" mode in the fopen call to avoid the 256 file descriptor limitation.

* 3864167 (Tracking ID: 3735697)

SYMPTOM:
vxrepquota reports errors like the following:

# vxrepquota -u /vx/fs1
UX:vxfs vxrepquota: ERROR: V-3-20002: Cannot access /dev/vx/dsk/sfsdg/fs1:ckpt1: No such file or directory
UX:vxfs vxrepquota: ERROR: V-3-24996: Unable to get disk layout version

DESCRIPTION:
vxrepquota checks each mount point entry in the mounted file system table. If any checkpoint mount point entry is present before the mount point specified in the vxrepquota command, vxrepquota reports errors, but the command can succeed.

RESOLUTION:
Skip checkpoint mount points in the mounted file system table.
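
The resolution for incident 3864167 amounts to filtering entries while walking the mounted file system table. The sketch below illustrates that filtering only; it is not vxrepquota code. It assumes, based solely on the error message above, that a checkpoint mount appears with a "<device>:<checkpoint name>" special device, and the function names, the in-memory table, and the checkpoint mount point /vx/fs1_ckpt1 are hypothetical.

```python
def is_checkpoint_entry(special):
    """Assumption: a checkpoint mount uses '<device>:<checkpoint name>'."""
    # Only the last path component is inspected for the ':' separator.
    return ":" in special.rsplit("/", 1)[-1]


def find_device(mount_table, mount_point):
    """Return the special device mounted at mount_point, skipping checkpoints.

    mount_table is a list of (special, mount_point) pairs, a simplified
    stand-in for the entries of the mounted file system table.
    """
    for special, mnt in mount_table:
        if is_checkpoint_entry(special):
            continue  # do not try to access checkpoint entries at all
        if mnt == mount_point:
            return special
    return None


# Hypothetical table: the checkpoint entry precedes the requested mount point.
table = [
    ("/dev/vx/dsk/sfsdg/fs1:ckpt1", "/vx/fs1_ckpt1"),
    ("/dev/vx/dsk/sfsdg/fs1", "/vx/fs1"),
]
print(find_device(table, "/vx/fs1"))
```

Skipping the entry before any access is attempted is what avoids the spurious "Cannot access" message for the checkpoint device.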

* 3864170 (Tracking ID: 3743572)

SYMPTOM:
File system may hang when reaching the 1 billion inode limit. The hung stack is as follows:

vx_svar_sleep_unlock
vx_event_wait
vx_async_waitmsg
vx_msg_send
llt_msgalloc
vx_cfs_getias
vx_update_ilist
vx_find_partial_au
vx_cfs_noinode
vx_noinode
vx_dircreate_tran
vx_pd_create
vx_dirlook
vx_create1_pd
vx_create1
vx_create_vp
vx_create

DESCRIPTION:
The maximum number of inodes supported by VxFS is 1 billion. When the file system is running out of inodes and the maximum inode allocation unit (IAU) limit is reached, VxFS can still create two extra IAUs if there is a hole in the last IAU. Because of the hole, when a secondary requests more inodes, the primary still thinks there is a hole available and notifies the secondary to retry. However, the secondary fails to find a slot since the 1 billion limit is hit, then it goes back to the primary to request free inodes again, and this loops infinitely, hence the hang.

RESOLUTION:
When the maximum IAU number is reached, prevent the primary from creating the extra IAUs.

* 3864173 (Tracking ID: 3779916)

SYMPTOM:
vxfsconvert fails to upgrade the layout version for a vxfs file system with a large number of inodes. The error message shows some inode discrepancy.

DESCRIPTION:
vxfsconvert walks through the ilist and converts inodes. It stores chunks of inodes in a buffer and processes them as a batch. The inode number parameter for this inode buffer is of type unsigned integer. The offset of a particular inode in the ilist is calculated by multiplying the inode number by the size of the inode structure. For large inode numbers, this product of inode_number * inode_size can overflow the unsigned integer limit, thus giving a wrong offset within the ilist file. vxfsconvert therefore reads the wrong inode and eventually fails.

RESOLUTION:
The inode number parameter is defined as unsigned long to avoid overflow.
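
The overflow described for incident 3864173 can be reproduced with plain arithmetic. The sketch below is an illustration only and is not taken from vxfsconvert: Python integers do not overflow, so the 32-bit and 64-bit widths are mimicked with explicit masks, and the inode size of 256 bytes, the inode number, and the function names are hypothetical values picked for the example. It also assumes that "unsigned integer" is 32 bits wide and "unsigned long" is 64 bits wide on the platform in question.

```python
UINT32_MASK = 0xFFFFFFFF            # width assumed for a C unsigned int
UINT64_MASK = 0xFFFFFFFFFFFFFFFF    # width assumed for a C unsigned long


def ilist_offset_32bit(inode_number, inode_size):
    """Offset as computed in 32-bit unsigned arithmetic (wraps on overflow)."""
    return (inode_number * inode_size) & UINT32_MASK


def ilist_offset_64bit(inode_number, inode_size):
    """Offset as computed in 64-bit unsigned arithmetic."""
    return (inode_number * inode_size) & UINT64_MASK


INODE_SIZE = 256          # hypothetical on-disk inode size in bytes
inode_number = 20_000_000  # hypothetical large inode number

# 20,000,000 * 256 = 5,120,000,000, which exceeds 2**32 = 4,294,967,296.
print(ilist_offset_32bit(inode_number, INODE_SIZE))  # 825032704 (wrapped)
print(ilist_offset_64bit(inode_number, INODE_SIZE))  # 5120000000 (correct)
```

With the narrower type, the computed offset silently points at a different inode in the ilist file, which matches the "reads the wrong inode" behavior described above.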

* 3864175 (Tracking ID: 3804400)

SYMPTOM:
/opt/VRTS/bin/cp does not return any error when quota hard limit is reached and partial write is encountered.

DESCRIPTION:
When the quota hard limit is reached, /opt/VRTS/bin/cp encounters a partial write, but it does not return any error to the upper-layer application in such a situation.

RESOLUTION:
The code is modified so that /opt/VRTS/bin/cp detects the partial write caused by the quota limit and returns a proper error to the upper-layer application.

* 3864177 (Tracking ID: 3808033)

SYMPTOM:
After a service group is set offline via VOM or VCS, the Oracle process is left in an unkillable state.

DESCRIPTION:
Whenever ODM issues an async request to FDD, FDD is required to do iodone processing on it, regardless of how far the request gets. The forced unmount causes FDD to take one of the early error branches, which misses the iodone routine for this particular async request. From ODM's perspective, the request is submitted, but iodone will never be called. This has several bad consequences, one of which is that a user thread is blocked uninterruptibly forever if it waits for the request.

RESOLUTION:
The code is modified to add the iodone routine in the error handling code.

* 3864178 (Tracking ID: 1428611)

SYMPTOM:
The 'vxcompress' command can cause many GLM block lock messages to be sent over the network. This can be observed with 'glmstat -m' output under the section "proxy recv", as shown in the example below:

bash-3.2# glmstat -m
         message      all     rw      g     pg      h    buf    oth   loop
master send:
           GRANT      194      0      0      0      2      0    192     98
          REVOKE      192      0      0      0      0      0    192     96
        subtotal      386      0      0      0      2      0    384    194

master recv:
            LOCK      193      0      0      0      2      0    191     98
         RELEASE      192      0      0      0      0      0    192     96
        subtotal      385      0      0      0      2      0    383    194

    master total      771      0      0      0      4      0    767    388

proxy send:
            LOCK       98      0      0      0      2      0     96     98
         RELEASE       96      0      0      0      0      0     96     96
      BLOCK_LOCK     2560      0      0      0      0   2560      0      0
   BLOCK_RELEASE     2560      0      0      0      0   2560      0      0
        subtotal     5314      0      0      0      2   5120    192    194

DESCRIPTION:
'vxcompress' creates placeholder inodes (called IFEMR inodes) to hold the compressed data of files.
After the compression is finished, the IFEMR inodes exchange their bmap with the original file and are later given to inactive processing. Inactive processing truncates the IFEMR extents (original extents of the regular file, which is now compressed) by sending cluster-wide buffer invalidation requests. These invalidations need the GLM block lock. Regular file data need not be invalidated across the cluster, thus making these GLM block lock requests unnecessary.

RESOLUTION:
Pertinent code has been modified to skip the invalidation for the IFEMR inodes created during compression.

* 3864184 (Tracking ID: 3857444)

SYMPTOM:
The default permission of /etc/vx/vxfssystem file is incorrect.

DESCRIPTION:
When creating the file "/etc/vx/vxfssystem", no permission is passed, which results in the file having permission 000.

RESOLUTION:
The code is modified to create the file "/etc/vx/vxfssystem" with default permission "600".

* 3864185 (Tracking ID: 3859032)

SYMPTOM:
System panics in vx_tflush_map() due to NULL pointer dereference.

DESCRIPTION:
When converting VxFS using vxconvert, new blocks are allocated to structural files such as smap, and these blocks can contain garbage. This is done with the expectation that fsck will rebuild the correct smap. However, fsck fails to distinguish between an EAU that is fully EXPANDED and one that is ALLOCATED. Because of this, if an allocation is made to a file whose last allocation came from such an affected EAU, a sub-transaction is created on EAUs which are in the allocated state. Map buffers of such EAUs are not initialized properly in the VxFS private buffer cache; as a result, these buffers are released back as stale during the transaction commit. Later, if any file-system wide sync tries to flush the metadata, it can refer to these buffer pointers and panic, as these buffers are already released and reused.

RESOLUTION:
Code is modified in fsck to correctly set the state of EAU on disk.
Also, the involved code paths are modified to avoid doing transactions on unexpanded EAUs.

* 3864186 (Tracking ID: 3855726)

SYMPTOM:
Panic happens in vx_prot_unregister_all(). The stack looks like this:

- vx_prot_unregister_all
- vxportalclose
- __fput
- fput
- filp_close
- sys_close
- system_call_fastpath

DESCRIPTION:
The panic is caused by a NULL fileset pointer, which is due to referencing the fileset before it is loaded. In addition, there is a race on the fileset identity array.

RESOLUTION:
Skip the fileset if it is not loaded yet. Add the identity array lock to prevent the possible race.

* 3864246 (Tracking ID: 3657482)

SYMPTOM:
Stress test on cluster file system fails due to data corruption

DESCRIPTION:
In the direct I/O write code path, there is an optimization which avoids invalidation of any in-core pages in the range. Instead, in-core pages are updated with the new data together with the disk write. This optimization comes into the picture when cached qio is enabled on the file. When we modify an in-core page, it is not marked dirty. If the page was not already dirty, the in-core changes might be lost if the page is reused. This can cause a corruption if the page is read again before the disk update completes.

RESOLUTION:
In case of cached qio/ODM, disable the page overwrite optimization.

* 3864247 (Tracking ID: 3861713)

SYMPTOM:
Contention observed on the vx_sched_lk and vx_worklist_lk spinlocks when profiled using lockstat.

DESCRIPTION:
Internal worker threads take a lock to sleep on a CV while waiting for work. This lock is global. If there are large numbers of CPUs and large numbers of worker threads, then contention can be seen on vx_sched_lk and vx_worklist_lk using lockstat, as well as an increased %sys CPU.

RESOLUTION:
Make the lock more scalable in large CPU configurations.

* 3864250 (Tracking ID: 3833816)

SYMPTOM:
In a CFS cluster, one node returns stale data.

DESCRIPTION:
In a 2-node CFS cluster, when node 1 opens the file and writes to it, the locks are used with the CFS_MASTERLESS flag set. When node 2 then tries to open the file and write to it, the locks on node 1 are normalized as part of the HLOCK revoke. However, after the HLOCK revoke on node 1, when node 2 takes the PG lock grant to write, there is no PG lock revoke on node 1, so the dirty pages on node 1 are not flushed and invalidated. As a result, reads return stale data on node 1.

RESOLUTION:
The code is modified to cache the PG lock before normalizing it in vx_hlock_putdata, so that after the normalizing, the cache grant is still with node 1. When node 2 requests the PG lock, there is a revoke on node 1 which flushes and invalidates the pages.

* 3864255 (Tracking ID: 3827491)

SYMPTOM:
Data relocation is not executed correctly if the IOTEMP policy is set to AVERAGE.

DESCRIPTION:
The database table is not created correctly, which results in an error on the database query. This affects the relocation policy of data, and the files are not relocated properly.

RESOLUTION:
The code is modified to fix the database table creation issue. The relocation policy based calculations are done correctly.

* 3864256 (Tracking ID: 3830300)

SYMPTOM:
Heavy CPU usage while Oracle archive processes are running on a clustered file system.

DESCRIPTION:
The cause of the poor read performance in this case was fragmentation. Fragmentation mainly happens when there are multiple archivers running on the same node. The allocation pattern of the Oracle archiver processes is:

1. write header with O_SYNC
2. ftruncate-up the file to its final size (a few GBs typically)
3. do lio_listio with 1MB iocbs

The problem occurs because all the allocations in this manner go through internal allocations, i.e. allocations below the file size instead of allocations past the file size. Internal allocations are done at most 8 pages at once.
So if there are multiple processes doing this, they all get these 8 pages alternately and the file system becomes very fragmented.
RESOLUTION: Added a tunable which allocates zfod extents when ftruncate tries to increase the size of the file, instead of creating a hole. This eliminates the allocations internal to the file size and thus the fragmentation. Fixed the earlier implementation of the same fix, which ran into locking issues. Also fixed the performance issue while writing from a secondary node.

* 3864257 (Tracking ID: 3844820)
SYMPTOM: A system panic was triggered by a stress test that removed and added vCPUs on the guest domain while VxFS I/O continued. The stack looks like this:
- panicsys
- vpanic_common
- panic
- die
- trap
- ktl0
DESCRIPTION: Adding or removing vCPUs can change the address of the Solaris global array cpu[]. VxFS saved the addresses of cpu[].cpu_stat at initialization, so updating through these stale addresses triggered the panic.
RESOLUTION: Update the addresses of cpu[].cpu_stat before vx_sar_cpu_update().

* 3864259 (Tracking ID: 3856363)
SYMPTOM: vxfs reports mapbad errors in the syslog as below:
vxfs: msgcnt 15 mesg 003: V-2-3: vx_mapbad - vx_extfind - /dev/vx/dsk/vgems01/lvems01 file system free extent bitmap in au 0 marked bad.
And, full fsck reports the following metadata inconsistencies:
fileset 999 primary-ilist inode 6 has invalid number of blocks (18446744073709551583)
fileset 999 primary-ilist inode 6 failed validation clear? (ynq)n
pass2 - checking directory linkage
fileset 999 directory 8192 block devid/blknum 0/393216 offset 68 references free inode ino 6 remove entry? (ynq)n
fileset 999 primary-ilist inode 8192 contains invalid directory blocks clear? (ynq)n
pass3 - checking reference counts
fileset 999 primary-ilist inode 5 unreferenced file, reconnect? (ynq)n
fileset 999 primary-ilist inode 5 clear? (ynq)n
fileset 999 primary-ilist inode 8194 unreferenced file, reconnect? (ynq)n
fileset 999 primary-ilist inode 8194 clear?
(ynq)n
fileset 999 primary-ilist inode 8195 unreferenced file, reconnect? (ynq)n
fileset 999 primary-ilist inode 8195 clear? (ynq)n
pass4 - checking resource maps
DESCRIPTION: While processing the VX_IEZEROEXT extop, VxFS frees the extent without setting the VX_TLOGDELFREE flag. Similarly, there are other cases where the VX_TLOGDELFREE flag is not set for a delayed extent free. This could result in mapbad errors and invalid block counts.
RESOLUTION: Since the VX_TLOGDELFREE flag needs to be set on every extent free, the code is modified to discard this flag and implicitly treat every extent free as a delayed extent free.

* 3864260 (Tracking ID: 3846521)
SYMPTOM: cp -p fails with EINVAL for files with a 10-digit modification time. EINVAL is returned if the value in the tv_nsec field is outside the range 0 to 999,999,999. VxFS supports updates in usec, but when copying to user space, usec is converted to nsec. In this case, usec has crossed its upper boundary limit, i.e. 999,999.
DESCRIPTION: In a cluster, it is possible that time differs across nodes. So when updating mtime, VxFS checks whether it is a cluster inode and whether the existing mtime is newer than the current node time. If so, it increments tv_usec instead of changing mtime to an older time value. The tv_usec counter can overflow here, which results in a 10-digit mtime.tv_nsec.
RESOLUTION: Code is modified to reset the usec counter for mtime/atime/ctime when the upper boundary limit, i.e. 999999, is reached.

* 3866968 (Tracking ID: 3866962)
SYMPTOM: Data corruption is seen when dalloc writes are going on the file and fsync is simultaneously started on the same file.
DESCRIPTION: If dalloc writes are going on the file and synchronous flushing is simultaneously started on the same file, synchronous flushing tries to flush all the dirty pages of the file without considering the underlying allocation.
In this case, flushing can happen on the unallocated blocks and this can result in data loss.
RESOLUTION: Code is modified to flush data only up to the actual allocation in case of dalloc writes.

* 3874662 (Tracking ID: 3871489)
SYMPTOM: IO service times increased with an IO intensive workload on a high end server.
DESCRIPTION: VxFS has worklist threads which sleep on a single condition variable. While waking up the worker threads, contention can be seen on the OS sleep dispatch locks, and the service time for the IO can increase due to this contention.
RESOLUTION: Scale the number of condition variables to reduce contention. Also add padding to the condition variable structure to avoid cache allocation problems, and make sure to wake up only the exact number of threads required.

* 3877070 (Tracking ID: 3880121)
SYMPTOM: Internal assert failure when coalescing the extents on clone.
DESCRIPTION: When coalescing extents on clone, resolving overlay extents is not supported, but the code still tries to resolve these overlay extents. This resulted in an internal assert failure.
RESOLUTION: Code is modified to not resolve these overlay extents when coalescing.

* 3877339 (Tracking ID: 3880113)
SYMPTOM: Internal assert failure when pushing zfod extent on clone.
DESCRIPTION: If filesnaps are created for a file which has zfod extents and a clone is created with these snapped files, then writing on the cloned files can end up with the same extent being allocated to multiple files, resulting in emap corruption.
RESOLUTION: Code is modified to not push zfod extents on clone.

Patch ID: 6.1.1.100

* 3520113 (Tracking ID: 3451284)
SYMPTOM: While allocating an extent during a write operation, if the summary and bitmap data for a file system allocation unit get mismatched, an assert is hit.
DESCRIPTION: An extent may be allocated using SMAP on the deleted inode, and part of the AU space moved from the deleted inode to the new inode. At this point the SMAP state is set to VX_EAU_ALLOCATED and EMAP is not initialized.
When more space is needed for the new inode, it tries to allocate from the same AU using EMAP and can hit the "f:vx_sum_upd_efree1:2a" assert, as EMAP is not initialized.
RESOLUTION: Code has been modified to expand the AU while moving partial AU space from one inode to another.

* 3536233 (Tracking ID: 3457803)
SYMPTOM: File System gets disabled with the following message in the system log:
WARNING: V-2-37: vx_metaioerr - vx_iasync_wait - /dev/vx/dsk/testdg/test file system meta data write error in dev/block
DESCRIPTION: The inode's incore information becomes inconsistent because one of its fields is modified without locking protection.
RESOLUTION: Protect the inode's field properly by taking the lock.

* 3583963 (Tracking ID: 3583930)
SYMPTOM: When the external quota file is overwritten or restored from backup, new settings which were added after the backup still remain.
DESCRIPTION: The internal quota file is not always updated with correct limits, so the quotaon operation copies the quota limits from the external to the internal quota file. To complete the copy operation, the extent of the external file is compared to the extent of the internal file at the corresponding offset. If the external quota file is overwritten (or restored to its original copy) and the size of the internal file is more than that of the external file, the quotaon operation does not clear the additional (stale) quota records in the internal file. Later, the sync operation (part of quotaon) copies these stale records from the internal to the external file. Hence, both internal and external files contain stale records.
RESOLUTION: The code has been modified to remove the stale records in the internal file at the time of quotaon.

* 3617774 (Tracking ID: 3475194)
SYMPTOM: Veritas File System (VxFS) fscdsconv(1M) command fails with the following error message:
...
UX:vxfs fscdsconv: INFO: V-3-26130: There are no files violating the CDS limits for this target.
UX:vxfs fscdsconv: INFO: V-3-26047: Byteswapping in progress ...
UX:vxfs fscdsconv: ERROR: V-3-25656: Overflow detected
UX:vxfs fscdsconv: ERROR: V-3-24418: fscdsconv: error processing primary inode list for fset 999
UX:vxfs fscdsconv: ERROR: V-3-24430: fscdsconv: failed to copy metadata
UX:vxfs fscdsconv: ERROR: V-3-24426: fscdsconv: Failed to migrate.
DESCRIPTION: The fscdsconv(1M) command takes a filename argument which is used as a recovery file, to restore the original file system in case of failure while the file system conversion is in progress. This file has two parts: a control part and a data part. The control part is used to store information about all the metadata, such as inodes and extents. In this instance, the length of the control part is being underestimated for some file systems where there are few inodes, but the average number of extents per file is very large (this can be seen in the fsadm -E report).
RESOLUTION: Make the recovery file sparse and start the data part after the 1TB offset, so that the control part can do allocating writes to the hole from the beginning of the file.

* 3617788 (Tracking ID: 3604071)
SYMPTOM: With the thin reclaim feature turned on, you can observe high CPU usage on the vxfs thread process. The backtrace of such threads usually looks like this:
- vx_dalist_getau
- vx_recv_bcastgetemapmsg
- vx_recvdele
- vx_msg_recvreq
- vx_msg_process_thread
- vx_kthread_init
DESCRIPTION: In the routine to get the broadcast information of a node which contains maps of Allocation Units (AUs) for which the node holds the delegations, the locking mechanism is inefficient. Every time this routine is called, it performs a series of down-up operations on a certain semaphore. This can result in a huge CPU cost when many threads call the routine in parallel.
RESOLUTION: The code is modified to optimize the locking mechanism in the routine to get the broadcast information of a node which contains maps of Allocation Units (AUs) for which the node holds the delegations, so that it only does the down-up operation on the semaphore once.

* 3617793 (Tracking ID: 3564076)
SYMPTOM: The MongoDB NoSQL database creation fails with an ENOTSUP error. MongoDB uses posix_fallocate to create a file first. When it writes at an offset which is not aligned with the file system block boundary, an ENOTSUP error comes up.
DESCRIPTION: On a file system with 8k bsize and 4k page size, the application creates a file using posix_fallocate, and then writes at some offset which is not aligned with the file system block boundary. In this case, the pre-allocated extent is split at the unaligned offset into two parts for the write. However, the alignment requirement of the split fails the operation.
RESOLUTION: Split the extent down to block boundary.

* 3620279 (Tracking ID: 3558087)
SYMPTOM: Run simultaneous dd threads on a mount point and start the ls -l command on the same mount point. Then the system hangs.
DESCRIPTION: When the delayed allocation (dalloc) feature is turned on, the flushing process takes much time. The process keeps the glock held, and needs writers to keep the irwlock held. The ls -l command starts stat internally and keeps waiting for irwlock to read ACLs.
RESOLUTION: Redesign dalloc to keep the glock unlocked while flushing.

* 3620288 (Tracking ID: 3469644)
SYMPTOM: The system panics in the vx_logbuf_clean() function when it traverses the chain of transactions off the intent log buffer. The stack trace is as follows:
vx_logbuf_clean ()
vx_logadd ()
vx_log()
vx_trancommit()
vx_exh_hashinit ()
vx_dexh_create ()
vx_dexh_init ()
vx_pd_rename ()
vx_rename1_pd()
vx_do_rename ()
vx_rename1 ()
vx_rename ()
vx_rename_skey ()
DESCRIPTION: The system panics as the vx_logbuf_clean() function tries to access an already freed transaction from the transaction chain to flush it to the log.
RESOLUTION: The code has been modified to make sure that the transaction gets flushed to the log before it is freed.

* 3645825 (Tracking ID: 3622326)
SYMPTOM: The file system is marked with the fullfsck flag as an inode is marked bad during checkpoint promote.
DESCRIPTION: VxFS incorrectly skipped pushing of data to the clone inode, due to which the inode is marked bad during checkpoint promote, which in turn resulted in the file system being marked with the fullfsck flag.
RESOLUTION: Code is modified to push the proper data to the clone inode.

Patch ID: 6.1.1.000

* 3383149 (Tracking ID: 3383147)
SYMPTOM: The "C" operator precedence error may occur while turning "off" delayed allocation.
DESCRIPTION: Due to the C operator precedence issue, VxFS evaluates a condition wrongly.
RESOLUTION: The code is modified to evaluate the condition correctly.

* 3422580 (Tracking ID: 1949445)
SYMPTOM: The system is unresponsive when files are created in a large directory. The following stack is logged:
vxg_grant_sleep()
vxg_cmn_lock()
vxg_api_lock()
vx_glm_lock()
vx_get_ownership()
vx_exh_coverblk()
vx_exh_split()
vx_dexh_setup()
vx_dexh_create()
vx_dexh_init()
vx_do_create()
DESCRIPTION: For large directories, large directory hash (LDH) is enabled to improve the lookup feature. When a system takes ownership of the LDH inode twice in the same thread context (while building the hash for the directory), it becomes unresponsive.
RESOLUTION: The code is modified to avoid taking ownership again if we already have the ownership of the LDH inode.

* 3422584 (Tracking ID: 2059611)
SYMPTOM: The system panics due to a NULL pointer dereference while flushing the bitmaps to the disk and the following stack trace is displayed:
...
vx_unlockmap+0x10c
vx_tflush_map+0x51c
vx_fsq_flush+0x504
vx_fsflush_fsq+0x190
vx_workitem_process+0x1c
vx_worklist_process+0x2b0
vx_worklist_thread+0x78
DESCRIPTION: The vx_unlockmap() function unlocks a map structure of the file system. If the map is being used, the hold count is incremented.
The vx_unlockmap() function attempts to check whether this is an empty mlink doubly linked list. The asynchronous vx_mapiodone routine can change the link at random even though the hold count is zero.
RESOLUTION: The code is modified to change the evaluation rule inside the vx_unlockmap() function, so that further evaluation can be skipped when the map hold count is zero.

* 3422586 (Tracking ID: 2439261)
SYMPTOM: When the vx_fiostats_tunable is changed from zero to non-zero, the system panics with the following stack trace:
vx_fiostats_do_update
vx_fiostats_update
vx_read1
vx_rdwr
vno_rw
rwuio
pread
DESCRIPTION: When vx_fiostats_tunable is changed from zero to non-zero, all the incore-inode fiostats attributes are set to NULL. When these attributes are accessed, the system panics due to the NULL pointer dereference.
RESOLUTION: The code has been modified to check that the file I/O stat attributes are present before dereferencing the pointers.

* 3422604 (Tracking ID: 3092114)
SYMPTOM: The information output by the "df -i" command can often be inaccurate for cluster mounted file systems.
DESCRIPTION: In the Cluster File System 5.0 release, a concept of delegating metadata to nodes in the cluster was introduced. This delegation of metadata allows CFS secondary nodes to update metadata without having to ask the CFS primary to do it. This provides greater node scalability. However, the "df -i" information is still collected by the CFS primary regardless of which node (primary or secondary) the "df -i" command is executed on. For inodes, the granularity of each delegation is an Inode Allocation Unit [IAU], thus IAUs can be delegated to nodes in the cluster.
When using a VxFS 1KB file system block size each IAU will represent 8192 inodes.
When using a VxFS 2KB file system block size each IAU will represent 16384 inodes.
When using a VxFS 4KB file system block size each IAU will represent 32768 inodes.
When using a VxFS 8KB file system block size each IAU will represent 65536 inodes.
Each IAU contains a bitmap that determines whether each inode it represents is allocated or free. The IAU also contains a summary count of the number of inodes that are currently free in the IAU. The "df -i" information can be considered as a simple sum of all the IAU summary counts.
Using a 1KB block size IAU-0 will represent inode numbers 0 - 8191
Using a 1KB block size IAU-1 will represent inode numbers 8192 - 16383
Using a 1KB block size IAU-2 will represent inode numbers 16384 - 24575
etc.
The inaccurate "df -i" count occurs because the CFS primary has no visibility of the current IAU summary information for IAUs that are delegated to secondary nodes. Therefore the number of allocated inodes within an IAU that is currently delegated to a CFS secondary node is not known to the CFS primary. As a result, the "df -i" count information for the currently delegated IAUs is collected from the primary's copy of the IAU summaries. Since the primary's copy of the IAU is stale, the "df -i" count is only accurate when no IAUs are currently delegated to CFS secondary nodes. In other words, the IAUs currently delegated to CFS secondary nodes will cause the "df -i" count to be inaccurate. Once an IAU is delegated to a node it can "timeout" after 3 minutes of inactivity. However, not all IAU delegations will time out. One IAU will always remain delegated to each node for performance reasons. Also, an IAU whose inodes are all allocated (so no free inodes remain in the IAU) does not time out either. The issue can be best summarized as: the more IAUs that remain delegated to CFS secondary nodes, the greater the inaccuracy of the "df -i" count.
RESOLUTION: Allow the delegations for IAUs whose inodes are all allocated (so no free inodes in the IAU) to "timeout" after 3 minutes of inactivity.

* 3422614 (Tracking ID: 3297840)
SYMPTOM: A metadata corruption is found during the file removal process with the inode block count becoming negative.
DESCRIPTION: When the user removes or truncates a file having shared indirect blocks, there can be an instance where the block count is updated to reflect the removal of the shared indirect blocks even though the blocks are not removed from the file. The next iteration of the loop updates the block count again while removing these blocks. This eventually leads to the block count being a negative value after all the blocks are removed from the file. The removal code expects the block count to be zero before updating the rest of the metadata.
RESOLUTION: The code is modified to update the block count and other tracking metadata in the same transaction as the blocks are removed from the file.

* 3422626 (Tracking ID: 3332902)
SYMPTOM: The system running the fsclustadm(1M) command panics while shutting down. The following stack trace is logged along with the panic:
machine_kexec
crash_kexec
oops_end
page_fault [exception RIP: vx_glm_unlock]
vx_cfs_frlpause_leave [vxfs]
vx_cfsaioctl [vxfs]
vxportalkioctl [vxportal]
vfs_ioctl
do_vfs_ioctl
sys_ioctl
system_call_fastpath
DESCRIPTION: There exists a race condition between "fsclustadm(1M) cfsdeinit" and "fsclustadm(1M) frlpause_disable". The "fsclustadm(1M) cfsdeinit" fails after cleaning the Group Lock Manager (GLM), without downgrading the CFS state. Under the false CFS state, the "fsclustadm(1M) frlpause_disable" command enters and accesses the GLM lock, which "fsclustadm(1M) cfsdeinit" frees, resulting in a panic. Another race condition exists between the code in vx_cfs_deinit() and the code in fsck. Although fsck holds a reservation, this cannot prevent vx_cfs_deinit() from freeing vx_cvmres_list because there is no check for vx_cfs_keepcount.
RESOLUTION: The code is modified to add appropriate checks in the "fsclustadm(1M) cfsdeinit" and "fsclustadm(1M) frlpause_disable" to avoid the race condition.
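
The second race in incident 3422626 above comes down to a teardown path freeing a shared list while another party still holds a reservation. The following is only an illustrative sketch of that general pattern (a keep count checked under a lock before freeing), written in Python for brevity. It is not the VxFS code: the class name, method names, and the stand-in resource list are all hypothetical.

```python
import threading


class ClusterResources:
    """Hypothetical stand-in for a shared resource list guarded by a keep count."""

    def __init__(self):
        self._lock = threading.Lock()
        self._keepcount = 0                  # outstanding reservations (hypothetical)
        self._resources = ["res0", "res1"]   # stand-in for the shared list

    def reserve(self):
        """Take a reservation; refused once the resources are torn down."""
        with self._lock:
            if self._resources is None:
                return False
            self._keepcount += 1
            return True

    def release(self):
        """Drop a reservation taken with reserve()."""
        with self._lock:
            assert self._keepcount > 0, "release without matching reserve"
            self._keepcount -= 1

    def deinit(self):
        """Free the resources only when nobody holds a reservation."""
        with self._lock:
            if self._keepcount > 0:
                return False                 # busy: the caller must retry later
            self._resources = None
            return True


res = ClusterResources()
assert res.reserve()        # a checker takes a reservation
assert not res.deinit()     # teardown is refused while it is held
res.release()
assert res.deinit()         # teardown succeeds once the count drops to zero
assert not res.reserve()    # later reservations are rejected
```

Checking the count and freeing the list under the same lock is what closes the window in this sketch: a reservation can no longer slip in between the check and the free.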

* 3422629 (Tracking ID: 3335272)
SYMPTOM: The mkfs (make file system) command dumps core when the log size provided is not aligned. The following stack trace is displayed:
(gdb) bt
#0 find_space ()
#1 place_extents ()
#2 fill_fset ()
#3 main ()
(gdb)
DESCRIPTION: While creating the VxFS file system using the mkfs command, if the log size provided is not aligned properly, the placement of the RCQ extents may be miscalculated so that no space is found for them. This leads to illegal memory access of the AU bitmap and results in a core dump.
RESOLUTION: The code is modified to place the RCQ extents in the same AU where log extents are allocated.

* 3422636 (Tracking ID: 3340286)
SYMPTOM: The tunable setting of dalloc_enable gets reset to a default value after a file system is resized.
DESCRIPTION: The file system resize operation triggers the file system re-initialization process. During this process, the tunable value of dalloc_enable gets reset to the default value instead of retaining the old tunable value.
RESOLUTION: The code is fixed such that the old tunable value of dalloc_enable is retained.

* 3422649 (Tracking ID: 3394803)
SYMPTOM: The vxupgrade(1M) command causes VxFS to panic with the following stack trace:
panic_save_regs_switchstack()
panic
bad_kern_reference()
$cold_pfault()
vm_hndlr()
bubbleup()
vx_fs_upgrade()
vx_upgrade()
$cold_vx_aioctl_common()
vx_aioctl()
vx_ioctl()
vno_ioctl()
ioctl()
syscall()
DESCRIPTION: The panic is caused by a dereference on a NULL device (one of the devices in the DEVLIST shows as a NULL device).
RESOLUTION: The code is modified to skip the NULL devices when the devices in DEVLIST are processed.

* 3436431 (Tracking ID: 3434811)
SYMPTOM: In VxFS 6.1, the vxfsconvert(1M) command hangs within the vxfsl3_getext() function with the following stack trace:
search_type()
bmap_typ()
vxfsl3_typext()
vxfsl3_getext()
ext_convert()
fset_convert()
convert()
DESCRIPTION: There is a type casting problem for extent size.
It may cause a non-zero value to overflow and turn into zero by mistake. This further leads to infinite looping inside the function.
RESOLUTION: The code is modified to remove the intermediate variable and avoid type casting.

* 3448503 (Tracking ID: 3448492)
SYMPTOM: On Solaris SPARC, the Vnode Page Mapping (VPM) interface is introduced in place of the legacy segmap interface.
DESCRIPTION: For improved performance, the VPM interface is introduced, which employs the kernel page mapping (KPM).
RESOLUTION: The code is modified to support the VPM interface.

* 3501832 (Tracking ID: 3413926)
SYMPTOM: Internal testing hangs due to high memory consumption resulting in fork failure.
DESCRIPTION: The issue of high swap usage occurs with recent updates of Solaris 10 and Solaris 11. This issue is predominantly seen with internal stress/noise testing. The recent Solaris update release increased ncpu. As a large number of buffer cache free lists in VxFS are spawned with respect to ncpu, there is high memory consumption which results in fork failure.
RESOLUTION: For more than 16 CPUs, the number of buffer cache free lists is adjusted according to the maximum number of CPUs supported.

* 3504362 (Tracking ID: 3472551)
SYMPTOM: The attribute validation (pass 1d) of full fsck takes too much time to complete.
DESCRIPTION: The current implementation of full fsck Pass 1d (attribute inode validation) is single threaded. This causes slow full fsck performance on large file systems, especially the ones having a large number of attribute inodes.
RESOLUTION: Pass 1d is modified to work in parallel using multiple threads, which enables full fsck to process the attribute inode validation faster.

* 3507608 (Tracking ID: 3478017)
SYMPTOM: Internal test hits assert in voprwunlock.
DESCRIPTION: In the slow path write routine, the inode is not locked with the Vnode Operation (VOP) read-write lock before returning to the Operating System (OS).
RESOLUTION: The code is modified to take the VOP read-write lock in shared mode on the inode before returning to the OS.

* 3512292 (Tracking ID: 3348520)
SYMPTOM: In a Cluster File System (CFS) cluster having a multi-volume file system of a smaller size, execution of the fsadm command causes a system hang if the free space in the file system is low. The following stack trace is displayed:
vx_svar_sleep_unlock()
vx_extentalloc_device()
vx_extentalloc()
vx_reorg_emap()
vx_extmap_reorg()
vx_reorg()
vx_aioctl_full()
vx_aioctl_common()
vx_aioctl()
vx_unlocked_ioctl()
vfs_ioctl()
do_vfs_ioctl()
sys_ioctl()
tracesys()
And
vxg_svar_sleep_unlock()
vxg_grant_sleep()
vxg_api_lock()
vx_glm_lock()
vx_cbuf_lock()
vx_getblk_clust()
vx_getblk_cmn()
vx_getblk()
vx_getmap()
vx_getemap()
vx_extfind()
vx_searchau_downlevel()
vx_searchau_uplevel()
vx_searchau()
vx_extentalloc_device()
vx_extentalloc()
vx_reorg_emap()
vx_extmap_reorg()
vx_reorg()
vx_aioctl_full()
vx_aioctl_common()
vx_aioctl()
vx_unlocked_ioctl()
vfs_ioctl()
do_vfs_ioctl()
sys_ioctl()
tracesys()
DESCRIPTION: While performing the fsadm operation, the secondary node in the CFS cluster is unable to allocate space from the EAU (Extent Allocation Unit) delegation given by the primary node. It requests the primary node for another delegation. While giving such delegations, the primary node does not verify whether the EAU has exclusion zones set on it. It only verifies if it has enough free space. On the secondary node, the extent allocation cannot be done from an EAU which has an exclusion zone set, resulting in a loop.
RESOLUTION: The code is modified such that the primary node does not delegate to the secondary node an EAU which has an exclusion zone set on it.

* 3518943 (Tracking ID: 3534779)
SYMPTOM: Internal stress testing on Cluster File System (CFS) hits a debug assert.
DESCRIPTION: The assert was hit while refreshing the incore reference count queue (rcq) values from the disk in response to a loadfs message.
This refresh races with an rcq processing thread that has already advanced the incore rcq indexes on a primary node in CFS.
RESOLUTION: The code is modified to avoid selective updates in incore rcq.

* 3519809 (Tracking ID: 3463464)
SYMPTOM: Internal kernel functionality conformance test hits a kernel panic due to a null pointer dereference.
DESCRIPTION: In the vx_fsadm_query() function, the error handling code path incorrectly sets the nodeid to "null" in the file system structure. As a result of clearing nodeid, any subsequent access to this field results in the kernel panic.
RESOLUTION: The code is modified to improve the error handling code path.

* 3528770 (Tracking ID: 3449152)
SYMPTOM: The vxtunefs(1M) command fails to set the thin_friendly_alloc tunable in CFS.
DESCRIPTION: The thin_friendly_alloc tunable is not supported on CFS. But when the vxtunefs(1M) command is used to set it in CFS, a false success message is displayed.
RESOLUTION: The code is modified to report an error for the attempt to set the thin_friendly_alloc tunable in CFS.

* 3529852 (Tracking ID: 3463717)
SYMPTOM: CFS does not support the 'thin_friendly_alloc' tunable. And, the vxtunefs(1M) command man page is not updated with this information.
DESCRIPTION: Since the man page does not explicitly mention that the 'thin_friendly_alloc' tunable is not supported, it is assumed that CFS supports this feature.
RESOLUTION: The man page pertinent to the vxtunefs(1M) command is updated to denote that CFS does not support the 'thin_friendly_alloc' tunable.

* 3529862 (Tracking ID: 3529860)
SYMPTOM: The package verification using the "pkg verify" command fails for VRTSglm, VRTSgms, and VRTSvxfs packages on Solaris 11.
DESCRIPTION: The package verification using the "pkg verify" command fails for the Group Lock Manager (GLM), Group Messaging Services (GMS), and VxFS packages on Solaris 11 due to a missing minor node permission '* 0640 root sys' from the etc/minor_perm file.
RESOLUTION: The code is modified to update the entry in the etc/minor_perm file.

* 3530038 (Tracking ID: 3417321)
SYMPTOM: The vxtunefs(1M) man page gives an incorrect description of the "delicache_enable" tunable.
DESCRIPTION: According to the current design, the tunable "delicache_enable" is enabled by default both in case of local mount and cluster mount. But, the man page is not updated accordingly. It still specifies that this tunable is enabled by default only in case of a local mount. The man page needs to be updated to correct the description.
RESOLUTION: The code is modified to update the man page of the vxtunefs(1M) command to display the correct contents for the "delicache_enable" tunable. Additional information is provided with respect to the performance benefits, which in case of CFS are limited as compared to the local mount. Also, in case of CFS, unlike the other CFS tunable parameters, there is a need to explicitly turn this tunable on or off on each node.

* 3541125 (Tracking ID: 3541083)
SYMPTOM: The vxupgrade(1M) command for layout version 10 creates 64-bit quota files with inappropriate permission configurations.
DESCRIPTION: Layout version 10 supports the 64-bit quota feature. Thus, while upgrading to version 10, 32-bit external quota files are converted to 64-bit. During this conversion process, 64-bit files are created without specifying any permission. Hence, random permissions are assigned to the 64-bit file, which creates an impression that the conversion process was not successful as expected.
RESOLUTION: The code is modified such that appropriate permissions are provided while creating 64-bit quota files.

INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
To install the patch perform the following steps on at least one node in the cluster:
1. Copy the patch fs-sol11_sparc-Patch-6.1.1.400.tar.gz to /tmp
2.
Untar fs-sol11_sparc-Patch-6.1.1.400.tar.gz to /tmp/hf
# mkdir /tmp/hf
# cd /tmp/hf
# gunzip /tmp/fs-sol11_sparc-Patch-6.1.1.400.tar.gz
# tar xf /tmp/fs-sol11_sparc-Patch-6.1.1.400.tar
3. Install the hotfix
# pwd
/tmp/hf
# ./installVRTSvxfs611P400 [ ...]

You can also install this patch together with 6.1.1 maintenance release using Install Bundles
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 6.1.1 directory and invoke the installmr script with -patch_path option where -patch_path should point to the patch directory
# ./installmr -patch_path [] [ ...]

Install the patch manually:
--------------------------
1. pkg uninstall VRTSvxfs
2. pkg unset-publisher Symantec
3. pkg unset-publisher Veritas
4. pkg set-publisher -g Veritas
5. pkg install --accept -g VRTSvxfs

REMOVING THE PATCH
------------------
1. pkg uninstall VRTSvxfs

KNOWN ISSUES
------------
* Tracking ID: 3877575
SYMPTOM: File System migration may hit kernel panic.
WORKAROUND: No

SPECIAL INSTRUCTIONS
--------------------
NONE

OTHERS
------
NONE