 * * * READ ME * * *
 * * * Veritas File System 5.0 MP3 RP5 * * *
 * * * P-patch 1 * * *

Patch Date: 2013-01-04


This document provides the following information:

   * PATCH NAME
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH


PATCH NAME
----------
Veritas File System 5.0 MP3 RP5 P-patch 1


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSvxfs


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * Veritas Cluster Server 5.0 MP3
   * Veritas File System 5.0 MP3
   * Veritas Storage Foundation for Oracle RAC 5.0 MP3
   * Veritas Storage Foundation Cluster File System 5.0 MP3
   * Veritas Storage Foundation 5.0 MP3
   * Veritas Storage Foundation High Availability 5.0 MP3
   * Veritas Storage Foundation for DB2 5.0 MP3
   * Veritas Storage Foundation for Oracle 5.0 MP3


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
AIX 5.3 ppc
AIX 6.1 ppc
AIX 7.1 ppc


INCIDENTS FIXED BY THE PATCH
----------------------------
This patch fixes the following Symantec incidents:

Patch ID: 5.0.3.510

* 2377088 (Tracking ID: 2372093)

SYMPTOM:
In a Cluster File System (CFS) environment, file read performance can
gradually degrade to as little as 10% of the original read performance, and
the "fsadm -F vxfs -D -E" output shows a large number (> 70%) of free blocks
in extents smaller than 64k. For example:

% Free blocks in extents smaller than 64 blks: 73.04
% Free blocks in extents smaller than  8 blks:  5.33

DESCRIPTION:
In a CFS environment, the disk space is divided into Allocation Units (AUs).
The delegation for these AUs is cached locally on the nodes. When an
extending write operation is performed on a file, the file system tries to
allocate the requested block from an AU whose delegation is locally cached,
rather than finding the largest free extent of the requested size in the
other AUs. This fragments the free space, which in turn leads to badly
fragmented files.

RESOLUTION:
The code is modified so that the time for which the delegation of an AU is
cached can be reduced using a tunable, thus allowing allocations from other
AUs with larger free extents. Also, the fsadm(1M) command is enhanced to
defragment free space using the -C option.

* 2377975 (Tracking ID: 2346730)

SYMPTOM:
On the AIX platform, a method of identifying the pinned memory in use by
VxFS and GLM is required.

DESCRIPTION:
The VxFS inode cache, buffer cache, and Directory Name Lookup Cache (DNLC)
are the main consumers of pinned memory in VxFS. Counters are required that
detail the total pinned memory in use by VxFS and by GLM, plus a breakdown
of the pinned memory in use by these three VxFS caches.

RESOLUTION:
New vxfsstat counters have been added to track VxFS pinned memory usage;
they can be displayed using the "vxfsstat -m" option. A new option,
"glmstat -p", has been added to display the pinned memory currently in use
by GLM.

* 2403128 (Tracking ID: 2403126)

SYMPTOM:
A hang is seen in the cluster when one of the nodes leaves the cluster or is
rebooted. One of the nodes in the cluster will contain the following stack
trace:

e_sleep_thread()
vx_event_wait()
vx_async_waitmsg()
vx_msg_send()
vx_send_rbdele_resp()
vx_recv_rbdele+00029C ()
vx_recvdele+000100 ()
vx_msg_recvreq+000158 ()
vx_msg_process_thread+0001AC ()
vx_thread_base+00002C ()
threadentry+000014 (??, ??, ??, ??)

DESCRIPTION:
Whenever a node in the cluster leaves, reconfiguration happens and all the
resources held by the leaving nodes are consolidated. This is done on one
node of the cluster, called the primary node. Each node sends a message to
the primary node describing the resources it currently holds. During this
reconfiguration, in a corner case, VxFS incorrectly calculates a message
length that is larger than what the GAB (Veritas Group Membership and Atomic
Broadcast) layer can handle. As a result, the message is dropped at the
sender and never sent, while the sender believes it was sent and waits for
an acknowledgement. The primary node, waiting for this message, waits
forever, so the reconfiguration never completes and the cluster hangs.

RESOLUTION:
The message length is now calculated correctly, so GAB can handle the
messages.

* 2515105 (Tracking ID: 2515101)

SYMPTOM:
The "svmon -O vxfs=on" option can be used to collect VxFS file system
details. With this enabled, subsequently executing the "svmon -S" command
can generate a system panic in the svm_getvxinode_gnode routine when it
tries to collect information from the VxFS segment control blocks:

16)> f
pvthread+838900 STACK:
[F100000090704A38]perfvmmstat:svm_getvxinode_gnode+000038

DESCRIPTION:
VxFS creates and deletes AIX Virtual Memory Management (VMM) structures
called Segment Control Blocks (SCBs) via VMM interfaces. VxFS was leaking
SCBs via one specific code path. The "svmon -S" command parses a global list
of SCB structures, including any SCBs leaked by VxFS. If svmon is also
collecting information about VxFS file systems, the gnode element of each
SCB is dereferenced. For a leaked SCB, the gnode is stale and may contain
unrelated content; reading and dereferencing this content can generate the
panic.

RESOLUTION:
A very simple and low-risk change now prevents Segment Control Blocks from
being leaked; the SCBs are now correctly removed by VxFS.

* 2517944 (Tracking ID: 2517942)

SYMPTOM:
Poor VxFS performance is seen for applications writing to a memory-mapped
(mmaped) file that was written to before being mmaped, and which therefore
already has all of the associated pages in memory. Large page-ins are
observed as soon as the writes begin.

DESCRIPTION:
A buffered write to a file brings the associated page into memory, and such
a page can be written to. However, mmaping the file afterwards marks the
page read-only, so any mmaped write to the same page encounters a protection
fault, leading to a page-in.

RESOLUTION:
To prevent the large page-ins, the extra overhead of the protection fault is
removed by marking pages read-write in the mmap range wherever possible.
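The access pattern behind this incident can be sketched in a few lines of C.
The program below is illustrative only, not Symantec's test case; the file
path and the 1 MB size are arbitrary. It populates a file with buffered
write(2) calls, bringing its pages into memory, and then stores through an
mmap of the same file, which is the sequence that previously triggered
protection faults and page-ins:

/*
 * Minimal sketch of the workload described above. The path below is an
 * arbitrary file on a VxFS mount point.
 */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define FILESIZE (1024 * 1024)

int main(void)
{
    char *buf = calloc(1, FILESIZE);
    int fd = open("/vxfs_mnt/testfile", O_CREAT | O_RDWR, 0644);
    if (fd < 0 || buf == NULL)
        return 1;

    /* Buffered write: brings the file's pages into memory. */
    if (write(fd, buf, FILESIZE) != FILESIZE)
        return 1;

    /* Map the same file; before the fix, the already-resident pages
     * were left read-only in the mapping. */
    char *map = mmap(NULL, FILESIZE, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return 1;

    /* Before the fix, each store below faulted and paged in, despite
     * the pages already being resident. */
    memset(map, 0x5a, FILESIZE);

    munmap(map, FILESIZE);
    close(fd);
    free(buf);
    return 0;
}

With the fix applied, the stores through the mapping proceed without the
protection faults, because the pages are mapped read-write where possible.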
* 2711974 (Tracking ID: 2650330)

SYMPTOM:
Concurrent access to a file opened with the O_NSHARE mode by multiple
processes on AIX can cause a file system hang.

DESCRIPTION:
There are two different hang scenarios. In the first, a deadlock occurs
between open threads and a freeze operation. For example, a freeze thread
T1, issued by commands such as umount or fsadm, stops new threads from
getting the active level while it waits for an older thread T2, which holds
the active level but is waiting for the ilock. Meanwhile, thread T3, which
holds the ilock, cannot get the active level because of the freeze thread
T1. The stack traces look as follows:

T1:
vx_async_waitmsg+00001C
vx_msg_broadcast+000118
vx_cwfa_step+0000A0
vx_cwfreeze_common+0000F8
vx_cwfreeze_all+0002E8
vx_freeze+000038
vx_detach_fset+000394
vx_unmount+0001AC
vx_unmount_skey+000034

T2:
simple_lock+000058
vx_ilock+000020
vx_close1+000720
vx_close+00006C
vx_close_skey+00003C
vnop_close+000094
vno_close+000050
closef+00005C

T3:
vx_delay+000010
vx_active_common_flush+000038
vx_open_modes+00058C
vx_open1+0001FC
vx_open+00007C
vx_open_skey+000044

The second scenario involves the wakeup path for the ilock, which treated
the ilock as a complex lock rather than a simple lock.

RESOLUTION:
The ilock is now given up before the active level is attempted, and the
wakeup function treats the ilock as a simple lock instead of a complex lock.

* 3004492 (Tracking ID: 2848948)

SYMPTOM:
VxFS buffer cache consumption increases significantly after the system has
been running for more than 248 days. The problem is specific to the AIX
platform.

DESCRIPTION:
As a fundamental concept, the buffer cache holds copies of data
corresponding to blocks containing file system metadata (directory blocks,
indirect blocks, raw inode structures, and many other data types). These
buffers are held in memory until they are explicitly invalidated for some
reason, the memory is needed for other purposes, or they have not been
accessed recently. Every UNIX system provides a time counter for storing the
system time value; on AIX it is a 64-bit variable named lbolt. The problem
is that the data type the VxFS code assumed was correct for saving this
value, clock_t, is only a 32-bit type on AIX. Due to this oversight, on any
AIX system that has been running long enough, the value of lbolt is
improperly truncated when cast to a variable of type clock_t, and the value
saved for the age of a buffer can become negative. The code that compares
the various saved and generated timestamps then calculates wrong time
differences, because the higher bits of the clock are lost, and can conclude
that buffers are effectively newer than the current time. Because of these
issues, any AIX system running VxFS may encounter a hang once the value in
lbolt exceeds the maximum signed 32-bit integer (at 100 clock ticks per
second, this takes roughly 248 days, matching the observed symptom).

RESOLUTION:
The data types of lbolt and of all the variables where this time value is
stored are corrected to be consistent (64-bit).
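The arithmetic behind the 248-day figure can be demonstrated with a short,
self-contained C sketch. This is illustrative only, not VxFS code: int32_t
stands in for the 32-bit clock_t, int64_t models the 64-bit lbolt counter,
and the 100-ticks-per-second rate is assumed to be the usual AIX clock rate.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define HZ 100  /* assumed clock ticks per second */

int main(void)
{
    /* lbolt one tick past the signed 32-bit maximum: about 248.6 days
     * of uptime at 100 ticks per second. */
    int64_t lbolt = (int64_t)INT32_MAX + 1;

    /* The faulty narrowing cast; on two's-complement systems the value
     * wraps to a large negative number. */
    int32_t saved = (int32_t)lbolt;

    printf("uptime: %.1f days\n", (double)lbolt / HZ / 86400);
    printf("64-bit lbolt:    %" PRId64 "\n", lbolt);
    printf("saved as 32-bit: %" PRId32 "\n", saved);  /* negative */

    /* Any buffer age computed as "now - saved" is now nonsensical, so
     * buffers can appear newer than the current time. */
    return 0;
}

The fix widens the saved timestamps to 64 bits so that the comparison
arithmetic no longer loses the high bits of the clock.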
INSTALLING THE PATCH
--------------------
If the currently installed VRTSvxfs is below the 5.0.3.0 level, you must
upgrade VRTSvxfs to the 5.0.3.0 level before installing this patch.

AIX maintenance levels and APARs can be downloaded from the IBM Web site:

   http://techsupport.services.ibm.com

Install the VRTSvxfs.bff patch if VRTSvxfs is already installed at fileset
level 5.0.3.0.

A system reboot is required after installing this patch.

To apply the patch, first unmount all VxFS file systems, then enter these
commands:

# mount | grep vxfs
# cd
# installp -aXd VRTSvxfs.bff VRTSvxfs
# reboot


REMOVING THE PATCH
------------------
If you need to remove the patch, first unmount all VxFS file systems, then
enter these commands:

# mount | grep vxfs
# installp -r VRTSvxfs
# reboot


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE