README VERSION		: 1.1
README CREATION DATE	: 2012-03-09
PATCH-ID 		: PVCO_03952 
PATCH NAME		: VRTSvxfs 6.0RP1
BASE PACKAGE NAME	: VRTSvxfs
BASE PACKAGE VERSION	: 6
OBSOLETE PATCHES	: NONE
SUPERSEDED PATCHES	: NONE
REQUIRED PATCHES	: PVKL_03951
INCOMPATIBLE PATCHES	: NONE
SUPPORTED PADV		: hpux1131
(P-PLATFORM , A-ARCHITECTURE , D-DISTRIBUTION , V-VERSION)
PATCH CATEGORY		:  CORE ,  CORRUPTION ,  HANG ,  PANIC ,  PERFORMANCE
REBOOT REQUIRED		: NO

PATCH INSTALLATION INSTRUCTIONS:
--------------------------------


Please refer to the Release Notes for Installation and Uninstallation Instructions.

PATCH UNINSTALLATION INSTRUCTIONS:
----------------------------------
Please refer to the Release Notes for Installation and Uninstallation Instructions.

SPECIAL INSTRUCTIONS:
---------------------
NONE

SUMMARY OF FIXED ISSUES:
-----------------------------------------
2627108  Expanding or shrinking a DLV5 file system using the fsadm(1M) command causes a system panic.
2645821  The fscdsconv(1M) command dumps core when it is used to convert a corrupted or non-VxFS file system.
2645827  The system may panic while re-organizing the file system using the fsadm(1M) command. 
2653418  The performance of vxbench degrades for buffered writes on the HP-UX platform. 
2654505  Listing of a partitioned directory using the DMAPI does not list all the entries. 
2654506  Metadata corruption may be seen after recovery. 
2654770  Upgrade of a file system from version 8 to 9 fails in the presence of partitioned directories and clones.
2654773  In certain cases, a write to a regular file that has a shared extent as its last allocated extent can fail with an EIO error.
2654783  A write operation on a regular file that maps to a shared compressed extent results in corruption.
2654790  In certain rare cases, if the source file is deleted very shortly after a successful vxfilesnap operation, the destination file can get corrupted, and the VX_FULLFSCK flag can also be set in the super block.
2660094  On a cluster mounted file system, the mount(1M) command hangs. 
2661222  In a cluster mounted file system, memory corruption is seen during the execution of the SmartMove feature. 
2679504  A write(2) operation exceeding the quota limit fails with an EDQUOT error. 
2679523  Duplicate file names can be seen in a directory. 
2684895  A deadlock occurs because the delayed allocation list lock is taken at the wrong spin lock interrupt level.


SUMMARY OF KNOWN ISSUES:
-----------------------------------------
2654682  Full file system check using fsck_vxfs(1M) takes over a week.
2654685  On a cluster mounted file system, ls(1) with the -l option may take longer than expected.
2627081  vxfsd takes a lot of CPU time after some directories are deleted.
2627096  fsckptadm(1M) fails with ENXIO.
2627101  On a file system with low free memory, commands such as ls and find may appear to hang.
2627084  umount(1M) on a CFS file system panics the machine.
2627089  mmap(2) performance is lower on VxFS 5.0.1 with HP-UX 11.31.
2654673  A cluster mounted file system may panic the system, showing vx_tflush_map in the stack trace.
2684732  fsppadm(1M) dumps core with SIGSEGV while assigning a policy.


KNOWN ISSUES : 
--------------


 * INCIDENT NO::2654682	 TRACKING ID ::2628207

SYMPTOM:: On a large file system with many checkpoints, the fsck operation may 
appear to hang. 

WORKAROUND:: None 

 * INCIDENT NO::2654685	 TRACKING ID ::2651922

SYMPTOM:: On a cluster mounted file system, ls(1) with the -l option may take longer than expected. 

WORKAROUND:: None 

 * INCIDENT NO::2627081	 TRACKING ID ::2129455

SYMPTOM:: vxfsd takes a lot of CPU time after some directories are deleted. 

WORKAROUND:: None. 

 * INCIDENT NO::2627096	 TRACKING ID ::1956458

SYMPTOM:: fsckptadm(1M) fails with ENXIO and the file system is marked for full fsck. 

WORKAROUND:: None. 

 * INCIDENT NO::2627101	 TRACKING ID ::2359706

SYMPTOM:: On a file system with low free memory, commands such as ls and find may 
appear to hang. 

WORKAROUND:: None 

 * INCIDENT NO::2627084	 TRACKING ID ::2107152

SYMPTOM:: In rare corner cases, the system panics while unmounting a cluster mounted 
file system. 

WORKAROUND:: None 

 * INCIDENT NO::2627089	 TRACKING ID ::2183320

SYMPTOM:: mmap(2) performance is lower on VxFS 5.0.1 with HP-UX 11.31. 

WORKAROUND:: None 

 * INCIDENT NO::2654673	 TRACKING ID ::2558892

SYMPTOM:: VxFS causes a server to panic. The subroutine initiating the panic is 
vx_tflush_map. 

WORKAROUND:: None 

 * INCIDENT NO::2684732	 TRACKING ID ::2715414

SYMPTOM:: fsppadm(1M) dumps core with SIGSEGV while assigning a policy. 

WORKAROUND:: Increase the pthread stack size before running fsppadm by using the 
following command:
export PTHREAD_DEFAULT_STACK_SIZE=2048000 
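
When the command is launched from a script rather than an interactive shell, the
same workaround can be applied to the child process only. The following is a
minimal sketch in Python; the helper names env_with_larger_stack and
run_with_larger_stack are hypothetical, and the only value taken from this README
is the variable name and its setting.

```python
import os
import subprocess

# Value taken from the workaround above.
STACK_SIZE = "2048000"


def env_with_larger_stack(base_env=None):
    """Return a copy of the environment with the pthread stack size raised."""
    env = dict(os.environ if base_env is None else base_env)
    env["PTHREAD_DEFAULT_STACK_SIZE"] = STACK_SIZE
    return env


def run_with_larger_stack(argv):
    """Run argv (a list of strings) with the raised stack size; return its exit code."""
    return subprocess.run(argv, env=env_with_larger_stack()).returncode
```

Here argv is whatever fsppadm command line would otherwise be run; the calling
shell's own environment is left unchanged.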

FIXED INCIDENTS: 
----------------


 PATCH ID:PVCO_03952

 * INCIDENT NO:2627108	 TRACKING ID:2599590

SYMPTOM: Expansion of a 100% full file system may panic the machine with the following 
stack trace.
 
bad_kern_reference()
$cold_vfault()
vm_hndlr()
bubbledown()
vx_logflush()
vx_log_sync1()
vx_log_sync()
vx_worklist_thread()
kthread_daemon_startup() 

DESCRIPTION: When a 100% full file system is expanded, the intent log of the file 
system is truncated and the freed blocks are used during the expansion. Due to a 
bug, the block map of the replica intent log inode was not updated, which caused 
the block maps of the two inodes to differ. As a result, some of the in-core 
structures of the intent log became NULL, and the machine panicked while 
dereferencing one of these structures. 

RESOLUTION:The block map of the replica intent log inode is now updated correctly. A 
100% full file system can now be expanded only if the last extent in the intent 
log contains more than 32 blocks; otherwise fsadm fails. To expand such a file 
system, delete some of the files manually and retry the resize. 

 * INCIDENT NO:2645821	 TRACKING ID:2536130

SYMPTOM: Issue 1:
If the fscdsconv command is used to convert 
    a) a corrupted or
    b) a non-VxFS 
file system, it crashes with the following message instead of exiting with a 
proper error message:

devops.c    1693:	ASSERT(0) failed 

DESCRIPTION: Before starting to convert a file system, VxFS performs a sanity check 
on it. During the sanity check of a corrupted file system, the command 
incorrectly tries to close the file system twice, and the program crashes on 
the second close. 

RESOLUTION:The fix ensures that close is not called again once it has been called. If 
fscdsconv is now run on a corrupted or non-VxFS file system, it exits with the 
following error message:
UX:vxfs fscdsconv: ERROR: V-3-20318: file system dirty, run fsck first
UX:vxfs fscdsconv: ERROR: V-3-24426: fscdsconv: Failed to migrate.

Issue 2:
If the partitioned directory feature is enabled on a file system and the 
file system is exported from the Linux platform to the AIX platform, the mount 
command asserts with "f:VX_FCL_INIT_FAILED:ndebug". 
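
The close-once behavior described for Issue 1 can be illustrated with a small
sketch. This is not the fscdsconv source; FsHandle and sanity_check are
hypothetical names, and the example only shows how an early close on an error
path and a common cleanup close can coexist without a second real close.

```python
class FsHandle:
    """Hypothetical wrapper around an opened file system device."""

    def __init__(self, close_fn):
        self._close_fn = close_fn  # callable that releases the underlying resource
        self._closed = False

    def close(self):
        """Release the resource on the first call; later calls are no-ops."""
        if self._closed:
            return
        self._closed = True
        self._close_fn()


def sanity_check(handle, is_clean):
    """Return an error string for a dirty file system, closing the handle once."""
    try:
        if not is_clean:
            handle.close()  # early close on the error path
            return "file system dirty, run fsck first"
        return None
    finally:
        handle.close()  # common cleanup; safe even after the early close
```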

 * INCIDENT NO:2645827	 TRACKING ID:2552095

SYMPTOM: While re-organizing the file system using fsadm(1M), the system may 
panic with the following stack trace:
vx_iget
vx_aioctl_full
vx_aioctl_common
vx_aioctl
vx_ioctl
vx_ioctl_skey 

DESCRIPTION: Due to a race condition between the functions vx_inactive and vx_iget, 
an inode that is on the freelist can be handed out erroneously, leading to a 
panic similar to the one shown above. 

RESOLUTION:The code is modified to take the necessary protection while updating the 
inode pointer, thus avoiding the race. 

 * INCIDENT NO:2653418	 TRACKING ID:2646933

SYMPTOM: Large sequential writes take extra time to complete compared to the 
previous version of the software. The degradation in write performance is 
observed for large sequential writes on the order of gigabytes. 

DESCRIPTION: This issue was introduced by the omission of an optimization flag in 
the call to the operating system's virtual memory interfaces. The flag, 
FVF_FLUSH_BHND, is used in the virtual memory API call that flushes a file. 
The issue has no effect when direct I/O is used for writes. The performance 
degradation is observed only when the delayed allocation feature is used (the 
feature is ON by default). The bug has no effect if the file system is mounted 
with the cluster option. 

RESOLUTION:The correct flag is now passed when flushing a file. 

 * INCIDENT NO:2654505	 TRACKING ID:2624459

SYMPTOM: Listing of a partitioned directory through DMAPI does not list all the 
entries in certain scenarios. 

DESCRIPTION: A partitioned directory that is visible in the namespace has multiple 
hidden directories under it, and its entries are distributed across those 
hidden directories. A readdir operation on the visible directory is expected to 
traverse all the hidden directories and list their entries. However, some of 
the hidden directories were skipped due to incorrect offset manipulation, so 
the entries in those hidden directories were not reported. 

RESOLUTION:The offset manipulation problem is fixed to ensure that all hidden 
directories are traversed and their contents are reported. 
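
The traversal described above can be sketched as a resumable listing over
several hidden directories. This is an illustration only, not the VxFS or DMAPI
implementation: the (dir_index, offset) cookie, the function names, and the
in-memory lists standing in for hidden directories are all assumptions.

```python
def readdir(hidden_dirs, cookie, max_entries):
    """List up to max_entries names (max_entries >= 1) starting at cookie.

    hidden_dirs is a list of lists of names, one list per hidden directory.
    cookie is (dir_index, offset_in_dir); (0, 0) starts from the beginning.
    Returns (names, next_cookie); next_cookie is None when the listing is done.
    """
    dir_index, offset = cookie
    names = []
    while dir_index < len(hidden_dirs):
        entries = hidden_dirs[dir_index]
        while offset < len(entries) and len(names) < max_entries:
            names.append(entries[offset])
            offset += 1
        if offset < len(entries):
            # Buffer is full in the middle of a directory: resume here next time.
            return names, (dir_index, offset)
        # This hidden directory is exhausted: move to the next one and reset
        # the offset, so that no hidden directory is skipped.
        dir_index += 1
        offset = 0
    return names, None


def list_all(hidden_dirs, batch=2):
    """Call readdir repeatedly until every hidden directory has been listed."""
    cookie, out = (0, 0), []
    while cookie is not None:
        names, cookie = readdir(hidden_dirs, cookie, batch)
        out.extend(names)
    return out
```

The point of the sketch is the cookie handling at a directory boundary: the
offset is reset only when the directory index advances, which is the kind of
bookkeeping that leads to skipped directories when it goes wrong.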

 * INCIDENT NO:2654506	 TRACKING ID:2613884

SYMPTOM: Metadata corruption may be seen after recovery 

DESCRIPTION: If the primary dies while a node is getting a delegation, a flag is
set instructing the node to put the delegation even though the node did not get
it. Because of this, when the operation is retried after recovery, the node may
put a delegation that it no longer has (some other node may have it), leading to
corruption. 

RESOLUTION:This case is handled by resetting the flag on retry if the
primary dies. 

 * INCIDENT NO:2654770	 TRACKING ID:2583197

SYMPTOM: Upgrade of a file system from version 8 to 9 fails in the presence of
partitioned directories and clones. 

DESCRIPTION: In version 9, the hash function for partitioned directories was changed.
As a result, while upgrading from version 8 to 9, all partitioned directories
need to be converted back to regular directories. The code has restrictions that
prevent modifying read-only clones. These restrictions blocked the conversion of
partitioned directories to regular directories on read-only clones, so the
upgrade failed with EROFS. 

RESOLUTION:The code is fixed to allow read-only clones to be modified as well during
an upgrade from version 8 to 9. 

 * INCIDENT NO:2654773	 TRACKING ID:2645108

SYMPTOM: A write to a regular file can fail with EIO in certain cases. 

DESCRIPTION: In certain cases, on a file that has a shared extent allocated as 
its last extent, an extending write beyond EOF can fail with an EIO error 
because VxFS erroneously looks for allocated extents beyond the largest 
permissible file offset. 

RESOLUTION:Corrected the code to not look for extents beyond EOF when not 
necessary. 

 * INCIDENT NO:2654783	 TRACKING ID:2645112

SYMPTOM: A write operation on a shared compressed extent can result in 
corruption of the compressed data associated with that file. 

DESCRIPTION: In certain cases, a write operation on a shared compressed extent 
can lead to corruption of the compressed extent associated with a file. This 
can happen because the copy-on-write operation does not split a shared 
compressed extent at the right extent boundary. 

RESOLUTION:Corrected the code to handle this case and avoid breaking a 
compressed extent during the copy-on-write operation. 

 * INCIDENT NO:2654790	 TRACKING ID:2645109

SYMPTOM: In certain rare cases, if the source file is deleted very shortly after 
a successful execution of the vxfilesnap command, the destination file can get 
corrupted. This can also lead to the VX_FULLFSCK flag being set in the super 
block. 

DESCRIPTION: In certain rare cases, a successful execution of the vxfilesnap 
command can lead to the allocation of a new indirect extent to the source and 
destination files. If the source file is then deleted immediately after the 
filesnap operation, the file system may decide not to flush the data belonging 
to this indirect extent, which may result in corruption of the destination 
file. Also, depending on what data previously existed in the indirect extent, 
the file system's VX_FULLFSCK flag can get set in the super block. 

RESOLUTION:In the case described above, the code now flushes any newly 
allocated indirect extent immediately after the successful execution of the 
filesnap operation. 

 * INCIDENT NO:2660094	 TRACKING ID:2580905

SYMPTOM: On a cluster mounted file system, the mount(1M) command hangs. 

DESCRIPTION: While mounting the file system, the code unnecessarily queries the re-org
status and the device information. These queries race with other updates, causing
the hang. 

RESOLUTION:Code is modified to remove the extra re-org status and device information
queries, thus reducing the possibility of hang. 

 * INCIDENT NO:2661222	 TRACKING ID:2660761

SYMPTOM: The system panics due to memory corruption. The panic stack can vary. With 
the postmark benchmark, the most common stack was:

page_fault+0x1f/0x30
vx_iupdat_cluster+0x6c/0x390 [vxfs]
vx_iupdat_local+0x32e/0x610 [vxfs]
vx_cfs_iupdat+0x4d/0x150 [vxfs]
vx_tflush_inode+0x11e/0x180 [vxfs]
vx_fsq_flush+0x2fe/0x7d0 [vxfs]
vx_fsflush_fsq+0x93/0xc0 [vxfs]
vx_workitem_process+0xb/0x20 [vxfs]
vx_worklist_process+0x115/0x260 [vxfs]
vx_worklist_thread+0x5d/0xa0 [vxfs]
vx_kthread_init+0x75/0x90 [vxfs]
child_rip+0xa/0x20 

DESCRIPTION: The code that handles SmartMove requests for cluster mounted file systems 
had a bug that, under certain conditions, could write to memory it did not allocate. 

RESOLUTION:The code was changed to ensure that SmartMove does not touch memory that 
it did not allocate. 

 * INCIDENT NO:2679504	 TRACKING ID:2566875

SYMPTOM: A write(2) that exceeds a quota limit fails with an EDQUOT error (that is,
"Disc quota exceeded") before the user quota limit is reached. As a result, the
part of the write that fits below the quota limit is not performed. A partial
write of the length that fits within the quota limit should be performed instead
of just returning an EDQUOT error. 

DESCRIPTION: When a write request exceeds a quota limit, the EDQUOT error should be
handled so that VxFS can allocate space up to the hard quota limit and proceed
with a partial write. However, VxFS did not handle this error, and the error was
returned without a partial write being performed. 

RESOLUTION:The EDQUOT error from the extent allocation routine is now handled by
retrying the write with a length that fits within the quota limit. 
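
From an application's point of view, the corrected behavior means that a write
crossing the quota limit may return a short count rather than failing outright.
The following is a minimal caller-side sketch in Python, assuming standard
write(2) semantics; the function name write_until_quota is hypothetical and the
code is not part of VxFS.

```python
import errno
import os


def write_until_quota(fd, data):
    """Write data to fd and return how many bytes were actually written.

    Short writes are continued. If the quota is hit after some bytes were
    written, the partial count is returned instead of raising.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            n = os.write(fd, view[written:])
        except OSError as exc:
            if exc.errno == errno.EDQUOT and written > 0:
                break  # quota reached: report the partial write
            raise  # nothing written, or an unrelated error
        if n == 0:
            break  # defensive: no progress was made
        written += n
    return written
```

A caller that compares the returned count with len(data) can tell a complete
write from one that was cut short by the quota.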

 * INCIDENT NO:2679523	 TRACKING ID:2670022

SYMPTOM: Duplicate file names can be seen in a directory. 

DESCRIPTION: VxFS maintains an internal directory name lookup cache (DNLC) to improve 
the performance of directory lookups. A race condition occurs in the DNLC list 
manipulation code during lookup or creation of file names longer than 32 
characters (which also affects other file creations). This causes the DNLC to 
hold a stale entry for an existing file in the directory. A lookup of such a 
file through the DNLC reports the file as non-existent, which allows a duplicate 
file name to be created in the directory. 

RESOLUTION:Fixed the race condition by protecting the DNLC lists through proper locks. 
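
The effect of protecting a name cache with a lock can be shown with a toy
example. This is a sketch under stated assumptions, not the DNLC code: the
NameCache class, its fields, and the single-lock design are hypothetical and
only illustrate why lookup and insert must happen atomically.

```python
import threading


class NameCache:
    """Toy name cache; a stand-in for a DNLC, not VxFS's data structure."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # name -> inode number
        self._next_ino = 1

    def lookup_or_create(self, name):
        """Return (inode, created).

        Lookup and insert happen under one lock, so two concurrent creators
        can never both see the name as missing.
        """
        with self._lock:
            ino = self._entries.get(name)
            if ino is not None:
                return ino, False
            ino = self._next_ino
            self._next_ino += 1
            self._entries[name] = ino
            return ino, True
```

Without the lock, two threads could both miss on the same name and both create
it, which is the duplicate-name outcome described in the symptom.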

 * INCIDENT NO:2684895	 TRACKING ID:2655754

SYMPTOM: Several vxfsd kernel threads are stuck trying to take a simple_lock. System 
responsiveness may drop drastically. One may observe vxfskd threads with the 
following stack trace in the kernel thread list:

(0)> f pvthread+010600
pvthread+010600 STACK:
Use current context [F000000032099600] of cpu 3
[0001151C]krlock_confer_norestart+000000 ()
[004B1820]krlock+0003A0 (??, ??)
[00534138]slock_krlock_acquire+000258 (??, ??, ??)
[005349B8]slock+000558 (??, ??)
[00009558].simple_lock+000058 ()
[0487613C].vx_idalloc_off+0002F8 ()
[048CDF64].vx_iflush_list+0009BC ()
[048CE1FC].vx_iflush+00009C ()
[048C8304].vx_workitem_process+00003C ()
[048D357C].vx_worklist_process+0000F4 ()
[048D37B4].vx_worklist_thread+000090 ()
[04855B28].vx_thread_base+000034 ()
[0035B6DC]threadentry+00005C (??, ??, ??, ??)
[kdb_read_mem] no real storage @ FFFFFFFFFFF9600 

DESCRIPTION: This is a deadlock resulting from the wrong ipl (interrupt level) being 
used in a file system spinlock. The locks are related to the delayed allocation 
feature, so the issue does not exist if the feature is unused (the feature is ON 
by default). The bug has no effect if the file system is mounted with the 
cluster option. 

RESOLUTION:The spin lock initialization and use are modified to use the correct ipl. 

INCIDENTS FROM OLD PATCHES:
---------------------------
NONE