README VERSION               : 1.1
README CREATION DATE         : 2012-03-06
PATCH-ID                     : 6.0.1.0
PATCH NAME                   : VRTSvxvm 6.0RP1
BASE PACKAGE NAME            : VRTSvxvm
BASE PACKAGE VERSION         : 6.0.0.0
OBSOLETE PATCHES             : NONE
SUPERSEDED PATCHES           : NONE
REQUIRED PATCHES             : NONE
INCOMPATIBLE PATCHES         : NONE
SUPPORTED PADV               : rhel5_x86_64,rhel6_x86_64,sles10_x86_64,sles11_x86_64
(P-PLATFORM , A-ARCHITECTURE , D-DISTRIBUTION , V-VERSION)
PATCH CATEGORY               : CORE , CORRUPTION , HANG , MEMORYLEAK , PANIC , PERFORMANCE
REBOOT REQUIRED              : YES

PATCH INSTALLATION INSTRUCTIONS:
--------------------------------
Please refer to the release notes for installation instructions.

PATCH UNINSTALLATION INSTRUCTIONS:
----------------------------------
Please refer to the release notes for un-installation instructions.

SPECIAL INSTALL INSTRUCTIONS:
-----------------------------
NONE

SUMMARY OF FIXED ISSUES:
-----------------------------------------
2589962 Support utility vxfmrmap (deprecating vxfmrshowmap) to display DCO map contents and verification against possible state corruptions
2598525 CVR: memory leaks reported
2605706 write fails on volume on slave node after join which earlier had disks in "lfailed" state
2607793 DMP-ASM: disabling all paths and reboot of the host causes losing of /etc/vx/.vxdmprawdev records
2615288 Site consistency: Both sites become detached after data/dco plex failure at each site, leading to I/O cluster wide outage
2624574 VVR Logowner: local I/O starved with heavy I/O load from Logclient
2625718 vxconfigbackup script error: vxcfgbk_corrupt calling keep_recentcopies with insufficient argument
2625743 while upgrading diskgroup version if rlink is not up to date then the vxrvg shows error but diskgroup version gets updated
2625762 secondary master panics at volkiofree
2625766 I/O hang on master node after storage is removed
2626746 Using vxassist -o ordered and mediatype:hdd options together do not work as expected
2626894 When detached disk after connectivity restoration is tried to reattach gives 'Tagid conflict' error
2626994 'vxdg listtag' should give error message and display correct usage when executed with wrong syntax
2630074 Longevity:sfrac: after 'vxdg destroy' hung (for shared DiskGroup), all vxcommands hang on master
2637183 Intermittent data corruption after a vxassist move
2643124 Install Upgrade : After upgrade to 6.0 encapsulated root disk is marked as 'clone_disk'.
2643125 [abrt] new crash was detected on RHEL6.1 during upgrade due to mod unload, possibly of vxspec
2643126 Display error message when sector size is too large when adding foreign device.
2643134 Failure during validating mirror name interface for linked mirror volume
2643137 read/seek I/O errors during init/define of nopriv slice
2643142 'vxmake -g -d ' fails with very large configuration due to memory leaks
2643151 disks with hpdisk format can't be initialized with private region offset other than 128
2643154 vxtune doesn't accept tunables correctly in human readable format
2643155 VVR: Primary master panic'ed in rv_ibc_freeze_timeout
2643156 CVM: diskgroup activation can hang due to a bug in vxvm kernel code
2644185 vxdmpadm dump core in display_dmpnodes_of_redundancy
2646999 World Writable and unapproved file permissions
2647120 Volume Manager does not recover a failed path on 5.1SP1RP2
2660157 vxtune -r option is printing wrong tunable value
2666174 A small portion of possible memory leak in case of mix (clone and non-cloned) diskgroup import
2668641 vxunroot does not set original menu.lst and fstab files, SUSE 10.0 NETAPP FAS3000 ALUA SANBOOT
2682534 Starting 32TB RAID5 volume fails with V-5-1-10128 Unexpected kernel error in configuration update
2689104 Data Corruption while adding/removing LUNs
2693078 vxconfigd is generating a series of LVM header messages for devices (CLONES/replicated devices) Secondary EMC MirrorView LUNS in an error state

SUMMARY OF KNOWN ISSUES:
-----------------------------------------

KNOWN ISSUES :
--------------
FIXED INCIDENTS:
----------------

PATCH ID:6.0.1.0

* INCIDENT NO:2589962     TRACKING ID:2574752

SYMPTOM:
The existing vxfmrshowmap diagnostic shows invalid output with SF6.0 based instant DCO, and requires the user to find and specify DCO volume attributes in the CLI.

DESCRIPTION:
With SF6.0, the instant DCO has a layout different from that of previous releases. The new layout is not supported by vxfmrshowmap, as the CLI was found to be complex to use.

RESOLUTION:
vxfmrshowmap is being deprecated and a new CLI, vxfmrmap, is being introduced, which is much simpler to use. vxfmrmap adds functionality to check for inconsistencies in the map which could lead to data corruption. As with vxfmrshowmap, vxfmrmap can be used to display the DCO map contents for the volume, which is useful for Symantec Support in the analysis of snapshot related issues.

* INCIDENT NO:2598525     TRACKING ID:2526498

SYMPTOM:
Memory leak after running the automated VVR test case.

DESCRIPTION:
After an IBC request is serviced, the IBC update is queued in the free queue. The update is queued only when its reference count is not zero. In some cases IBC receive and IBC send race each other. During this time the ref_count may not be equal to 0, and the update is queued in the free queue. The free queue is freed by the garbage collector, or is cleaned up when the RVG is removed. However, in some code paths the free queue is set to NULL without freeing the update.

RESOLUTION:
The fix is to no longer reset the free queue, so that either the garbage collector or the RVG delete frees the update.

* INCIDENT NO:2605706     TRACKING ID:2590183

SYMPTOM:
I/Os on newly enabled paths can fail with a reservation conflict error.

DESCRIPTION:
PGR registration is not done while a path is being enabled, so I/Os can fail with a reservation conflict.

RESOLUTION:
PGR registration is now done on newly enabled paths.

* INCIDENT NO:2607793     TRACKING ID:2556467

SYMPTOM:
When dmp_native_support is enabled, ASM (Automatic Storage Management) disks are disconnected from the host, and the host is rebooted, the user defined user-group ownership of the respective DMP (Dynamic Multipathing) devices is lost and ownership is set to default values.

DESCRIPTION:
The user-group ownership records of DMP devices in the /etc/vx/.vxdmprawdev file are refreshed at boot time and only the records of currently available devices are retained. As part of the refresh, records of all the disconnected ASM disks are removed from /etc/vx/.vxdmprawdev, and hence their ownership is set to the default values.

RESOLUTION:
Made code changes so that the file /etc/vx/.vxdmprawdev is not refreshed at boot time.

* INCIDENT NO:2615288     TRACKING ID:2527289

SYMPTOM:
In a Campus Cluster setup, a storage fault may lead to DETACH of all the configured sites. This also results in I/O failure on all the nodes in the Campus Cluster.

DESCRIPTION:
Site detaches are done on site consistent dgs when any volume in the dg loses all the mirrors of a site. During the processing of the DETACH of the last mirror in a site, we identify that it is the last mirror and DETACH the site, which in turn detaches all the objects of that site.

In a Campus Cluster setup we attach a dco volume to any data volume created on a site-consistent dg. The general configuration is to have one DCO mirror on each site. Loss of a single mirror of the dco volume on any node results in the detach of that site.

In a 2 site configuration this particular scenario results in both the dco mirrors being lost simultaneously. While the site detach for the first mirror is being processed, we also signal for DETACH of the second mirror, which ends up DETACHING the second site too.

This is not hit in other tests, as we already have a check to make sure that we do not DETACH the last mirror of a volume. This check is being subverted in this particular case due to the type of storage failure.

RESOLUTION:
Before triggering the site detach, an explicit check is made to see whether we are trying to DETACH the last ACTIVE site.

* INCIDENT NO:2643134     TRACKING ID:2348180

SYMPTOM:
The mirror name is truncated when getting the name of the mirror for a given volume and mirror number.

DESCRIPTION:
VxVM supports volume names of up to 32 characters. However, when getting the name of the mirror for a given volume and mirror number, a miscalculation causes the mirror name to be truncated.

RESOLUTION:
The proper and complete mirror name is returned.

* INCIDENT NO:2643137     TRACKING ID:2565569

SYMPTOM:
VxVM displays read I/O error messages when a VxVM 'nopriv' disk is defined on a partition slice other than slice 2. For example:

VxVM vxdisk ERROR V-5-1-14581 read of block # 3840000 of /dev/vx/rdmp/c1t5d2s4 failed.
VxVM vxdisk ERROR V-5-1-15859 read ID block of dev/vx/rdmp/c1t5d2s4 failed.

DESCRIPTION:
When a VxVM 'nopriv' disk (a disk type that has no private region metadata) is defined on a partition slice other than slice 2, read I/O error messages may be displayed on the terminal. The errors are displayed because VxVM used the wrong disk partition slice to check for ASM signatures. These error messages can be ignored, since they do not prevent the 'nopriv' disk from being created.

RESOLUTION:
The code has been modified to use the "full" partition slice when checking for ASM signatures on the disk.

* INCIDENT NO:2643142     TRACKING ID:2627056

SYMPTOM:
vxmake(1M) command when run with a very large

DESCRIPTION:
Due to a memory leak in the vxmake(1M) command, the data section limit for the process was reached. As a result, further memory allocations failed and the vxmake command failed with the above error.

RESOLUTION:
Fixed the memory leak by freeing the memory after it has been used.

* INCIDENT NO:2643154     TRACKING ID:2600863

SYMPTOM:
vxtune does not accept tunable values correctly in human readable format.

# vxtune volpagemod_max_memsz 10k
# vxtune -uh volpagemod_max_memsz
Tunable               Current Value   Default Value Reboot
--------------------- --------------- ------------- ------
volpagemod_max_memsz  10 MB           6 MB          N

DESCRIPTION:
When tunable values are provided in human readable format, vxtune does not set the tunable to the correct value.

RESOLUTION:
vxtune behavior is rectified to accept and set the correct tunable value when presented in human readable format.

* INCIDENT NO:2643155     TRACKING ID:2607293

SYMPTOM:
The VVR Primary panics while deleting an RVG. Here is the stack trace:

panic_save_regs_switchstack+0x110
panic
bad_news
bubbleup+0x880
rv_ibc_freeze_timeout
invoke_callouts_for_self
soft_intr_handler
external_interrupt
bubbleup+0x880

DESCRIPTION:
The VVR Primary is frozen for a given timeout value in order to send an IBC. If the RVG is deleted before the unfreeze is done or the timeout expires, a panic can result. Due to a bug in the code, the freeze timer is not cleared during RVG deletion. When the freeze timer expires, the callback routine is called and accesses the RVG information; if the RVG has been deleted, accessing it causes the panic.

RESOLUTION:
To fix this issue, the IBC freeze timer is checked for while deleting the RVG, and is unset.

* INCIDENT NO:2643156     TRACKING ID:2610877

SYMPTOM:
vxdg -g set activation= might hang due to a bug in the activation code path when memory allocation fails in the kernel.

DESCRIPTION:
The vxdg activation command is used to set read-write permission at the dg level on each node. If a memory allocation failure occurs in the vxvm kernel path while this command is running, a bug in this code path can cause the command to hang. If this command hangs, it also ends up blocking most vxvm commands.

RESOLUTION:
Code changes are made in the vxvm kernel code path to handle memory allocation failure correctly and to keep retrying memory allocation until it succeeds.
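
NOTE (illustrative only, not part of the patch): For incident 2643154, the expected handling of a human readable value such as 10k can be sketched as below. This is a minimal sketch, assuming 1024-based multipliers and single-letter suffixes. The function name parse_size and the suffix table are hypothetical and do not reflect the actual vxtune implementation.

```python
import re

# Hypothetical suffix table; assumes binary (1024-based) multipliers.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a human readable size such as '10k' or '6 MB' to bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmgt]?)b?\s*", text.lower())
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]


if __name__ == "__main__":
    # The value from the incident, plus the default shown in the output.
    for value in ("10k", "6 MB"):
        print(value, "->", parse_size(value), "bytes")
```

Under these assumptions 10k is read as kilobytes rather than megabytes, which is the distinction the symptom output shows being lost.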

* INCIDENT NO:2644185     TRACKING ID:2649958

SYMPTOM:
vxdmpadm dumped core with the following stack:

#0 0x40bd750:1 in display_dmpnodes_of_redundancy+0x661 ()
#1 0x4092e10:0 in do_getdmpnode+0x400 ()
#2 0x40db040:0 in main+0x1980 ()

DESCRIPTION:
The core dump is caused by a NULL pointer dereference and only occurs if the DMP database is in an inconsistent state.

RESOLUTION:
Added changes in the VxVM code to avoid the NULL pointer dereference in the affected code path.

* INCIDENT NO:2646999     TRACKING ID:1765916

SYMPTOM:
VxVM socket files have unacceptable permissions. The following files are world writable:

srwxrwxrwx root root /etc/vx/vold_diag/socket
srwxrwxrwx root root /etc/vx/vold_inquiry/socket
srwxrwxrwx root root /etc/vx/vold_request/socket

DESCRIPTION:
These sockets are used by the admin/support commands to communicate with vold. The sockets are created by vold during its startup process.

RESOLUTION:
Changed the above files to the following permissions:

srw------- root root /etc/vx/vold_diag/socket
srw-rw-rw- root root /etc/vx/vold_inquiry/socket
srw------- root root /etc/vx/vold_request/socket

The vold_inquiry socket still has world writable permissions because it is used by vxprint-like commands, and non-root users must be able to run vxprint.

* INCIDENT NO:2647120     TRACKING ID:2635476

SYMPTOM:
The DMP (Dynamic Multi Pathing) driver does not automatically enable the failed paths of Logical Units (LUNs) that are restored.

DESCRIPTION:
DMP's restore daemon probes each failed path at a default interval of 5 minutes (tunable) to detect whether that path can be enabled. As part of enabling the path, DMP issues an open() on the path's device number. Owing to a bug in the DMP code, the open() was issued on the wrong device partition, which resulted in failure for every probe. Thus, the path remained in failed status at the DMP layer although it was enabled at the array side.

RESOLUTION:
Modified the DMP restore daemon code path to issue the open() on the appropriate device partitions.
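
NOTE (illustrative only, not part of the patch): The permission change described in incident 2646999 can be checked on a host with a short script. This is a minimal sketch, assuming a Python 3 interpreter is available. The helper names is_world_writable and audit are hypothetical; only the three socket paths come from the incident text.

```python
import os
import stat

# Socket paths as listed in incident 2646999.
VOLD_SOCKETS = [
    "/etc/vx/vold_diag/socket",
    "/etc/vx/vold_inquiry/socket",
    "/etc/vx/vold_request/socket",
]


def is_world_writable(mode):
    """Return True if the 'other' write bit is set in an st_mode value."""
    return bool(mode & stat.S_IWOTH)


def audit(paths):
    """Map each accessible path to (permission string, world-writable flag)."""
    report = {}
    for path in paths:
        try:
            mode = os.lstat(path).st_mode
        except OSError:
            continue  # path absent or not accessible on this host; skip it
        report[path] = (stat.filemode(mode), is_world_writable(mode))
    return report


if __name__ == "__main__":
    for path, (perms, writable) in audit(VOLD_SOCKETS).items():
        print(perms, path, "WORLD-WRITABLE" if writable else "ok")
```

After the patch, only the vold_inquiry socket would be expected to be flagged, in line with the resolution text.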

* INCIDENT NO:2660157     TRACKING ID:2575581

SYMPTOM:
The vxtune -r option prints incorrect tunable values.

# vxtune vol_rvio_maxpool_sz | awk '{print $1"\t"$2}'
Tunable               Current Value
--------------------- ---------------
vol_rvio_maxpool_sz   1048704

# vxtune -r vol_rvio_maxpool_sz | awk '{print $1"\t"$2}'
Tunable               Current Value
--------------------- -------------
vol_rvio_maxpool_sz   1048576

DESCRIPTION:
The vxtune '-r' option, which is used to print tunable values in raw bytes, displays an incorrect value in bytes for some tunables.

RESOLUTION:
vxtune behavior is rectified to print the correct value in bytes for all possible tunable values.

* INCIDENT NO:2666174     TRACKING ID:2666163

SYMPTOM:
A small memory leak may be seen in vxconfigd, the VxVM configuration daemon, when a Serial Split Brain (SSB) error is detected in the import process.

DESCRIPTION:
The leak may occur when a Serial Split Brain (SSB) error is detected in the import process. When a function returns with the SSB error, a memory area dynamically allocated in that function is not freed.

SSB detection is a VxVM feature where VxVM detects whether the configuration copy in the disk private region has become stale unexpectedly. A typical scenario for the SSB error is that a disk group is imported on different systems at the same time and configuration copy updates on both systems result in an inconsistency between the copies. VxVM cannot identify which configuration copy is the most up-to-date in this situation. As a result, VxVM may detect an SSB error on the next import and show the details through a CLI message.

RESOLUTION:
Code changes are made to avoid the memory leak, and a small message fix has also been made.

* INCIDENT NO:2668641     TRACKING ID:2629429

SYMPTOM:
On a SLES10 machine with a multipath root disk, the machine does not come up after the reboot that follows unroot (un-encapsulation of the root disk), because the entries in menu.lst and fstab are not restored and the default boot option remains 'vxvm_root'.

DESCRIPTION:
In the unroot process, the effects of encapsulation are reversed to restore the original configuration. During this process, device name entries are set in fstab and menu.lst. With a multipath root disk, however, the device name can change across reboots, causing the entries in fstab and menu.lst to become invalid.

RESOLUTION:
Instead of device names, the vxunroot command sets symbolic link entries in fstab and menu.lst, which are persistent across reboots.

* INCIDENT NO:2682534     TRACKING ID:2657797

SYMPTOM:
Starting a RAID5 volume fails when one of the sub-disks in the RAID5 column starts at an offset greater than 1TB. Example:

# vxvol -f -g dg1 -o delayrecover start vol1
VxVM vxvol ERROR V-5-1-10128 Unexpected kernel error in configuration update

DESCRIPTION:
VxVM uses an integer variable to store the starting block offset of a sub-disk in a RAID5 column. This variable overflows when a sub-disk is located at an offset greater than 2147483647 blocks (1TB), which results in failure to start the volume. Refer to "sdaj" in the following example:

v  RaidVol    -       DETACHED NEEDSYNC 64459747584 RAID -     raid5
pl RaidVol-01 RaidVol ENABLED  ACTIVE   64459747584 RAID 4/128 RW
[..]
SD NAME            PLEX       DISK         DISKOFFS LENGTH     [COL/]OFF    DEVICE MODE
sd DiskGroup101-01 RaidVol-01 DiskGroup101 0        1953325744 0/0          sdaa   ENA
sd DiskGroup106-01 RaidVol-01 DiskGroup106 0        1953325744 0/1953325744 sdaf   ENA
sd DiskGroup110-01 RaidVol-01 DiskGroup110 0        1953325744 0/3906651488 sdaj   ENA

RESOLUTION:
VxVM code is modified to handle integer overflow conditions for RAID5 volumes.

* INCIDENT NO:2689104     TRACKING ID:2674465

SYMPTOM:
Data corruption is observed when DMP node names are changed by the following commands for DMP devices that are controlled by a third party multi-pathing driver (E.g. MPXIO and PowerPath):

# vxddladm [-c] assign names
# vxddladm assign names file=
# vxddladm set namingscheme=

DESCRIPTION:
The above commands, when executed, re-assign names to each device.
Accordingly, the in-core DMP database should be updated for each device to map the new device name to the appropriate device number. Due to a bug in the code, the mapping of names to device numbers was not done correctly, which resulted in subsequent I/Os going to the wrong device and thus led to data corruption.

RESOLUTION:
The DMP routines responsible for mapping the names to the right device numbers are modified to fix this corruption problem.

* INCIDENT NO:2693078     TRACKING ID:2660151

SYMPTOM:
The following error messages were seen for inactive EMC Mirror View devices when starting vxconfigd. Example:

# vxconfigd
VxVM vxconfigd ERROR V-5-1-13583 read of lvm header blocks for /dev/vx/rdmp/emc_clariion0_148 failed

DESCRIPTION:
vxconfigd reads the disk header to detect the LVM disk format. The read error occurs because an inactive Mirror View device is unreadable.

RESOLUTION:
Code is modified to use media format discovery functions which do not display the error while detecting the LVM format.

INCIDENTS FROM OLD PATCHES:
---------------------------
NONE
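
NOTE (illustrative only, not part of the patch): The overflow described for incident 2682534 can be reproduced arithmetically. This is a minimal sketch, assuming the "integer variable" is a signed 32-bit C int and the block size is 512 bytes; neither is stated explicitly in the incident text, though both are consistent with the quoted limit of 2147483647 blocks (1TB). The helper as_int32 is hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1   # 2147483647, the limit quoted in the incident
BLOCK_SIZE = 512        # assumed block size in bytes


def as_int32(value):
    """Return value as a signed 32-bit C int would hold it (wraps silently)."""
    return ctypes.c_int32(value).value


# Column offsets of the three sub-disks in the incident 2682534 example.
OFFSETS = {"sdaa": 0, "sdaf": 1953325744, "sdaj": 3906651488}

if __name__ == "__main__":
    for device, offset in OFFSETS.items():
        stored = as_int32(offset)
        status = "ok" if stored == offset else "OVERFLOW"
        print(device, offset, stored, status)
    # 2**31 blocks of 512 bytes is exactly 2**40 bytes, the 1TB boundary.
    print((INT32_MAX + 1) * BLOCK_SIZE == 2**40)
```

Only the sdaj offset exceeds the signed 32-bit range under these assumptions, which matches the sub-disk singled out in the description.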