README VERSION : 1.1
README CREATION DATE : 2013-02-01
PATCH-ID : 148490-02
PATCH NAME : VRTSvxvm 6.0.300.0
BASE PACKAGE NAME : VRTSvxvm
BASE PACKAGE VERSION : 6.0.100.0
SUPERSEDED PATCHES : 148490-01
REQUIRED PATCHES : NONE
INCOMPATIBLE PATCHES : NONE
SUPPORTED PADV : sol10_sparc
(P-PLATFORM , A-ARCHITECTURE , D-DISTRIBUTION , V-VERSION)
PATCH CATEGORY : CORE , CORRUPTION , HANG , PANIC , PERFORMANCE
PATCH CRITICALITY : CRITICAL
HAS KERNEL COMPONENT : YES
ID : NONE
REBOOT REQUIRED : YES
REQUIRE APPLICATION DOWNTIME : YES

PATCH INSTALLATION INSTRUCTIONS:
--------------------------------
Please refer to Release Notes for install instructions

PATCH UNINSTALLATION INSTRUCTIONS:
----------------------------------
Please refer to Release Notes for uninstall instructions

SPECIAL INSTRUCTIONS:
---------------------
NONE

SUMMARY OF FIXED ISSUES:
-----------------------------------------
PATCH ID:148490-02

2853712 (2815517) vxdg adddisk allows mixing of clone & non-clone disks in a DiskGroup.
2863672 (2834046) NFS migration failed due to device reminoring.
2863708 (2836528) Unable to grow LUN dynamically on Solaris x86 using "vxdisk resize" command.
2892571 (1856733) Support for FusionIO on Solaris x64
2892590 (2779580) Secondary node gives configuration error (no Primary RVG) after reboot of master node on Primary site.
2892682 (2837717) "vxdisk(1M) resize" command fails if 'da name' is specified.
2892684 (1859018) "link detached from volume" warnings are displayed when a linked-breakoff snapshot is created
2892698 (2851085) DMP doesn't detect implicit LUN ownership changes for some of the dmpnodes
2892716 (2753954) When a cable is disconnected from one port of a dual-port FC HBA, the paths via another port are marked as SUSPECT PATH.
2940447 (2940446) Full fsck hangs on I/O in VxVM when cache object size is very large
2941167 (2915751) Solaris machine panics during dynamic lun expansion of a CDS disk.
2941193 (1982965) vxdg import fails if da-name is based on naming scheme which is different from the prevailing naming scheme on the host
2941226 (2915063) Rebooting VIS array having mirror volumes, master node panicked and other nodes CVM FAULTED
2941234 (2899173) vxconfigd hang after executing command "vradmin stoprep"
2941237 (2919318) The I/O fencing key value of data disk are different and abnormal in a VCS cluster with I/O fencing.
2941252 (1973983) vxunreloc fails when dco plex is in DISABLED state
2944708 (1725593) The 'vxdmpadm listctlr' command has to be enhanced to print the count of device paths seen through the controller.
2944710 (2744004) vxconfigd is hung on the VVR secondary node during VVR configuration.
2944714 (2833498) vxconfigd hangs while reclaim operation is in progress on volumes having instant snapshots
2944717 (2851403) System panic is seen while unloading "vxio" module. This happens whenever VxVM uses SmartMove feature and the "vxportal" module gets reloaded (for e.g. during VxFS package upgrade)
2944722 (2869594) Master node panics due to corruption if space optimized snapshots are refreshed and 'vxclustadm setmaster' is used to select master.
2944724 (2892983) vxvol dumps core if new links are added while the operation is in progress.
2944725 (2910043) Avoid order 8 allocation by vxconfigd in node reconfig.
2944727 (2919720) vxconfigd core in rec_lock1_5()
2944729 (2933138) panic in voldco_update_itemq_chunk() due to accessing invalid buffer
2944741 (2866059) Improving error messages hit during vxdisk resize operation
2962257 (2898547) vradmind on VVR Secondary Site dumps core, when Logowner Service Group on VVR (Veritas Volume Replicator) Primary Site is shuffled across its CVM (Clustered Volume Manager) nodes.
2964567 (2964547) About DMP message - cannot load module 'misc/ted'.
2974870 (2935771) In VVR environment, RLINK disconnects after master switch.
2976946 (2919714) On a THIN lun, vxevac returns 0 without migrating unmounted VxFS volumes.
2976956 (1289985) vxconfigd core dumps upon running "vxdctl enable" command
2976974 (2875962) During the upgrade of VRTSaslapm package, a conflict is encountered with VRTSvxvm package because an APM binary is included in VRTSvxvm package which is already installed
2978189 (2948172) Executing "vxdisk -o thin,fssize list" command can result in panic.
2979767 (2798673) System panics in voldco_alloc_layout() while creating volume with instant DCO
2983679 (2970368) Enhance handling of SRDF-R2 Write-Disabled devices in DMP.
3004823 (2692012) vxevac move error message needs to be enhanced to be less generic and give clear message for failure.
3004852 (2886333) vxdg join command should not allow mixing clone & non-clone disks in a DiskGroup
3005921 (1901838) Incorrect setting of Nolicense flag can lead to dmp database inconsistency.
3006262 (2715129) Vxconfigd hangs during Master takeover in a CVM (Clustered Volume Manager) environment.
3011391 (2965910) Volume creation with vxassist using "-o ordered alloc=<disk-class>" dumps core.
3011444 (2398416) vxassist dumps core while creating volume after adding attribute "wantmirror=ctlr" in default vxassist rulefile
3020087 (2619600) Live migration of virtual machine having SFHA/SFCFSHA stack with data disks fencing enabled, causes service groups configured on virtual machine to fault.
3025973 (3002770) Accessing NULL pointer in dmp_aa_recv_inquiry() caused system panic.
3026288 (2962262) Uninstall of dmp fails in presence of other multipathing solutions
3027482 (2273190) Incorrect setting of UNDISCOVERED flag can lead to database inconsistency

PATCH ID:148490-01

2860207 (2859470) EMC SRDF (Symmetrix Remote Data Facility) R2 disk with EFI label is not recognized by VxVM (Veritas Volume Manager) and is shown in error state.
2876865 (2510928) The extended attributes reported by "vxdisk -e list" for the EMC SRDF luns are reported as "tdev mirror", instead of "tdev srdf-r1".
2892499 (2149922) Record the diskgroup import and deport events in syslog
2892621 (1903700) Removing mirror using vxassist does not work.
2892630 (2742706) Panic due to mutex not being released in vxlo_open
2892643 (2801962) Growing a volume takes significantly large time when the volume has version 20 DCO attached to it.
2892650 (2826125) VxVM script daemon is terminated abnormally on its invocation.
2892660 (2000585) vxrecover doesn't start remaining volumes if one of the volumes is removed during vxrecover command run.
2892665 (2807158) On Solaris platform, sometimes system can hang during VM upgrade or patch installation.
2892689 (2836798) In VxVM, resizing simple EFI disk fails and causes system panic/hang.
2892702 (2567618) VRTSexplorer coredumps in checkhbaapi/print_target_map_entry.
2922770 (2866997) VxVM Disk initialization fails as an un-initialized variable gets an unexpected value after OS patch installation.
2922798 (2878876) vxconfigd dumps core in vol_cbr_dolog() due to race between two threads processing requests from the same client.
2924117 (2911040) Restore from a cascaded snapshot leaves the volume in unusable state if any cascaded snapshot is in detached state.
2924188 (2858853) After master switch, vxconfigd dumps core on old master.
2924207 (2886402) When re-configuring devices, vxconfigd hang is observed.
2930399 (2930396) The vxdmpasm command (in 5.1SP1 release) and the vxdmpraw command (in 6.0 release) do not work on Solaris platform.
2933467 (2907823) If the user removes the lun at the storage layer and at VxVM layer beforehand, DMP DR tool is unable to clean-up cfgadm (leadville) stack.
2933468 (2916094) Enhancements have been made to the Dynamic Reconfiguration Tool(DR Tool) to create a separate log file every time DR Tool is started, display a message if a command takes longer time, and not to list the devices controlled by TPD (Third Party Driver) in 'Remove Luns' option of DR Tool.
2933469 (2919627) Dynamic Reconfiguration tool should be enhanced to remove LUNs feasibly in bulk.
2934259 (2930569) The LUNs in 'error' state in output of 'vxdisk list' cannot be removed through DR(Dynamic Reconfiguration) Tool.
2942166 (2942609) Message displayed when user quits from Dynamic Reconfiguration Operations is shown as error message.

SUMMARY OF KNOWN ISSUES:
-----------------------------------------
2949012 (2951032) DR Tool is not supported in Solaris x86 machine.
3037620 (2979786) The SCSI registration keys are not removed if VCS engine is stopped for the second time.
3049356 (3060327) As part of setting up replication, while doing initial synchronization, 'vradmin repstatus' command shows "DCM (contains 0 Kbytes) (autosync)".

KNOWN ISSUES :
--------------

* INCIDENT NO::2949012 TRACKING ID ::2951032

SYMPTOM::
Because the Dynamic Reconfiguration (DR) script does not include i386 in its list of supported architectures, the pre-check for the DR Tool fails.

WORKAROUND::
NONE

* INCIDENT NO::3037620 TRACKING ID ::2979786

SYMPTOM::
If the VCS engine is stopped for the first time, the SCSI registration keys are removed. But if the VCS engine is stopped for the second time, the keys are not removed.

WORKAROUND::
None

* INCIDENT NO::3049356 TRACKING ID ::3060327

SYMPTOM::
As part of setting up replication, while doing initial synchronization, 'vradmin repstatus' shows incorrect status of DCM.
Output looks like:

root@hostname#vradmin -g dg1 repstatus rvg
Replicated Data Set: rvg
Primary:
  Host name:
  RVG name:                   rvg
  DG name:                    dg1
  RVG state:                  enabled for I/O
  Data volumes:               1
  VSets:                      0
  SRL name:                   srl
  SRL size:                   1.00 G
  Total secondaries:          1
Secondary:
  Host name:
  RVG name:                   rvg
  DG name:                    dg1
  Data status:                inconsistent
  Replication status:         resync in progress (smartsync autosync)
  Current mode:               asynchronous
  Logging to:                 DCM (contains 0 Kbytes) (autosync)
  Timestamp Information:      N/A

The issue is specific to configurations in which primary data volumes have VxFS mounted.

WORKAROUND::
Use the 'vxrlink status' command to view the number of remaining bytes.

NONE

FIXED INCIDENTS:
----------------

PATCH ID:148490-02

* INCIDENT NO:2853712 TRACKING ID:2815517

SYMPTOM:
vxdg adddisk succeeds in adding a clone disk to a non-clone diskgroup and a non-clone disk to a clone diskgroup, resulting in a mixed diskgroup.

DESCRIPTION:
vxdg import fails for a diskgroup that has a mix of clone and non-clone disks. So vxdg adddisk should not allow creation of a mixed diskgroup.

RESOLUTION:
The vxdg adddisk code is modified to return an error for an attempt to add a clone disk to a non-clone diskgroup or a non-clone disk to a clone diskgroup. This prevents any disk addition that would lead to a mixed diskgroup.

* INCIDENT NO:2863672 TRACKING ID:2834046

SYMPTOM:
VxVM dynamically reminors all the volumes during DG import if the DG base minor numbers are not in the correct pool. This behaviour forces NFS clients to re-mount all NFS file systems in an environment where CVM is used on the NFS server side.

DESCRIPTION:
Starting from 5.1, the minor number space is divided into two pools, one for private disk groups and another for shared disk groups. During DG import, the DG base minor numbers are adjusted automatically if they are not in the correct pool, and so are the minor numbers of the volumes in the disk groups. This behaviour avoids many minor number conflicts during DG import. But in an NFS environment, it makes all file handles on the client side stale.
Customers had to unmount file systems and restart applications.

RESOLUTION:
A new tunable, "autoreminor", is introduced. The default value is "on". Most customers do not need to change this setting and can leave it as it is. In an environment where auto-reminoring is not desirable, customers can turn it off. Another major change is that during DG import, VxVM does not change minor numbers as long as there are no minor number conflicts. This includes the cases in which minor numbers are in the wrong pool.

* INCIDENT NO:2863708 TRACKING ID:2836528

SYMPTOM:
vxdisk resize fails with the error "New geometry makes partition unaligned":

bash# vxdisk -g testdg resize disk01 length=8g
VxVM vxdisk ERROR V-5-1-8643 Device disk01: resize failed: New geometry makes partition unaligned

DESCRIPTION:
On Solaris x86 systems, partition 8 does not need to be aligned with the cylinder size. However, VxVM required this partition to be cylinder aligned, which caused the failure.

RESOLUTION:
The alignment check is now skipped for partition 8 on the Solaris x86 platform.

* INCIDENT NO:2892571 TRACKING ID:1856733

SYMPTOM:
Add support for FusionIO on Solaris x64

DESCRIPTION:
FusionIO was not previously supported on the Solaris x64 platform.

RESOLUTION:
Support for FusionIO is added for Solaris x64.

* INCIDENT NO:2892590 TRACKING ID:2779580

SYMPTOM:
Secondary node gives configuration error 'no Primary RVG' when the primary master node (default logowner) is rebooted and a slave becomes the new master.

DESCRIPTION:
After reboot of the primary master, the new master sends a handshake request for vradmind communication to the secondary. As part of the handshake request, the secondary deletes the old configuration, including the primary RVG. During this phase, the secondary receives a configuration update message from the primary for the old configuration. The secondary does not find the old primary RVG configuration for processing this message.
Hence, it cannot proceed with the pending handshake request and gives the 'no Primary RVG' configuration error.

RESOLUTION:
Code changes are done such that during the handshake request phase, configuration messages of the old primary RVG are discarded.

* INCIDENT NO:2892682 TRACKING ID:2837717

SYMPTOM:
"vxdisk(1M) resize" command fails if 'da name' is specified.

DESCRIPTION:
The scenario for 'da name' is not handled in the resize code path.

RESOLUTION:
The code is modified such that if 'dm name' is not specified to resize, then the 'da name' specific operation is performed.

* INCIDENT NO:2892684 TRACKING ID:1859018

SYMPTOM:
"Link link detached from volume " warnings are displayed when a linked-breakoff snapshot is created.

DESCRIPTION:
The purpose of these messages is to let users and administrators know that a link was detached due to I/O errors. These messages are displayed unnecessarily whenever a linked-breakoff snapshot is created.

RESOLUTION:
Code changes are made to display the messages only when a link is detached due to I/O errors on volumes involved in the link relationship.

* INCIDENT NO:2892698 TRACKING ID:2851085

SYMPTOM:
DMP doesn't detect implicit LUN ownership changes

DESCRIPTION:
DMP does ownership monitoring for ALUA arrays to detect implicit LUN ownership changes. This helps DMP to always use the Active/Optimized path for sending down I/O. This feature is controlled using the dmp_monitor_ownership tunable and is enabled by default. In case of partial discovery triggered through the event source daemon (vxesd), the ALUA information kept in the kernel data structure for ownership monitoring was getting wiped. This causes ownership monitoring to not work for these dmpnodes.

RESOLUTION:
Source has been updated to handle this case.

* INCIDENT NO:2892716 TRACKING ID:2753954

SYMPTOM:
When a cable is disconnected from one port of a dual-port FC HBA, only paths going through that port should be marked as SUSPECT. But paths going through the other port are also getting marked as SUSPECT.

DESCRIPTION:
Disconnection of a cable from an HBA port generates an FC event. When the event is generated, paths of all ports of the corresponding HBA are marked as SUSPECT.

RESOLUTION:
The code changes are done to mark only the paths going through the port on which the FC event is generated.

* INCIDENT NO:2940447 TRACKING ID:2940446

SYMPTOM:
I/O can hang on a volume with a space optimized snapshot if the underlying cache object is of very large size. It can also lead to data corruption in the cache object.

DESCRIPTION:
The cache volume maintains a B+ tree for mapping an offset to its actual location in the cache object. Copy-on-write I/O generated on snapshot volumes needs to determine the offset of a particular I/O in the cache object. Due to incorrect type-casting, the value calculated for a large offset overflows and is truncated to a smaller value, leading to data corruption.

RESOLUTION:
Code changes are done to avoid overflow during offset calculation in the cache object.

* INCIDENT NO:2941167 TRACKING ID:2915751

SYMPTOM:
Solaris machine panics while resizing a CDS-EFI LUN, or in the CDS VTOC to EFI conversion case where the new size is greater than 1TB.

DESCRIPTION:
While resizing a disk having CDS-EFI format or while resizing a CDS disk from less than 1TB to >= 1TB, the machine panics because of the incorrect use of device numbers. VxVM uses the whole slice number s0 instead of s7, which represents the whole device for EFI format. Hence, the device open fails and an incorrect disk maxiosize is populated. While doing an I/O, the machine panics with a divide by zero error.

RESOLUTION:
While resizing a disk having CDS-EFI format or while resizing a CDS disk from less than 1TB to >= 1TB, VxVM now correctly uses the device number corresponding to partition 7 of the device.

* INCIDENT NO:2941193 TRACKING ID:1982965

SYMPTOM:
"vxdg import DGNAME " fails when the "da-name" used as an input to the vxdg command is based on a naming scheme which is different from the prevailing naming scheme on the host.
Error message seen is:

VxVM vxdg ERROR V-5-1-530 Device c6t50060E801002BC73d240 not found in configuration
VxVM vxdg ERROR V-5-1-10978 Disk group x86dg: import failed: Not a disk access record

DESCRIPTION:
vxconfigd stores Disk Access (DA) records based on DMP names. If "vxdg" passes a name other than the DMP name for the device, vxconfigd cannot map it to a DA record. As vxconfigd cannot locate a DA record corresponding to the input name passed from vxdg, it fails the import operation.

RESOLUTION:
The vxdg command now converts the input name to the DMP name before passing it to vxconfigd for further processing.

* INCIDENT NO:2941226 TRACKING ID:2915063

SYMPTOM:
System panics with the following stack while detaching a plex of a volume in a CVM environment.

vol_klog_findent()
vol_klog_detach()
vol_mvcvm_cdetsio_callback()
vol_klog_start()
voliod_iohandle()
voliod_loop()

DESCRIPTION:
During a plex-detach operation, VxVM searches the kernel for the plex object to be detached. If a transaction is in progress on any diskgroup in the system, an incorrect plex object is sometimes selected, which results in the dereference of an invalid address and panics the system.

RESOLUTION:
Code changes are done to make sure that the correct plex object is selected.

* INCIDENT NO:2941234 TRACKING ID:2899173

SYMPTOM:
In a CVR environment, SRL failure may result in a vxconfigd hang and eventually in a 'vradmin stoprep' command hang.

DESCRIPTION:
The 'vradmin stoprep' command hangs because vxconfigd is waiting indefinitely in a transaction. The transaction was waiting for I/O completion on the SRL. An error handler is generated to handle I/O failure on the SRL, but if a transaction is in progress, this error was not handled properly, resulting in a transaction hang.

RESOLUTION:
A fix is provided such that when an SRL failure is encountered, the transaction itself handles the I/O error on the SRL.
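
The name conversion described for incident 2941193 above can be pictured as a lookup that the client performs before it sends the request. The sketch below is only an illustration under stated assumptions: the alias table, the function name to_dmp_name, and the DMP name emc0_0240 are hypothetical and are not taken from the vxdg source.

```python
# Hypothetical alias table: every name a device may be known by maps to the
# DMP name under which its DA record is stored. The entries are illustrative.
ALIASES = {
    "c6t50060E801002BC73d240": "emc0_0240",  # OS-native name -> DMP name (assumed)
    "emc0_0240": "emc0_0240",                # a DMP name maps to itself
}


def to_dmp_name(input_name, aliases=ALIASES):
    """Return the DMP name for input_name, or raise if the device is unknown."""
    try:
        return aliases[input_name]
    except KeyError:
        raise LookupError("Device %s not found in configuration" % input_name)
```

With this shape, a name from either naming scheme resolves to the same record key, and only a genuinely unknown device produces the "not found" failure.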
* INCIDENT NO:2941237 TRACKING ID:2919318

SYMPTOM:
In a CVM environment with fencing enabled, wrong fencing keys are registered for opaque disks during node join or dg import operations.

DESCRIPTION:
During the cvm node join and shared dg import code paths, when opaque disk registration happens, the fencing keys in the internal dg records are not in sync with the actual keys generated. This caused wrong fencing keys to be registered for opaque disks. For the remaining disks, fencing key registration happens correctly.

RESOLUTION:
The fix is to copy the correctly generated key to the internal dg record for the current dg import/node join scenario and use it for disk registration.

* INCIDENT NO:2941252 TRACKING ID:1973983

SYMPTOM:
Relocation fails with the following error when a DCO (data change object) plex is in disabled state.

VxVM vxrelocd ERROR V-5-2-600 Failure recovering in disk group

DESCRIPTION:
When a mirror plex is added to a volume using "vxassist snapstart", the attached DCO plex can be in DISABLED/DCOSNP state. While recovering such DCO plexes, if the enclosure is disabled, the plex can get into DETACHED/DCOSNP state and relocation fails.

RESOLUTION:
Code changes are made to handle DCO plexes in disabled state in relocation.

* INCIDENT NO:2944708 TRACKING ID:1725593

SYMPTOM:
The 'vxdmpadm listctlr' command does not show the count of device paths seen through each controller.

DESCRIPTION:
The 'vxdmpadm listctlr' command did not show the number of device paths seen through each controller. The CLI option has been enhanced to provide this information as an additional column at the end of each line in the CLI's output.

RESOLUTION:
The number of paths under each controller is counted and the value is displayed as the last column in the 'vxdmpadm listctlr' CLI output.

* INCIDENT NO:2944710 TRACKING ID:2744004

SYMPTOM:
When VVR is configured, vxconfigd on the secondary gets hung. Any vx commands issued during this time do not complete.

DESCRIPTION:
Vxconfigd is waiting for IOs to drain before allowing a configuration change command to proceed.
The IOs never drain completely, resulting in the hang. This is because there is a deadlock where pending IOs are unable to start and vxconfigd keeps waiting for their completion.

RESOLUTION:
Changed the code so that this deadlock does not arise. The IOs can be started properly and complete, allowing vxconfigd to function properly.

* INCIDENT NO:2944714 TRACKING ID:2833498

SYMPTOM:
vxconfigd daemon hangs in vol_ktrans_commit() while a reclaim operation is in progress on volumes having instant snapshots. Stack trace is given below:

vol_ktrans_commit
volconfig_ioctl

DESCRIPTION:
Storage reclaim leads to the generation of special IOs (termed Reclaim IOs), which can be very large in size (>4G) and, unlike application IOs, are not broken into smaller sized IOs. Reclaim IOs need to be tracked in snapshot maps if the volume has full snapshots configured. The mechanism to track reclaim IOs is not capable of handling such large IOs, causing the hang.

RESOLUTION:
Code changes are made to use the alternative mechanism in Volume Manager to track the reclaim IOs.

* INCIDENT NO:2944717 TRACKING ID:2851403

SYMPTOM:
System panics while unloading the 'vxio' module when the VxVM SmartMove feature is used and the "vxportal" module gets reloaded (e.g. during VxFS package upgrade). Stack trace looks like:

vxportalclose()
vxfs_close_portal()
vol_sr_unload()
vol_unload()

DESCRIPTION:
During a smart-move operation like plex attach, VxVM opens the 'vxportal' module to read in-use file system maps information. This file descriptor gets closed only when the 'vxio' module is unloaded. If the 'vxportal' module is unloaded and reloaded before 'vxio', the file descriptor held by 'vxio' becomes invalid and results in a panic.

RESOLUTION:
Code changes are made to close the file descriptor for 'vxportal' after reading free/invalid file system map information. This ensures that stale file descriptors don't get used for 'vxportal'.
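
The resolution for incident 2944717 amounts to scoping the descriptor to the read instead of holding it until the module unloads. The following is a user-space analogy only, with an ordinary file standing in for the 'vxportal' device; the function name read_fs_map is hypothetical and is not a VxVM interface.

```python
import os


def read_fs_map(path):
    """Open the map source, read it fully, and close the descriptor before returning.

    No descriptor outlives the call, so replacing the file afterwards (the
    analogue of reloading the module) cannot leave a stale handle behind.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)  # always released, even if a read fails
```

Each call reopens the source, so a caller that reads again after the source has been replaced sees the new contents rather than operating on an invalid descriptor.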
* INCIDENT NO:2944722 TRACKING ID:2869594

SYMPTOM:
Master node would panic with the following stack after a space optimized snapshot is refreshed or deleted and the master node is selected using 'vxclustadm setmaster'

volilock_rm_from_ils
vol_cvol_unilock
vol_cvol_bplus_walk
vol_cvol_rw_start
voliod_iohandle
voliod_loop
thread_start

In addition to this, all space optimized snapshots on the corresponding cache object may be corrupted.

DESCRIPTION:
In CVM, the master node owns the responsibility of maintaining the cache object indexing structure for providing space optimized functionality. When a space optimized snapshot is refreshed or deleted, the indexing structure is rebuilt in the background after the operation returns. When the master node is switched using 'vxclustadm setmaster' before the index rebuild is complete, both the old master and the new master rebuild the index in parallel, which results in index corruption. Since the index is corrupted, the data stored on space optimized snapshots should not be trusted. I/Os issued on the corrupted index lead to a panic.

RESOLUTION:
When the master role is switched using 'vxclustadm setmaster', the index rebuild on the old master node is safely aborted. Only the new master node is allowed to rebuild the index.

* INCIDENT NO:2944724 TRACKING ID:2892983

SYMPTOM:
vxvol command dumps core with the following stack trace if executed in parallel with the vxsnap addmir command

strcmp()
do_link_recovery
trans_resync_phase1()
vxvmutil_trans()
trans()
common_start_resync()
do_noderecover()
main()

DESCRIPTION:
If vxrecover is triggered during creation of a link between two volumes, the vxvol command may not have information about the newly created links. This leads to a NULL pointer dereference and a core dump.

RESOLUTION:
The code has been modified to check whether links information is properly present with the vxvol command and to fail the operation with an appropriate error message.
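
The check described in the resolution for incident 2944724 can be sketched as validating each link record before its name is compared. This is a Python model of the idea, not the vxvol code: find_link, LinkInfoMissing, and the error text are hypothetical, and None stands in for a link whose information has not been populated.

```python
class LinkInfoMissing(Exception):
    """Raised when a link record has not been populated yet (hypothetical)."""


def find_link(links, wanted_name):
    """Return the index of the link named wanted_name, or -1 if it is absent.

    links is a list of link names; a None entry models a link that was
    created after the caller read the configuration.
    """
    for index, name in enumerate(links):
        if name is None:
            # Fail the operation cleanly instead of comparing missing data.
            raise LinkInfoMissing("link information unavailable; retry the operation")
        if name == wanted_name:
            return index
    return -1
```

The point of the explicit check is that the caller receives an error it can report, rather than crashing inside the string comparison.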
* INCIDENT NO:2944725 TRACKING ID:2910043

SYMPTOM:
Frequent swapin/swapout seen due to higher order memory requests

DESCRIPTION:
VxVM operations such as plex attach and snapshot resync/reattach issue ATOMIC_COPY ioctls. The default I/O size for these operations is 1MB, and VxVM allocates this memory from the operating system. Memory allocations of such large size can result in swapin/swapout of pages and are not very efficient. When many such operations are in progress, the system may not work very efficiently.

RESOLUTION:
VxVM has its own I/O memory management module, which allocates pages from the operating system and manages them efficiently. The ATOMIC_COPY code is modified to make use of VxVM's internal I/O memory pool instead of directly allocating memory from the operating system.

* INCIDENT NO:2944727 TRACKING ID:2919720

SYMPTOM:
vxconfigd dumps core in rec_lock1_5() function.

rec_lock1_5()
rec_lock1()
rec_lock()
client_trans_start()
req_vol_trans()
request_loop()
main()

DESCRIPTION:
During any configuration change in VxVM, vxconfigd locks all objects involved in the operation to avoid any unexpected modification. Some objects which do not belong to the context of the current transaction are not handled properly, which results in a core dump. This case is particularly seen during snapshot operations on cross-dg linked volume snapshots.

RESOLUTION:
Code changes are done to avoid locking of records which are not yet part of the committed VxVM configuration.

* INCIDENT NO:2944729 TRACKING ID:2933138

SYMPTOM:
System panics with stack trace given below:

voldco_update_itemq_chunk()
voldco_chunk_updatesio_start()
voliod_iohandle()
voliod_loop()

DESCRIPTION:
While tracking IOs in snapshot MAPs, information is stored in in-memory pages. For large sized IOs (such as reclaim IOs), this information can span multiple pages. Sometimes the pages are not properly referenced in the MAP update for IOs of larger size, which leads to a panic because of invalid page addresses.

RESOLUTION:
Code is modified to properly reference pages during MAP update for large sized IOs.

* INCIDENT NO:2944741 TRACKING ID:2866059

SYMPTOM:
When disk resize fails, the following messages can appear on screen:

1. "VxVM vxdisk ERROR V-5-1-8643 Device : resize failed: One or more subdisks do not fit in pub reg"
or
2. "VxVM vxdisk ERROR V-5-1-8643 Device : resize failed: Cannot remove last disk in disk group"

DESCRIPTION:
The first message should provide extra information, such as which subdisk is under consideration and what the subdisk and public region lengths are. After vxdisk resize fails with the second message, the resize operation succeeds if the -f (force) option is used. This message can be improved by suggesting that the user use the -f (force) option for resizing.

RESOLUTION:
Code changes are made to improve the error messages.

* INCIDENT NO:2962257 TRACKING ID:2898547

SYMPTOM:
vradmind dumps core on the VVR (Veritas Volume Replicator) Secondary site in a CVR (Clustered Volume Replicator) environment. Stack trace would look like:

__kernel_vsyscall
raise
abort
fmemopen
malloc_consolidate
delete
delete[]
IpmHandle::~IpmHandle
IpmHandle::events
main

DESCRIPTION:
When the Logowner Service Group is moved across nodes on the Primary Site, it induces deletion of the IpmHandle of the old Logowner Node, as the IpmHandle of the new Logowner Node gets created. During destruction of the IpmHandle object, a pointer '_cur_rbufp' is not set to NULL, which can lead to freeing memory that has already been freed, thus causing 'vradmind' to dump core.

RESOLUTION:
Destructor of IpmHandle is modified to set the pointer to NULL after it is deleted.
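
The pattern behind the fix for incident 2962257 (clear the pointer once its memory has been released) can be modelled outside C++ as well. The sketch below is a toy Python model under stated assumptions: the Allocator and Handle classes are hypothetical, the allocator merely imitates the rule that a buffer must not be freed twice, and only the attribute name cur_rbufp echoes the pointer named in the description.

```python
class Allocator:
    """Toy allocator that, like free(), must not release the same buffer twice."""

    def __init__(self):
        self.live = set()
        self.next_id = 0

    def alloc(self):
        self.next_id += 1
        self.live.add(self.next_id)
        return self.next_id

    def free(self, buf):
        if buf not in self.live:
            raise RuntimeError("double free of buffer %d" % buf)
        self.live.remove(buf)


class Handle:
    """Stand-in for an object that owns one receive buffer (hypothetical)."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.cur_rbufp = allocator.alloc()

    def release(self):
        # Free only a buffer that is still owned, then clear the reference so
        # a second call, or a later cleanup path, finds nothing left to free.
        if self.cur_rbufp is not None:
            self.allocator.free(self.cur_rbufp)
            self.cur_rbufp = None
```

Without the assignment to None, a second release() would hand the same buffer back to the allocator and trigger the double-free error that this model raises explicitly.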
* INCIDENT NO:2964567 TRACKING ID:2964547

SYMPTOM:
Whenever the system reboots, the messages below are logged on the system console:

Oct 10 19:10:01 sol11_server unix: [ID 779321 kern.notice] vxdmp: unable to resolve dependency,
Oct 10 19:10:01 sol11_server unix: [ID 969242 kern.notice] cannot load module 'misc/ted'

DESCRIPTION:
Module 'misc/ted' is part of the debug package. It was wrongly getting linked with the vxdmp driver for non-debug builds. These messages are harmless.

RESOLUTION:
Source makefile was modified to remove this dependency for non-debug packages.

* INCIDENT NO:2974870 TRACKING ID:2935771

SYMPTOM:
Rlinks disconnect after switching the master.

DESCRIPTION:
Sometimes switching a master on the primary can cause the Rlinks to disconnect. vradmin repstatus would show "paused due to network disconnection" as the replication status. VVR uses a connection to check if the secondary is alive. The secondary responds to these requests by replying back, indicating that it is alive. On a master switch, the old master fails to close this connection with the secondary. Thus, after the master switch, the old master as well as the new master send requests to the secondary. This causes a mismatch of connection numbers on the secondary, and the secondary does not reply to the requests of the new master. This causes the Rlinks to disconnect.

RESOLUTION:
The solution is to close the connection of the old master with the secondary, so that it does not keep sending connection requests to the secondary.

* INCIDENT NO:2976946 TRACKING ID:2919714

SYMPTOM:
On a THIN lun, vxevac returns 0 without migrating unmounted VxFS volumes. The following error messages are displayed when an unmounted VxFS volume is processed:

VxVM vxsd ERROR V-5-1-14671 Volume v2 is configured on THIN luns and not mounted. Use 'force' option, to bypass smartmove. To take advantage of smartmove for supporting thin luns, retry this operation after mounting the volume.
VxVM vxsd ERROR V-5-1-407 Attempting to cleanup after failure ...

DESCRIPTION:
On a THIN lun, VM will not move or copy data on unmounted VxFS volumes unless smartmove is bypassed. The vxevac command needs to be enhanced to detect unmounted VxFS volumes on THIN luns and to support a force option that allows the user to bypass smartmove.

RESOLUTION:
The vxevac script has been modified to check for unmounted VxFS volumes on THIN luns prior to performing the migration. If an unmounted VxFS volume is detected, the command fails with a non-zero return code and displays a message notifying the user to mount the volumes or bypass smartmove by specifying the force option:

VxVM vxevac ERROR V-5-2-0 The following VxFS volume(s) are configured on THIN luns and not mounted:

v2

To take advantage of smartmove support on thin luns, retry this operation after mounting the volume(s). Otherwise, bypass smartmove by specifying the '-f' force option.

* INCIDENT NO:2976956 TRACKING ID:1289985

SYMPTOM:
vxconfigd core dumps upon running the "vxdctl enable" command, as vxconfigd does not check the status value returned by the device when it sends a SCSI mode sense command to the device.

DESCRIPTION:
vxconfigd sends a SCSI mode sense command to the device to obtain device information, but it only checks the return value of ioctl(). The return value of ioctl() only indicates whether there was an error while sending the command to the target device. Vxconfigd should also check the value of the SCSI status byte returned by the device to get the real status of SCSI command execution.

RESOLUTION:
The code has been changed to check the value of the SCSI status byte returned by the device and to take appropriate action if the status value is nonzero.
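
The two-level check described for incident 2976956 can be sketched as follows. This is a minimal illustration, not the vxconfigd code: the function name mode_sense_succeeded and its parameters are hypothetical, while the status byte values are the standard SCSI status codes.

```python
# Standard SCSI status byte values.
SCSI_GOOD = 0x00
SCSI_CHECK_CONDITION = 0x02
SCSI_BUSY = 0x08
SCSI_RESERVATION_CONFLICT = 0x18


def mode_sense_succeeded(ioctl_rc, scsi_status):
    """Return True only if the command was delivered AND the device accepted it.

    ioctl_rc    -- return value of the ioctl() that carried the command
                   (0 means the command was sent to the target)
    scsi_status -- status byte reported by the device for that command
    """
    if ioctl_rc != 0:
        return False  # transport failure: the command was not sent successfully
    # Delivered, but the device may still have rejected the command.
    return scsi_status == SCSI_GOOD
```

A caller that looks only at ioctl_rc would treat a CHECK CONDITION reply as success and go on to parse a response buffer that the device never filled in.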
* INCIDENT NO:2976974 TRACKING ID:2875962 SYMPTOM: When an upgrade install is performed from VxVM 5.0MPx to VxVM 5.1 (and higher), the installation script may give the following message: The following files are already installed on the system and are being used by another package: /usr/lib/vxvm/root/kernel/drv/vxapm/dmpsvc.SunOS_5.10 Do you want to install these conflicting files [y,n,?,q] DESCRIPTION: A VxVM 5.0MPx patch incorrectly packaged the IBM SanVC APM with a VxVM patch, which was subsequently corrected in a later patch. Any upgrade performed from that 5.0MPx patch to 5.1 or higher will result in this packaging message. RESOLUTION: Added code to the packaging script of the VxVM package to remove the APM files so that the conflict between the VRTSaslapm and VRTSvxvm packages is resolved. * INCIDENT NO:2978189 TRACKING ID:2948172 SYMPTOM: Execution of the command "vxdisk -o thin,fssize list" can cause a hang or panic. Hang stack trace might look like: pse_block_thread pse_sleep_thread .hkey_legacy_gate volsiowait vol_objioctl vol_object_ioctl voliod_ioctl volsioctl_real volsioctl Panic stack trace might look like: voldco_breakup_write_extents volfmr_breakup_extents vol_mv_indirect_write_start volkcontext_process volsiowait vol_objioctl vol_object_ioctl voliod_ioctl volsioctl_real vols_ioctl vols_compat_ioctl compat_sys_ioctl sysenter_dispatch DESCRIPTION: The command "vxdisk -o thin,fssize list" triggers reclaim I/Os to get file system usage from the Veritas File System on mounted Veritas Volume Manager volumes. Reclamation is currently not supported on volumes with space optimized (SO) snapshots. But because of a bug, reclaim I/Os continue to execute for volumes with SO snapshots, leading to a system panic or hang. RESOLUTION: Code changes are made to not allow reclamation I/Os to proceed on volumes with SO snapshots.
* INCIDENT NO:2979767 TRACKING ID:2798673 SYMPTOM: System panic is observed with the stack trace given below: voldco_alloc_layout voldco_toc_updatesio_done voliod_iohandle voliod_loop DESCRIPTION: The DCO (data change object) contains metadata information required to start the DCO volume and decode further information from the DCO volume. This information is stored in the first block of the DCO volume. If this metadata information is incorrect or corrupted, further processing of the volume start results in a panic due to a divide-by-zero error in the kernel. RESOLUTION: Code changes are made to verify the correctness of the DCO volume's metadata information during startup. If the information read is incorrect, the volume start operation fails. * INCIDENT NO:2983679 TRACKING ID:2970368 SYMPTOM: SRDF-R2 WD (write-disabled) devices are shown in error state and many path enable/disable messages are generated in the /etc/vx/dmpevents.log file. DESCRIPTION: DMP (dynamic multi-pathing driver) disables the paths of write protected devices. Therefore these devices are shown in error state. The vxattachd daemon tries to online these devices and executes partial device discovery for them. As part of partial device discovery, enabling and disabling the paths of such write protected devices generates many path enable/disable messages in the /etc/vx/dmpevents.log file. RESOLUTION: This issue is addressed by not disabling the paths of write protected devices in DMP. * INCIDENT NO:3004823 TRACKING ID:2692012 SYMPTOM: When moving subdisks using vxassist move (or using the vxevac command, which in turn calls vxassist move), if the disk tags are not the same for source and destination, the command fails with a generic message which does not convey exactly why the operation failed. You will see the following generic message: VxVM vxassist ERROR V-5-1-438 Cannot allocate space to replace subdisks DESCRIPTION: When moving subdisks, vxassist move uses available disks from the disk group as targets if no target disk is specified.
If these disks have the site tag set and the value of the site tag attribute is not the same, then vxassist move is expected to fail. But it fails with a generic message that does not specify why the operation failed. The expectation is to introduce a message that precisely conveys to the user why the operation failed. RESOLUTION: A new message is introduced which precisely conveys that the failure is due to a site tag attribute mismatch. You will see the following message along with the generic message that conveys the actual reason for failure: VxVM vxassist ERROR V-5-1-0 Source and/or target disk belongs to site,can not move over sites * INCIDENT NO:3004852 TRACKING ID:2886333 SYMPTOM: The "vxdg(1M) join" command allowed mixing clone and non-clone disk groups. Subsequent import of the newly joined disk group fails. DESCRIPTION: Mixing of clone and non-clone disk groups is not allowed. The part of the code where the join operation is done was not validating the mix of clone and non-clone disk groups and was going ahead with the operation. This resulted in the newly joined disk group having a mix of clone and non-clone disks. Subsequent import of the newly joined disk group fails. RESOLUTION: During the disk group join operation, both disk groups are checked. If a mix of clone and non-clone disk groups is found, the join operation fails. * INCIDENT NO:3005921 TRACKING ID:1901838 SYMPTOM: After addition of a license key that enables multi-pathing, the state of the controller is still shown as DISABLED in the vxdmpadm CLI output. DESCRIPTION: When the multi-pathing license key is added, the state of active paths of a LUN is changed to ENABLED but the state of the controller is not updated. RESOLUTION: As a fix, whenever a multi-pathing license key is installed, the operation updates the state of the controller in addition to that of the LUN paths. * INCIDENT NO:3006262 TRACKING ID:2715129 SYMPTOM: Vxconfigd hangs during Master takeover in a CVM (Clustered Volume Manager) environment. This results in vx command hang.
DESCRIPTION: During Master takeover, the VxVM (Veritas Volume Manager) kernel signals Vxconfigd with the information of the new Master. Vxconfigd then proceeds with a vxconfigd-level handshake with the nodes across the cluster. The vxconfigd handshake mechanism was started before the kernel could signal vxconfigd, resulting in the hang. RESOLUTION: Code changes are done to ensure that the vxconfigd handshake is started only upon receipt of the signal from the kernel. * INCIDENT NO:3011391 TRACKING ID:2965910 SYMPTOM: vxassist dumps core with the following stack: setup_disk_order() volume_alloc_basic_setup() fill_volume() setup_new_volume() make_trans() vxvmutil_trans() trans() transaction() do_make() main() DESCRIPTION: When -o ordered is used, vxassist handles non-disk parameters in a different way. This scenario may result in an invalid comparison, leading to a core dump. RESOLUTION: Code changes are made to handle the parameter comparison logic properly. * INCIDENT NO:3011444 TRACKING ID:2398416 SYMPTOM: vxassist dumps core with the following stack: merge_attributes() get_attributes() do_make() main() _start() DESCRIPTION: vxassist dumps core while creating a volume when the attribute 'wantmirror=ctlr' is added to the '/etc/default/vxassist' file. vxassist reads this default file initially and uses the attributes specified to allocate the storage during the volume creation. However, during the merging of attributes specified in the default file, it accesses a NULL attribute structure, causing the core dump. RESOLUTION: Necessary code changes have been done to check the attribute structure pointer before accessing it. * INCIDENT NO:3020087 TRACKING ID:2619600 SYMPTOM: Live migration of a virtual machine having the SFHA/SFCFSHA stack with data disk fencing enabled causes service groups configured on the virtual machine to fault.
DESCRIPTION: After live migration of a virtual machine having the SFHA/SFCFSHA stack with data disk fencing enabled, I/O fails on shared SAN devices with a reservation conflict and causes service groups to fault. Live migration causes a SCSI initiator change. Hence I/O coming from the migrated server to shared SAN storage fails with a reservation conflict. RESOLUTION: Code changes are added to check whether the host is fenced off from the cluster. If the host is not fenced off, the registration key is re-registered for the dmpnode through the migrated server and I/O is restarted. * INCIDENT NO:3025973 TRACKING ID:3002770 SYMPTOM: The system panics with the following stack trace: vxdmp:dmp_aa_recv_inquiry vxdmp:dmp_process_scsireq vxdmp:dmp_daemons_loop unix:thread_start DESCRIPTION: The panic happens while handling the SCSI response for the SCSI Inquiry command. In order to determine whether the path on which the SCSI Inquiry command was issued is read-only, the code needs to check the error buffer. However, the error buffer is not always prepared, so the code should verify that the error buffer is valid before checking it further. Without this verification, the system may panic with a NULL pointer dereference. RESOLUTION: The source code is modified to verify that the error buffer is valid. * INCIDENT NO:3026288 TRACKING ID:2962262 SYMPTOM: When DMP Native Stack support is enabled and some devices are being managed by a multipathing solution other than DMP, uninstalling DMP fails with an error for not being able to turn off DMP Native Stack support. Performing DMP prestop tasks ...................................... Done The following errors were discovered on the systems: CPI ERROR V-9-40-3436 Failed to turn off dmp_native_support tunable on pilotaix216. Refer to Dynamic Multi-Pathing Administrator's guide to determine the reason for the failure and take corrective action.
VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups The CLI 'vxdmpadm settune dmp_native_support=off' also fails with the following error. # vxdmpadm settune dmp_native_support=off VxVM vxdmpadm ERROR V-5-1-15690 Operation failed for one or more volume groups DESCRIPTION: With DMP Native Stack support, it is expected that devices which are being used by LVM are multipathed by DMP. Co-existence with other multipath solutions in such cases is not supported. Having some other multipath solution results in this error. RESOLUTION: Code changes have been made to not error out while turning off DMP Native Support if a device is not being managed by DMP. * INCIDENT NO:3027482 TRACKING ID:2273190 SYMPTOM: The device discovery commands 'vxdisk scandisks' or 'vxdctl enable' issued just after license key installation may fail and abort. DESCRIPTION: After addition of a license key that enables multi-pathing, the state of paths maintained at user level is incorrect. RESOLUTION: As a fix, whenever a multi-pathing license key is installed, the operation updates the state of paths both at user level and kernel level. PATCH ID:148490-01 * INCIDENT NO:2860207 TRACKING ID:2859470 SYMPTOM: The EMC SRDF-R2 disk may go into error state when you create an EFI label on the R1 disk. For example: R1 site # vxdisk -eo alldgs list | grep -i srdf emc0_008c auto:cdsdisk emc0_008c SRDFdg online c1t5006048C5368E580d266 srdf-r1 R2 site # vxdisk -eo alldgs list | grep -i srdf emc1_0072 auto - - error c1t5006048C536979A0d65 srdf-r2 DESCRIPTION: Since R2 disks are in write protected mode, the default open() call (made for read-write mode) fails for the R2 disks, and the disk is marked as invalid. RESOLUTION: As a fix, DMP was changed to be able to read the EFI label even on a write protected SRDF-R2 disk. * INCIDENT NO:2876865 TRACKING ID:2510928 SYMPTOM: The extended attributes reported by "vxdisk -e list" for the EMC SRDF luns are reported as "tdev mirror", instead of "tdev srdf-r1".
Example, # vxdisk -e list DEVICE TYPE DISK GROUP STATUS OS_NATIVE_NAME ATTR emc0_028b auto:cdsdisk - - online thin c3t5006048AD5F0E40Ed190s2 tdev mirror DESCRIPTION: The extraction of the attributes of EMC SRDF luns was not done properly. Hence, EMC SRDF luns are erroneously reported as "tdev mirror", instead of "tdev srdf-r1". RESOLUTION: Code changes have been made to extract the correct values. * INCIDENT NO:2892499 TRACKING ID:2149922 SYMPTOM: Record the diskgroup import and deport events in the /var/adm/messages file. The following type of message can be logged in syslog: vxvm:vxconfigd: V-5-1-16254 Disk group import of succeeded. DESCRIPTION: With the diskgroup import or deport, an appropriate success message, or a failure message with the cause of the failure, should be logged. RESOLUTION: Code changes are made to log diskgroup import and deport events in syslog. * INCIDENT NO:2892621 TRACKING ID:1903700 SYMPTOM: vxassist remove mirror does not work if nmirror and alloc are specified, giving the error "Cannot remove enough mirrors" DESCRIPTION: During the remove mirror operation, VxVM does not perform a correct analysis of plexes. Hence the issue. RESOLUTION: Necessary code changes have been done so that vxassist works properly. * INCIDENT NO:2892630 TRACKING ID:2742706 SYMPTOM: The system can panic with the following stack when the Oracle 10G Grid Agent Software invokes the command :- # nmhs get_solaris_disks unix:lock_try+0x0() genunix:turnstile_interlock+0x1c() genunix:turnstile_block+0x1b8() unix:mutex_vector_enter+0x428() unix:mutex_enter() - frame recycled vxlo:vxlo_open+0x2c() genunix:dev_open() - frame recycled specfs:spec_open+0x4f4() genunix:fop_open+0x78() genunix:vn_openat+0x500() genunix:copen+0x260() unix:syscall_trap32+0xcc() DESCRIPTION: The open system call code path of vxlo (Veritas Loopback Driver) does not release the acquired global lock after the work is completed. The panic may occur when the next open system call tries to acquire the lock.
RESOLUTION: Code changes have been made to release the global lock appropriately. * INCIDENT NO:2892643 TRACKING ID:2801962 SYMPTOM: Operations that grow a volume, including 'vxresize' and 'vxassist growby/growto', take significantly longer if the volume has a version 20 DCO (Data Change Object) attached to it than if the volume has no DCO attached. DESCRIPTION: When a volume with a DCO is grown, the existing map in the DCO must be copied and updated to track the grown regions. The algorithm was such that for each region in the map it would search for the page that contains that region so as to update the map. The number of regions and the number of pages containing them are proportional to the volume size. So, the search complexity is amplified and is observed primarily when the volume size is of the order of terabytes. In the reported instance, it took more than 12 minutes to grow a 2.7TB volume by 50G. RESOLUTION: Code has been enhanced to find the regions that are contained within a page and thereby avoid looking up the page for each of those regions. * INCIDENT NO:2892650 TRACKING ID:2826125 SYMPTOM: VxVM script daemons are not up after they are invoked with the vxvm-recover script. DESCRIPTION: When the VxVM script daemon starts, it terminates any stale instance if one exists. When the script daemon is invoked with exactly the same process id as the previous invocation, the daemon abnormally terminates itself through a false-positive detection. RESOLUTION: Code changes are made to handle the same process id situation correctly. * INCIDENT NO:2892660 TRACKING ID:2000585 SYMPTOM: If 'vxrecover -sn' is run and at the same time one volume is removed, vxrecover exits with the error 'Cannot refetch volume'; the exit status code is zero but no volumes are started. DESCRIPTION: vxrecover assumes that the volume is missing because the diskgroup must have been deported while vxrecover was in progress.
Hence, it exits without starting the remaining volumes. vxrecover should be able to start the other volumes if the DG is not deported. RESOLUTION: Modified the source to skip the missing volume and proceed with the remaining volumes. * INCIDENT NO:2892665 TRACKING ID:2807158 SYMPTOM: During VM upgrade or patch installation on the Solaris platform, sometimes the system can hang due to a deadlock with the following stack: genunix:cv_wait genunix:ndi_devi_enter genunix:devi_config_one genunix:ndi_devi_config_one genunix:resolve_pathname genunix:e_ddi_hold_devi_by_path vxspec:_init genunix:modinstall genunix:mod_hold_installed_mod genunix:modrload genunix:modload genunix:mod_hold_dev_by_major genunix:ndi_hold_driver genunix:probe_node genunix:i_ndi_config_node genunix:i_ddi_attachchild DESCRIPTION: During the upgrade or patch installation, the vxspec module is unloaded and reloaded. During vxspec module initialization, it tries to lock the root node during the pathname traversal while already holding the subnode, i.e., /pseudo. Meanwhile, if another process that holds the lock of the root node tries to acquire the lock of the subnode /pseudo, a deadlock occurs, since each process tries to get the lock already held by its peer. RESOLUTION: The APIs which introduce the deadlock are replaced. * INCIDENT NO:2892689 TRACKING ID:2836798 SYMPTOM: 'vxdisk resize' fails with the following error on a simple format EFI (Extensible Firmware Interface) disk expanded from the array side, and the system may panic/hang after a few minutes. # vxdisk resize disk_10 VxVM vxdisk ERROR V-5-1-8643 Device disk_10: resize failed: Configuration daemon error -1 DESCRIPTION: As VxVM doesn't support Dynamic Lun Expansion on simple/sliced EFI disks, the last usable LBA (Logical Block Address) in the EFI header is not updated while expanding the LUN. Since the header is not updated, the partition end entry was regarded as illegal and cleared as part of the partition range check.
This inconsistent partition information between the kernel and the disk causes a system panic/hang. RESOLUTION: Added checks in VxVM code to prevent DLE on simple/sliced EFI disks. * INCIDENT NO:2892702 TRACKING ID:2567618 SYMPTOM: VRTSexplorer dumps core in checkhbaapi/print_target_map_entry with a stack which looks like: print_target_map_entry() check_hbaapi() main() _start() DESCRIPTION: The checkhbaapi utility uses the HBA_GetFcpTargetMapping() API, which returns the current set of mappings between operating system and fibre channel protocol (FCP) devices for a given HBA port. The maximum limit for mappings was set to 512 and only that much memory was allocated. When the number of mappings returned was greater than 512, the function that prints this information tried to access the entries beyond that limit, which resulted in core dumps. RESOLUTION: The code has been changed to allocate enough memory for all the mappings returned by HBA_GetFcpTargetMapping(). * INCIDENT NO:2922770 TRACKING ID:2866997 SYMPTOM: After applying Solaris patch 147440-20, disk initialization using the vxdisksetup command fails with the following error: VxVM vxdisksetup ERROR V-5-2-43 : Invalid disk device for vxdisksetup DESCRIPTION: An uninitialized variable gets a different value after the OS patch installation, causing the vxparms command to output an incorrect result. RESOLUTION: The variable is now initialized with the correct value. * INCIDENT NO:2922798 TRACKING ID:2878876 SYMPTOM: vxconfigd, the VxVM configuration daemon, dumps core with the following stack. vol_cbr_dolog () vol_cbr_translog () vold_preprocess_request () request_loop () main () DESCRIPTION: This core is the result of a race between two threads which are processing requests from the same client. While one thread has completed processing a request and is in the phase of releasing the memory used, the other thread is processing a "DISCONNECT" request from the same client.
Due to the race condition, the second thread attempted to access the memory which was being released and dumped core. RESOLUTION: The issue is resolved by protecting the common data of the client with a mutex. * INCIDENT NO:2924117 TRACKING ID:2911040 SYMPTOM: A restore operation from a cascaded snapshot succeeds even when one of its sources is inaccessible. Subsequently, if the primary volume is made accessible for operation, I/O operations may fail on the volume as the source of the volume is inaccessible. Deletion of snapshots would fail as well, due to the dependency of the primary volume on the snapshots. In such a case, the following error is thrown when trying to remove any snapshot using the 'vxedit rm' command: "VxVM vxedit ERROR V-5-1-XXXX Volume YYYYYY has dependent volumes" DESCRIPTION: When a volume is restored from a snapshot, the snapshot becomes the source of data for regions on the primary volume that differ between the two volumes. If the snapshot itself depends on some other volume and that volume is not accessible, the primary volume effectively becomes inaccessible after the restore operation. In such a case, the snapshots cannot be deleted as the primary volume depends on them. RESOLUTION: If a snapshot or any later cascaded snapshot is inaccessible, restore from that snapshot is prevented. * INCIDENT NO:2924188 TRACKING ID:2858853 SYMPTOM: In a CVM (Cluster Volume Manager) environment, after a master switch, vxconfigd dumps core on the slave node (old master) when a disk is removed from the disk group.
dbf_fmt_tbl() voldbf_fmt_tbl() voldbsup_format_record() voldb_format_record() format_write() ddb_update() dg_set_copy_state() dg_offline_copy() dasup_dg_unjoin() dapriv_apply() auto_apply() da_client_commit() client_apply() commit() dg_trans_commit() slave_trans_commit() slave_response() fillnextreq() vold_getrequest() request_loop() main() DESCRIPTION: During a master switch, disk group configuration copy related flags are not cleared on the old master. Hence, when a disk is removed from a disk group, vxconfigd dumps core. RESOLUTION: Necessary code changes have been made to clear configuration copy related flags during a master switch. * INCIDENT NO:2924207 TRACKING ID:2886402 SYMPTOM: When re-configuring DMP devices, typically using the command 'vxdisk scandisks', a vxconfigd hang is observed. Since vxconfigd is hung, no VxVM (Veritas Volume Manager) commands are able to respond. The following process stack of vxconfigd was observed. dmp_unregister_disk dmp_decode_destroy_dmpnode dmp_decipher_instructions dmp_process_instruction_buffer dmp_reconfigure_db gendmpioctl dmpioctl dmp_ioctl dmp_compat_ioctl compat_blkdev_ioctl compat_sys_ioctl cstar_dispatch DESCRIPTION: When a DMP (dynamic multipathing) node is about to be destroyed, a flag is set to hold any I/O (read/write) on it. The I/Os which arrive between the setting of the flag and the actual destruction of the DMP node are placed in the DMP queue and are never served. So the hang is observed. RESOLUTION: An appropriate flag is set for a node which is to be destroyed, so that any I/O arriving after the flag is set is rejected, avoiding the hang condition. * INCIDENT NO:2930399 TRACKING ID:2930396 SYMPTOM: The vxdmpasm/vxdmpraw command does not work on Solaris.
For example: #vxdmpasm enable user1 group1 600 emc0_02c8 expr: syntax error /etc/vx/bin/vxdmpasm: test: argument expected #vxdmpraw enable user1 group1 600 emc0_02c8 expr: syntax error /etc/vx/bin/vxdmpraw: test: argument expected DESCRIPTION: The "length" function of the expr command does not work on Solaris. This function was used in the script and caused the error. RESOLUTION: The expr command has been replaced by the awk command. * INCIDENT NO:2933467 TRACKING ID:2907823 SYMPTOM: Unconfiguring devices in 'failing' or 'unusable' state (as shown by the cfgadm utility) cannot be done using the VxVM Dynamic Reconfiguration (DR) tool. DESCRIPTION: If devices are not removed properly, they can be in 'failing' or 'unusable' state as shown below: c1::5006048c5368e580,255 disk connected configured failing c1::5006048c5368e580,326 disk connected configured unusable Such devices are ignored by the DR Tool, and they need to be manually unconfigured using the cfgadm utility. RESOLUTION: To fix this, code changes are done so that the DR Tool asks the user whether they want to unconfigure 'failing' or 'unusable' devices and takes action accordingly. * INCIDENT NO:2933468 TRACKING ID:2916094 SYMPTOM: These are the issues for which enhancements are done: 1. All the DR operation logs are accumulated in one log file 'dmpdr.log', and this file grows very large. 2. If a command takes a long time, the user may think DR operations are stuck. 3. Devices controlled by TPD are seen in the list of luns that can be removed in the 'Remove Luns' operation. DESCRIPTION: 1. All the logs of DR operations accumulate and form one big log file, which makes it difficult for the user to get to the current DR operation logs. 2. If a command takes time, the user has no way to know whether the command is stuck. 3. Devices controlled by TPD are visible to the user, which suggests that they can be removed without first removing them from TPD control. RESOLUTION: 1.
Now every time the user opens the DR Tool, a new log file of the form dmpdr_yyyymmdd_HHMM.log is generated. 2. A message is displayed to inform the user if a command takes longer than expected. 3. Changes are made so that devices controlled by TPD are not visible during DR operations. * INCIDENT NO:2933469 TRACKING ID:2919627 SYMPTOM: While doing the 'Remove Luns' operation of the Dynamic Reconfiguration Tool, there is no feasible way to remove a large number of LUNs, since the only way to do so is to enter all LUN names separated by commas. DESCRIPTION: When removing luns in bulk with the 'Remove Luns' option of the Dynamic Reconfiguration Tool, it is not feasible to enter all the luns separated by commas. RESOLUTION: Code changes are done in the Dynamic Reconfiguration scripts to accept a file containing the luns to be removed as input. * INCIDENT NO:2934259 TRACKING ID:2930569 SYMPTOM: The LUNs in 'error' state in the output of 'vxdisk list' cannot be removed through the DR (Dynamic Reconfiguration) Tool. DESCRIPTION: The LUNs seen in 'error' state in the VM (Volume Manager) tree are not listed by the DR (Dynamic Reconfiguration) Tool while doing the 'Remove LUNs' operation. RESOLUTION: Necessary changes have been made to display LUNs in error state while doing the 'Remove LUNs' operation in the DR (Dynamic Reconfiguration) Tool. * INCIDENT NO:2942166 TRACKING ID:2942609 SYMPTOM: You will see the following message as an error message when quitting from the Dynamic Reconfiguration Tool. "FATAL: Exiting the removal operation." DESCRIPTION: When the user quits from an operation, the Dynamic Reconfiguration Tool displays the quit notification as an error message. RESOLUTION: Made changes to display the message as Info. INCIDENTS FROM OLD PATCHES: --------------------------- NONE