1.1
PHCO_43910, PHKL_43909
VRTSvxvm 5.0.1RP3P6
VRTSvxvm
5.0.31.5 / 5.0.31.6
PHCO_43579, PHKL_43580
PHKL_43909, PHCO_43910
NONE
hpux1131
NONE
2014-06-09
CRITICAL
YES
NONE
YES
YES
a) VxVM 5.0.1 (GA), version 5.0.31.5 or version 5.0.31.6, must be installed before applying these
patches.
b) All prerequisite/corequisite patches must be installed. The kernel patch requires a system reboot for both installation and removal.
c) To install the patches, enter the following command:
# swinstall -x autoreboot=true -s <patch_directory> PHCO_43910 PHKL_43909
In case the patches are not registered, they can be registered
using the following command:
# swreg -l depot <patch_directory>
where <patch_directory> is the absolute path where the patches reside.
d) After installing the patches, run swverify to make sure that they are installed correctly:
# swverify PHCO_43910 PHKL_43909
a) To remove the patches, enter the following command:
# swremove -x autoreboot=true PHCO_43910 PHKL_43909
NONE
Incidents fixed in RP3 P3:
============
2626223 (2626199) "vxdmpadm list dmpnode" printing incorrect path-type
2605861 (1398914) recurrence of 'vxdisk -o thin list' not reporting correct SIZE information for XIV luns on 5.0.1RP2 on HP 11.31
2353424 (2334534) In CVM environment, vxconfigd level join is hung when Master returns error "VE_NO_JOINERS" to a joining node and cluster nidmap is changed in new reconfiguration
2631106 (2441937) vxconfigrestore precommit fails with awk errors
2631109 (990338) FMR Refreshing a snapshot should keep the same name for the snap object
2631112 (2248730) vxdg import command hangs as vxrecover daemon (spawned by vxdg) doesn't close standard error stream
2631159 (2280624) Need to set site consistent only on mirrored-volumes.
2631359 (2324507) The manpage for vxrelayout(1M) command is incorrect
2631361 (2354046) man page for dgcfgrestore is incorrect.
2631364 (2359814) vxconfigbackup doesn't handle errors well
2631365 (1818780) Oakmont Linux consumes excessive memory during DMP testing.
2631367 (2386120) Enhancement request to add diagnostic logging to help triage a CVM master takeover failure situation
2631371 (2431470) vxpfto uses DM name when calling vxdisk, but vxdisk will match DA name first and thus cause corruption
2631375 (2481938) QXCR1001120138: vxconfigbackup throwing an error when DG contains a sectioned disk
2631376 (2492451) QXCR1001150541 VxVM 11.31/5.0: vxvm-startup2 launches vxesd without checking install-db.
2658078 (2070531) Campus cluster: Couldn't enable siteconsistency on a dcl volume, when trying to make the disk group and its volumes siteconsistent.
2631114 (2272959) vxio V-5-0-0 message missing a space
2631378 (2052459) CFS mount failed on slave node due to registration failure on one of the paths
Incidents fixed in RP3 P2:
============
2273214 (2233889) Parallel recovery of volumes is not happening if dcl is present.
2274277 (1715204) vxsnap operations lead to an orphan snap object if a failure occurs during the operation; the orphan snap object cannot be removed.
2324001 (2323925) If rootdisk is encapsulated and if install-db is present, clear warning should be displayed on system boot.
2325119 (1545835) vxconfigd core dump during system boot after VxVM4.1RP4 applied.
2337436 (2339210) [HxRT][5.0.1][SF-RAC] Huasy S5600 fails over slowly when there are no PGR keys on the secondary paths
2353417 (2349352) During LUN provisioning in single path IO mode environment a data corruption is observed
2383822 (2384850) 5.0.1 smapi_listener changes for new what string model.
2383936 (2312972) QXCR1000974131 on VxVM 11.23/4.1 : save_config cannot handle one side of VxVM mirror failing
2402531 (1083297) Pinnacle: Install_Upgrade: I/O policies not persistent after an upgrade from 5.0v2mp2rp3 - 6.0.
2414161 (2067182) vxesd core dumped with SIGILL
2424221 (2423608) panic in vol_dev_strategy() following FC problems
2521084 (2521083) Panic 'Fault when executing in kernel mode' on c8+mpath
2528119 (2054454) Nightly:vmcert:Hp-UX Tc /vmtest/tc/scripts/admin/cbr/backup.tc#2 is failing.
2528142 (2528133) vxdisk ERROR V-5-1-0 - Record in multiple disk groups
2535859 (2522006) ASL Request for HxRT SF-RAC 5.0.1 HP-UX 11.31 HP P6300/P6500
2578576 (2052659) MaxFlii:DMP:HP : machine with A/A array connected panicked after vm installation
2578829 (2557156) New command for vxdmpadm to exclude foreign devices from being discovered redundantly.
2578834 (2566315) VxVM 11.31/5.0.1 : vPar6.0 VxVM guest installation deadlock panic
2578837 (530741) CVM: vxconfigd assert in commit()
2578845 (2233611) HDS wants the ASL for the USP-V & VSP (R700) to check page 00 to see if E3 is supported and then if E3 is supported then issue inquiry on E3 and wants the R700 array name set as Hitachi VSP
2578858 (2215104) Customer requires hf to support thin Provision/reclamation on HP P9500 on 5.0.1RP3 on HP 11.31
2578863 (2067319) MaxFli: vxconfigd hung on master when two slaves are trying to join the cluster
2578872 (2147922) Thin Reclaim not detected directly for VMAX Microcode Version 5875
2581508 (1097258) vxconfigd hung when an array is disconnected
2585238 (2585239) QXCR1001170643/600-725-172 VxVM 5.0.1/11.31: VxVM 5.0.1: vxdisk-alldgs-list running very slow with many lunpaths
Incidents fixed in RP3 P1:
============
2344566 (2344551) When upgrading VxVM on DRD clone, install-db is created even for encapsulated rootdisk.
HANG, PERFORMANCE
PHCO_43910, PHKL_43909
3375267
2054319 When there are a large number of uninitialized disks under VxVM, the system
boot requires more than 3 hours.
When there are a large number of uninitialized disks under VxVM, the system
boot requires more than 3 hours.
As part of the "vold" startup, VxVM creates multiple threads. Each thread
onlines only one disk at a time. Since the disks are uninitialized, VxVM
checks whether the disks have any file system. The statvfsdev() function is
called on the DMP device path to perform this activity. The statvfsdev()
function calls a few IOCTLs, which fail. DMP then checks the state of the
path by performing an inquiry. Before finding the state of the path, DMP
determines whether the disks are non-SCSI. The io_search() function is called
on the instance number of the "lunpath" to determine if the disks are
non-SCSI. Calls to the io_search() function in the multithreaded environment
get serialized on the gio_spinlock() function. As a result, the online
activity takes a long time to execute.
The code is modified so that the io_search() function call that determines
whether the disk is SCSI is skipped, to prevent contention among the threads.
This is safe because only SCSI devices are supported under VxVM control on
HP-UX.
3410940
3390959 The vxconfigd(1M) daemon hangs in the kernel while
processing the I/O request.
The vxconfigd(1M) daemon hangs in the kernel while processing the I/O request.
The following stack trace is observed:
slpq_swtch_core()
sleep_pc()
biowait()
physio()
dmpread()
spec_rdwr()
vno_rw()
read()
syscall()
The vxconfigd(1M) daemon hangs while processing the I/O request.
A "dmp_close_path" failure message is displayed in the syslog before the
hang. Based on the probable cause analysis, this failure message, displayed
during path closure, is related to the observed hang. Also, if the I/O fails
on a path, the "iocount" is not decremented properly.
The code is modified to add debug messages that confirm the current probable
cause analysis when this issue occurs. Also, if the I/O fails on a path, the
"iocount" is now decremented properly.
3466269
3461383 The vxrlink(1M) command fails when the "vxrlink -g <DGNAME> -a att <RLINK>"
command is executed.
The vxrlink(1M) command fails when the "vxrlink -g <DGNAME> -a att <RLINK>"
command is executed. On PA machines the following error message is displayed:
VxVM VVR vxrlink ERROR V-5-1-5276 Cannot open shared
library /usr/lib/libvrascmd.sl, error: Can't dlopen() a library containing
Thread Local Storage: /usr/lib/libvrascmd.sl
To make "vxconfigd" and other VxVM binaries thread safe, these binaries are
now linked with HP's thread-safe "libIOmt" library. The vxrlink(1M) command
opens a shared library, which is linked with the thread-safe "libIOmt"
library. There is a limitation on HP-UX that a shared library containing
Thread Local Storage (TLS) cannot be loaded dynamically. This results in an
error.
The code is modified so that the library that is dynamically loaded by
the vxrlink(1M) command is not linked with the "libIOmt" library, as the
vxrlink(1M) command and the library do not invoke any routines from
the "libIOmt" library.
3467626
2515070 When I/O fencing is enabled, the Cluster Volume Manager (CVM) slave
node may fail to join the cluster.
When I/O fencing is enabled, the Cluster Volume Manager (CVM) slave node
may fail to join the cluster. The following error message is displayed:
In Veritas Cluster Server(VCS) engine log:
VCS ERROR V-16-20006-1005 (abc) CVMCluster:cvm_clus:monitor:node - state: out
of cluster reason: SCSI-3 PR operation failed: retry to add a node failed
In syslog:
V-5-1-15908 Import failed for dg ebap01dg. Local node has data disk fencing
enabled, but master does not have PGR key set
Whenever a fix transaction is performed during disk group (DG) import, the
SCSI-3 Persistent Group Reservation (PGR) key is not uploaded from "vxconfigd"
to the kernel DG record. This leads to a NULL PGR key in the kernel DG record. If
subsequently, "vxconfigd" gets restarted, it reads the DG configuration record
from the kernel. This leads to the PGR key being NULL in "vxconfigd" also. This
configuration record is sent to the node, when it wants to join the cluster.
The slave node fails to import the shared DG because of the missing PGR key,
and therefore it fails to join the cluster.
The code is modified to copy the disk group PGR key from "vxconfigd" to the
kernel, when a new disk group record is loaded to the kernel.
3470966
3438271 The vxconfigd(1M) daemon may hang when new LUNs are added.
The VxVM commands may hang when new LUNs are added and device discovery is
performed. Subsequent VxVM commands that request information from the
vxconfigd(1M) daemon may also hang. The "vxconfigd" stack trace is as follows:
swtch_to_thread ()
slpq_swtch_core ()
sleep_pc ()
biowait_rp ()
biowait ()
dmp_indirect_io ()
gendmpioctl ()
dmpioctl ()
spec_ioctl ()
vno_ioctl ()
ioctl ()
syscall ()
syscallinit ()
When new LUNs are added, device discovery is performed, during which certain
operations are requested from the vxconfigd(1M) daemon. If the LUN to be added
is found to be non-SCSI or faulty, the vxconfigd(1M) daemon may hang.
The code is modified to avoid the hang in the vxconfigd(1M) daemon and the
subsequent VxVM commands.
3483635
3482001 The 'vxddladm addforeign' command renders the system unbootable after
the reboot in a few cases.
The 'vxddladm addforeign' command renders the system unbootable after the
reboot in a few cases.
As part of the execution of the 'vxddladm addforeign' command, VxVM incorrectly
identifies the specified disk as the root disk. As a result, it replaces all
the entries pertaining to the 'root disk', with the entries of the specified
disk, thus rendering the system unbootable.
The code is modified to detect the root disk appropriately when it is
specified as part of the 'vxddladm addforeign' command.
PHCO_43579, PHKL_43580
2234292
2152830 A diskgroup (DG) import fails with a non-descriptive error message when
multiple copies (clones) of the same device exist and the original devices are
either offline or not available.
A diskgroup (DG) import fails with a non-descriptive error message when
multiple copies (clones) of the same device exist and the original devices are
either offline or not available.
For example:
# vxdg import mydg
VxVM vxdg ERROR V-5-1-10978 Disk group mydg: import
failed:
No valid disk found containing disk group
If the original devices are offline or unavailable, the vxdg(1M) command picks
up cloned disks for import. DG import fails unless the clones are tagged and the
tag is specified during the DG import. The import failure is expected, but the
error message is non-descriptive and does not specify the corrective action to
be taken by the user.
The code is modified to give the correct error message when duplicate clones
exist during import. Also, details of the duplicate clones are reported in the
system log.
2973525
2973522 At cable connect on port1 of a dual-port Fibre Channel Host Bus
Adapter (FC HBA), paths via port2 are marked as SUSPECT.
At cable connect on one of the ports of a dual-port Fibre Channel Host Bus
Adapter (FC HBA), paths that go through the other port are marked as SUSPECT.
DMP does not issue I/O on such paths until the next restore daemon cycle
confirms that the paths are functioning.
When a cable is connected at one of the ports of a dual-port FC HBA, an HBA
Registered State Change Notification (RSCN) event occurs on the other port.
When the RSCN event occurs, DMP marks the paths that go through that port as
SUSPECT.
The code is modified so that paths that go through the other port are not
marked as SUSPECT when an RSCN event occurs.
2982087
2976130 Multithreading of the vxconfigd(1M) daemon for HP-UX 11i v3 causes the DMP
database to be deleted as part of the device-discovery commands.
The device-discovery commands such as "vxdisk scandisks" and "vxdctl enable"
may cause the entire DMP database to be deleted. This causes the VxVM I/O
errors and file systems to get disabled. For instances where VxVM manages the
root disk(s), a system hang occurs. In a Serviceguard/SGeRAC environment
integrated with CVM and/or CFS, VxVM I/O failures would typically lead to a
Serviceguard INIT and/or a CRS TOC (if the voting disks sit on VxVM volumes).
Syslog shows the removal of arrays from the DMP database as follows:
vmunix: NOTICE: VxVM vxdmp V-5-0-0 removed disk array 000292601518, datype = EMC
This is in addition to messages that indicate VxVM I/O errors and disabled
file systems.
VxVM's vxconfigd(1M) daemon uses HP's libIO(3X) APIs, such as the io_search()
and io_search_array() functions, to claim devices that are attached to the
host. Although vxconfigd(1M) is multithreaded, it uses a non-thread-safe
version of the libIO(3X) APIs. A race condition may occur when multiple
vxconfigd threads perform device discovery. This results in a NULL value
returned by the libIO(3X) API calls. VxVM interprets the NULL return value as
an indication that none of the devices are attached and proceeds to delete
all the devices previously claimed from the DMP database.
The vxconfigd(1M) daemon, as well as the event source daemon vxesd(1M), is now
linked with HP's thread-safe libIO(3X) library. This prevents the race
condition among multiple vxconfigd threads that perform device discovery.
Please refer to HP's customer bulletin c03585923 for a list of other software
components required for a complete solution.
2983903
2907746 File Descriptor leaks are observed with the device-discovery command of VxVM.
At device discovery, the vxconfigd(1M) daemon allocates file descriptors for
open instances of "/dev/config", but does not always close them after use.
This results in a file descriptor leak over time.
Before any API of the "libIO" library is called, the io_init() function needs
to be called. This function opens the "/dev/config" device file. Each
io_init() function call should be paired with an io_end() function call,
which closes the "/dev/config" device file. However, the io_end() function
call is missing in some places in the device discovery code path. As a
result, file descriptor leaks are observed with the device-discovery commands
of VxVM.
The code is modified to pair each io_init() function call with the io_end()
function call in every possible code path.
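The pairing rule described above can be sketched in shell. Note that io_init
and io_end below are hypothetical stand-in functions that only count open
instances; they are not HP's libIO calls:

```shell
# Illustrative sketch only: io_init/io_end are stand-ins that track open
# instances of "/dev/config"; they are not the real libIO functions.
OPEN=0
io_init() { OPEN=$((OPEN + 1)); }    # stand-in: open /dev/config
io_end()  { OPEN=$((OPEN - 1)); }    # stand-in: close /dev/config

discover() {                         # $1 = 1 simulates a failing code path
    io_init
    if [ "$1" -eq 1 ]; then
        io_end                       # the fix: close on the error path too
        return 1
    fi
    io_end                           # normal path
    return 0
}

i=0
while [ "$i" -lt 100 ]; do
    discover $((i % 2)) || :
    i=$((i + 1))
done
echo "open instances left: $OPEN"    # prints "open instances left: 0"
```

With every io_init paired with an io_end on every exit path, repeated
discovery calls leave no instances open, which is exactly what the fix
enforces in the device discovery code path.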
3028911
2390998 The system panicked during SAN reconfiguration because of an
inconsistency in the DMP device open count.
When the 'vxdctl enable' or 'vxdisk scandisks' command is run after
configuration changes in the SAN ports, the system panicked with the
following stack trace:
.disable_lock()
dmp_close_path()
dmp_do_cleanup()
dmp_decipher_instructions()
dmp_process_instruction_buffer()
dmp_reconfigure_db()
gendmpioctl()
vxdmpioctl()
After configuration changes in the SAN ports, the configuration in VxVM also
needs to be updated. In the reconfiguration process, VxVM may temporarily
hold both the old DMP path nodes and the new DMP path nodes, which have the
same device number, in order to migrate the old nodes to the new ones. VxVM
maintains two types of open counts to avoid platform dependency. However,
when opening or closing the old DMP path nodes while the migration is in
progress, VxVM calculates the open counts incorrectly: it updates one open
count on the new node and the other open count on the old node. This results
in inconsistent open counts on the node and causes a panic when the open
counts are checked.
The code is changed to maintain the open counts correctly on the same DMP
path node database while performing DMP device open/close operations.
3047804
2969844 The device discovery failure should not cause the DMP database to be destroyed
completely.
The DMP database gets destroyed if the discovery fails for some reason. The
ddl.log file shows numerous entries such as the following:
DESTROY_DMPNODE:
0x3000010 dmpnode is to be destroyed/freed
DESTROY_DMPNODE:
0x3000d30 dmpnode is to be destroyed/freed
Numerous vxio errors are seen in the syslog, as all VxVM I/Os fail afterwards.
VxVM deletes the old device database before it makes the new device database.
If the discovery process fails for some reason, this results in a null DMP
database.
The code is modified to take a backup of the old device database before
performing the new discovery. Therefore, if the discovery fails, the old
database is restored and an appropriate message is displayed on the console.
3059067
1820179 The "vxdctl debug <debuglevel>" command dumps core if the vxconfigd
log file was modified when vxconfigd was started with the "logfile=<file-name>" option.
The "vxdctl debug <debuglevel>" command dumps core,
and the following stack trace is displayed:
strcpy.strcpy()
xfree_internal()
msg_logfile_disable()
req_vold_debug()
request_loop()
main()
The vxconfigd(1M) daemon uses static memory for storing the logfile
information when it is started with logfile=<file-name>. This logfile can be
changed with the vxdctl(1M) command. The "vxdctl debug" command uses
dynamically allocated memory for storing the path information. If the default
debug level is changed using the vxdctl(1M) command, the allocated memory for
the path information is deleted on the assumption that it was created by the
"vxdctl debug" command. Thus, it results in a core dump.
The code is modified to allocate dynamic memory for storing the path
information.
3059139
2979824 The vxdiskadm(1M) utility bug results in the exclusion of the unintended paths.
While excluding a controller using the vxdiskadm(1M) utility, unintended
paths get excluded.
The issue occurs due to a logical error related to the grep command used when
retrieving the hardware path of the controller to be excluded. In some cases,
the vxdiskadm(1M) utility takes the wrong hardware path for the controller
being excluded, and hence excludes unintended paths. For example, suppose
there are two controllers, c189 and c18, with c189 listed above c18 in the
command output. When the controller c18 is excluded, the hardware path of the
controller c189 is passed to the function instead, so the wrong controller
ends up being excluded.
The script is modified so that the vxdiskadm(1M) utility now takes the hardware
path of the intended controller only, and the unintended paths do not get
excluded.
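The substring-matching pitfall described above can be reproduced with grep
alone. The controller names and hardware paths below are made-up examples,
not data from the affected script:

```shell
# Two controllers and their hardware paths; c189 is listed before c18.
LIST='c189 0/2/1/0
c18 0/4/1/0'

# Unanchored match: the first hit for "c18" is actually the c189 line,
# which is the kind of wrong pick that excluded unintended paths.
WRONG=$(printf '%s\n' "$LIST" | grep 'c18' | head -1)

# Anchored match on the exact controller name picks only c18.
RIGHT=$(printf '%s\n' "$LIST" | grep '^c18 ')

echo "wrong pick: $WRONG"   # prints "wrong pick: c189 0/2/1/0"
echo "right pick: $RIGHT"   # prints "right pick: c18 0/4/1/0"
```

Anchoring the pattern to the exact controller name is the same idea as the
script fix: the hardware path is taken from the intended controller only.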
3068265
2994677 When the 'vxdisk scandisks' or 'vxdctl enable' command
is run, the system panics.
When the 'vxdisk scandisks' or 'vxdctl enable' command
is run, the system panics with the following stack trace:
panic_save_regs_switchstack+0x110 ()
panic+0x410 ()
bad_kern_reference+0xa0 ()
$cold_pfault+0x530 ()
vm_hndlr+0x12f0 ()
bubbleup+0x880 ()
dmp_decode_add_new_path+0x430 ()
dmp_decipher_instructions+0x490 ()
dmp_process_instruction_buffer+0x340 ()
dmp_reconfigure_db+0xc0 ()
gendmpioctl+0x920 ()
dmpioctl+0x100 ()
spec_ioctl+0xf0 ()
vno_ioctl+0x350 ()
ioctl+0x410 ()
syscall+0x5b0 ()
The problem occurs because the structure that causes the panic was not
checked for a NULL value.
The code is modified to check that the structure holds a valid value, to
prevent the system from panicking.
3072892
2352517 The system panics while excluding a controller from Veritas Volume Manager
(VxVM) view.
Excluding a controller from Veritas Volume Manager (VxVM) using the "vxdmpadm
exclude ctlr=<ctlr-name>" command causes the system to panic with the
following stack trace:
gen_common_adaptiveminq_select_path
dmp_select_path
gendmpstrategy
voldiskiostart
vol_subdisksio_start
volkcontext_process
volkiostart
vxiostrategy
vx_bread_bp
vx_getblk_cmn
vx_getblk
vx_getmap
vx_getemap
vx_do_extfree
vx_extfree
vx_te_trunc_data
vx_te_trunc
vx_trunc_typed
vx_trunc_tran2
vx_trunc_tran
vx_trunc
vx_inactive_remove
vx_inactive_tran
vx_local_inactive_list
vx_inactive_list
vx_workitem_process
vx_worklist_process
vx_worklist_thread
thread_start
While excluding a controller from the VxVM view, all its paths must also be
excluded. The panic occurs because the controller is excluded before the
paths belonging to that controller are excluded. While excluding a path, the
controller of that path, which is NULL, is accessed.
The code is modified to exclude all the paths belonging to a controller before
excluding a controller.
3121041
2310284 In the Veritas Volume Manager (VxVM) versions prior to 5.1SP1, the Cross-
Platform-Data Sharing (CDS) disk initialization of Logical Unit Number (LUN)
size greater than 1 TB may lead to data corruption.
In the Veritas Volume Manager (VxVM) versions prior to 5.1SP1, the Cross-
Platform-Data Sharing (CDS) disk initialization of Logical Unit Number (LUN)
size greater than 1 TB may lead to data corruption.
The Veritas Volume Manager (VxVM) versions prior to 5.1SP1 allow CDS-disk
initialization of LUNs of size greater than 1 TB. A CDS disk of size greater
than 1 TB can cause data corruption because the backup labels and the label
IDs are written in the data or the public regions. VxVM should not allow
CDS-disk initialization for LUNs of size greater than 1 TB, in order to
maintain CDS compatibility across platforms.
The code is modified to restrict the vxdisksetup(1M) command from performing
CDS-disk initialization for a CDS disk of size greater than 1 TB. The changes
in the vxdisk(1M) and vxresize(1M) commands will be available in a subsequent
patch.
3130376
3130361 Prevent disk initialization of size greater than 1TB for disks with the CDS
format.
Disks greater than 1 TB can be initialized with the Cross Platform Data sharing
(CDS) format.
CDS-formatted disks use the Sun Microsystems Label (SMI) for partitioning.
The SMI partition table can store a partition size of at most 1 TB. Earlier
VxVM releases did not prevent initialization of disks greater than 1 TB with
the CDS format.
The code is modified so that VxVM can explicitly prevent initialization of
disks greater than 1 TB with the CDS format. Other VxVM utilities
like 'vxvmconvert', 'vxcdsconvert', and 'DLE (vxdisk resize)' either explicitly
fail the operation, if the environment involves greater than 1 TB disks, or use
the HPDISK format wherever possible.
3139305
3139300 Memory leaks are observed in the device discovery code path of VxVM.
At device discovery, the vxconfigd(1M) daemon allocates memory but does not
release it after use, causing a user memory leak. The Resident Set Size (RSS)
of the vxconfigd(1M) daemon thus keeps growing and may reach maxdsiz(5) in
extreme cases, which causes the vxconfigd(1M) daemon to abort.
At some places in the device discovery code path, the buffer is not freed. This
results in memory leaks.
The code is modified to free the buffers.
3315600
3315534 The vxconfigd(1M) daemon dumps core during start-up from the /sbin/pre_init_rc
file, after it is switched to native multi-pathing.
The vxconfigd(1M) daemon dumps core during system start-up on the IA machine
when the boot is performed from a VxVM ROOT device under Native Multi-Pathing
(NMP) control. This causes a boot failure, as the root disk group 'rootdg'
fails to get imported. In the vxconfigd(1M) daemon's core file, the following
stack trace is observed:
dg_creat_tempdb
mode_set
setup_mode
startup
main
When the disk group 'rootdg', which resides on disks that belong to NMP on IA
machines, is imported, the vxconfigd(1M) daemon does not select the correct
DSF for the BOOT device and thereby ignores the partition component. This
causes I/O errors when the root disk group is imported.
The code is modified to use the correct public and private disk partitions
during boot-up.
3318945
3248281 When the "vxdisk scandisks" or "vxdctl enable" commands are run
consecutively, the "VxVM vxdisk ERROR V-5-1-0 Device discovery failed." error
is encountered.
When the "vxdisk scandisks" or "vxdctl enable" commands are run consecutively,
an error is displayed as follows:
VxVM vxdisk ERROR V-5-1-0 Device discovery failed.
The device discovery failure occurs because in some cases the variable that is
passed to the OS specific function is not set properly.
The code is modified to set the correct variable before the variable is passed
to the OS specific function.
3369341
3325371 Panic occurs in the vol_multistepsio_read_source() function when snapshots are
used.
Panic occurs in the vol_multistepsio_read_source() function when VxVM's
FastResync feature is used. The stack trace observed is as follows:
vol_multistepsio_read_source()
vol_multistepsio_start()
volkcontext_process()
vol_rv_write2_start()
voliod_iohandle()
voliod_loop()
kernel_thread()
When a volume is resized, the Data Change Object (DCO) also needs to be
resized. However, the old accumulator contents are not copied into the new
accumulator, so the respective regions are marked as invalid. Subsequent I/O
on these regions triggers the panic.
The code is modified to appropriately copy the accumulator contents during the
resize operation.
PHCO_43185, PHKL_43186
2575150
2617277 Man pages for the vxautoanalysis and vxautoconvert commands are missing from the base package.
The man pages for the vxautoanalysis and vxautoconvert commands are missing.
The man pages for the vxautoanalysis and vxautoconvert commands are missing from
the base package.
The man pages for the vxautoanalysis(1M) and vxautoconvert(1M) commands have been added.
2631371
2431470 The "vxdisk set" command operates on the wrong VxVM device and does
not work correctly with DA (Disk Access) names.
1. The "vxpfto" command sets the PFTO (Powerfail Timeout) value on the wrong
VxVM device when it passes a DM (Disk Media) name to the "vxdisk set" command
with the -g option.
2. The "vxdisk set" command does not work when a DA name is specified,
whether or not the -g option is given.
Example:
# vxdisk set [DA name] clone=off
VxVM vxdisk ERROR V-5-1-5455 Operation requires a disk group
# vxdisk -g [DG name] set [DA name] clone=off
VxVM vxdisk ERROR V-5-1-0 Device [DA name] not in configuration or associated
with DG [DG name]
1. The "vxpfto" command invokes the "vxdisk set" command to set the PFTO
value. It accepts both DM and DA names for device specification. However, DM
and DA names can conflict, such that even within the same disk group the same
name can refer to different devices - one as a DA name and another as a DM
name. The "vxpfto" command uses a DM name with the -g option when invoking
the "vxdisk set" command, but "vxdisk set" matches a DA name before a DM
name. This causes the incorrect device to be acted upon.
Both DM and DA names can be specified for the "vxdisk set" command with the
-g option; however, by design the DM name is given preference when the -g
option is specified.
2. The "vxdisk set" command accepts a DA name for device specification.
Without the -g option, the command should work only when a DA name is
specified. However, it does not work, because the disk group name is not
extracted correctly from the DA record. Hence the first error.
With the -g option, the specified DA name is wrongly treated as a matching DM
name, hence the second error.
The code is changed to make the "vxdisk set" command work correctly on a DA
name without the -g option and on both DM and DA names with the -g option.
The DM name is given preference when the -g option is specified. This
resolves the "vxpfto" command issue as well.
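The fixed lookup order can be sketched as a small shell function. The
function, disk names, and output format below are hypothetical illustrations,
not VxVM internals:

```shell
# Hypothetical example data: a DA name that clashes with a DM name.
DM_NAMES="mydg01 mydg02"      # DM (disk media) names in the disk group
DA_NAMES="c1t0d0 mydg01"      # DA (disk access) names; "mydg01" clashes

# resolve NAME WITH_G: with -g (WITH_G=1) a DM match is preferred; without
# -g only DA names are accepted -- mirroring the fixed "vxdisk set" order.
resolve() {
    name=$1; with_g=$2
    if [ "$with_g" = 1 ]; then
        for n in $DM_NAMES; do
            [ "$n" = "$name" ] && { echo "DM:$name"; return 0; }
        done
    fi
    for n in $DA_NAMES; do
        [ "$n" = "$name" ] && { echo "DA:$name"; return 0; }
    done
    echo "ERROR:$name"; return 1
}

resolve mydg01 1     # prints "DM:mydg01" - with -g, the DM record wins
resolve mydg01 ""    # prints "DA:mydg01" - without -g, only DA is considered
```

The point of the ordering is that the same string can name two different
devices; checking the DM namespace first only when -g is given removes the
ambiguity that caused vxpfto to act on the wrong device.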
2674273
2647975 A Serial Split Brain (SSB) condition caused the Cluster Volume Manager (CVM) Master Takeover to fail.
A Serial Split Brain (SSB) condition caused the Cluster Volume Manager (CVM)
Master Takeover to fail. The following vxconfigd debug output was observed
when the issue occurred:
VxVM vxconfigd NOTICE V-5-1-7899 CVM_VOLD_CHANGE command received
V-5-1-0 Preempting CM NID 1
VxVM vxconfigd NOTICE V-5-1-9576 Split Brain. da id is 0.5, while dm id is 0.4
for
dm cvmdgA-01
VxVM vxconfigd WARNING V-5-1-8060 master: could not delete shared disk groups
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgA: Disabled by errors
VxVM vxconfigd ERROR V-5-1-7934 Disk group cvmdgB: Disabled by errors
...
VxVM vxconfigd ERROR V-5-1-11467 kernel_fail_join() : Reconfiguration
interrupted: Reason is transition to role failed (12, 1)
VxVM vxconfigd NOTICE V-5-1-7901 CVM_VOLD_STOP command received
When a Serial Split Brain (SSB) condition is detected by the new CVM master
on Veritas Volume Manager (VxVM) versions 5.0 and 5.1, the default CVM
behaviour causes the new CVM master to leave the cluster, resulting in
cluster-wide downtime.
The code is changed to ensure that when SSB is detected in a diskgroup, CVM
only disables that particular diskgroup and keeps the other diskgroups
imported during the CVM Master Takeover. With the fix applied, the new CVM
master does not leave the cluster.
2753970
2753954 When a cable is disconnected from one port of a dual-port FC
HBA, the paths via the other port are marked as SUSPECT.
When a cable is disconnected from one port of a dual-port FC HBA, only the
paths going through that port should be marked as SUSPECT. However, paths
going through the other port are also marked as SUSPECT.
Disconnection of a cable from an HBA port generates an FC event. When the
event is generated, the paths of all ports of the corresponding HBA are
marked as SUSPECT.
The code is changed to mark only the paths that go through the port on which
the FC event is generated.
2779347
2779345 Data corruption is seen: a CDS backup label signature is seen within the PUBLIC region data. Or, creation of a volume fails on a disk, indicating insufficient space available.
1. Creation of a volume fails on a disk, indicating insufficient space available.
2. Data corruption is seen: a CDS backup label signature is seen within the PUBLIC region data.
A Cross-Platform Data Sharing (CDS) disk uses a Solaris VTOC as its platform
block. When a disk is initialized as CDS, the geometry obtained from the SCSI
MODE-SENSE/fake-geometry algorithm is written within the VTOC.
With operations like BCV cloning, firmware upgrades, and so on, the geometry
obtained from MODE-SENSE/fake geometry can differ from the stamped geometry.
If the disk size obtained using the geometry stored on the disk label is
smaller than the disk size obtained from MODE-SENSE/fake geometry, operations
like creating a volume might fail due to insufficient space.
If the disk size using fake geometry is greater than the disk size obtained
from MODE-SENSE geometry, data corruption might occur if the CDS backup label
is written by an operation like "vxdisk flush".
The fix is that the geometry on the disk is honored; that is, the geometry
used during initialization is used for operations like "vxdisk flush".
2846151
2108993 The "vxdisk list" command cannot show newly added disks; dmp_do_reconfig() reports an error.
When an existing path/LUN is removed and a new LUN is added at the same time,
running 'vxdisk scandisks' causes dmp_do_reconfig() to report an error, and
discovery is aborted. The following error message is displayed:
VxVM vxconfigd DEBUG V-5-1-0 dmp_do_reconfig: DMP_RECONFIGURE_DB failed: No such
file or directory
'vxdisk list' does not show the newly added disks.
The scenario where a newly added device reuses the device numbers of removed
devices was not handled in the DMP (Dynamic Multi-Pathing) code.
The DMP driver has been changed to handle this case.
2860459
2257850 vxdiskadm leaks memory while performing operations related to enclosures.
A memory leak is observed when information about an enclosure is accessed by
vxdiskadm. The memory allocated locally for a data structure holding the
array-specific attributes is not freed.
Code changes have been made to avoid such memory leaks.
2860462
1874383 Memory leak in req_ddl_reconfig_fabric().
Vxconfigd leaks memory during DMP device reconfiguration.
In the vxconfigd code, memory allocated during device reconfiguration for the
related data structures is not freed, which leads to a memory leak.
The memory is now released once it is no longer needed.
2860464
1675599 Excluding and including a LUN in a loop triggers a huge memory leak for vxconfigd when EMC PowerPath is configured
Vxconfigd leaks memory while excluding and including a Third Party Driver
controlled LUN in a loop. As part of this, vxconfigd loses its license
information and the following error is seen in the system log:
"License has expired or is not available for operation"
In the vxconfigd code, memory allocated for various data structures related to
the device discovery layer is not freed, which leads to the memory leak.
The memory is now released once it is no longer needed.
2872617
2413763 Uninitialized memory read results in a vxconfigd coredump
vxconfigd, the VxVM configuration daemon, dumps core with the following stack:
ddl_fill_dmp_info
ddl_init_dmp_tree
ddl_fetch_dmp_tree
ddl_find_devices_in_system
find_devices_in_system
mode_set
setup_mode
startup
main
__libc_start_main
_start
A Dynamic Multi-Pathing node buffer declared in the Device Discovery Layer was
not initialized. Since the node buffer is local to the function, an explicit
initialization is required before copying another buffer into it.
The node buffer is now initialized using memset() to address the
coredump.
2872633
2680482 startup scripts use 'quit' instead of 'exit', causing empty directories in /tmp
Many randomly named directories, like vx.$RANDOM.$RANDOM.$RANDOM.$$, are left
uncleaned in /tmp/ on system startup.
On detecting errors, the startup scripts should call quit(), which performs
the cleanup. The scripts were calling exit() directly instead of quit(),
leaving the randomly created directories uncleaned.
The scripts have been restored to call quit() instead of calling exit()
directly.
2872635
2792748 Node join fails because of closing of wrong file descriptor
In an HP-UX cluster environment, the slave join fails with the
following error message in syslog:
VxVM vxconfigd ERROR V-5-1-5784 cluster_establish:kernel interrupted vold on
overlapping reconfig.
During the join, the slave node performs the disk group import.
As part of the import, the file descriptor pertaining to "Port u" is closed
because of a wrong assignment of the return value of open(). Hence, the
subsequent write to the same port returns EBADF.
Code changes have been made to avoid closing the wrong file
descriptor.
2872916
2801962 Growing a volume takes significantly longer when the volume has a
version 20 DCO attached to it.
Operations that grow a volume, including 'vxresize' and 'vxassist
growby/growto', take significantly longer if the volume has a version 20
DCO (Data Change Object) attached to it, in comparison with a volume that
has no DCO attached.
When a volume with a DCO is grown, the existing map in the DCO needs to be
copied and updated to track the grown regions. The algorithm was such that,
for each region in the map, it would search for the page that contains that
region in order to update the map. The number of regions and the number of
pages containing them are proportional to the volume size, so the search cost
is amplified, and is observed primarily when the volume size is of the order
of terabytes. In the reported instance, it took more than 12 minutes to grow
a 2.7TB volume by 50G.
The code has been enhanced to find the regions that are contained within a
page and avoid looking up the page separately for each of those regions.
2913666
2000661 The enhanced noreonline needs to handle the diskgroup rename operation explicitly.
The enhanced noreonline needs to handle the diskgroup rename
operation explicitly.
While renaming a shared diskgroup, the master node updates the
on-disk information to reflect the rename of the diskgroup before the actual
import. With NOREONLINE propagated to the slaves, the slaves do not refresh
the in-core information belonging to the disks involved in the import
operation. So when the import operation on the slaves tries to verify the
(new) diskgroup information sent by the master against the information
available in-core, it does not match.
The shared diskgroup import operation now explicitly re-onlines the
disks selected for import so that the latest information (if updated by the
master) is available to the slave nodes even if NOREONLINE is specified.
2913669
1959513 The shared diskgroup import times increase drastically as more disks are attached to the CVM cluster.
The shared diskgroup import times increase drastically as more disks
are attached to the CVM cluster.
As part of a regular shared diskgroup import, first the master node and
then all the slave nodes re-online all the disks attached to the cluster
that are not part of any imported diskgroup. With more disks attached to
the cluster, the re-online times are higher, which causes the import times
to be high as well. The diskgroup import command supports a '-o noreonline'
option to skip the re-online of disks on the master node; however, the
option is not propagated to the slaves. Hence the import times with this
option specified are still on the higher side.
The option to skip the re-online of disks during import is now propagated to
the slave nodes, so the re-online is skipped on the entire cluster, ensuring
much better shared diskgroup import times.