* * * READ ME * * *
* * * InfoScale 7.1 * * *
* * * Patch 300 * * *

Patch Date: 2019-06-21

This document provides the following information:

   * PATCH NAME
   * OPERATING SYSTEMS SUPPORTED BY THE PATCH
   * PACKAGES AFFECTED BY THE PATCH
   * BASE PRODUCT VERSIONS FOR THE PATCH
   * SUMMARY OF INCIDENTS FIXED BY THE PATCH
   * DETAILS OF INCIDENTS FIXED BY THE PATCH
   * INSTALLATION PRE-REQUISITES
   * INSTALLING THE PATCH
   * REMOVING THE PATCH

PATCH NAME
----------
InfoScale 7.1 Patch 300

OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
RHEL7 x86-64

PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSamf
VRTSaslapm
VRTSdbac
VRTSgab
VRTSglm
VRTSgms
VRTSllt
VRTSodm
VRTSveki
VRTSvxfen
VRTSvxfs
VRTSvxvm

BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
   * InfoScale Availability 7.1
   * InfoScale Enterprise 7.1
   * InfoScale Foundation 7.1
   * InfoScale Storage 7.1

SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSvxvm-7.1.0.4100

* 3880839 (3880936) vxvmconvert hangs on RHEL 6.8.
* 3916822 (3895415) Due to illegal memory access, nodes in the cluster can observe a panic.
* 3921280 (3904538) IO hang during slave node leave or master node switch because of a race between the RV (Replicate Volume) recovery SIO (Staged IO) and new incoming IOs.
* 3966363 (3938549) Volume creation fails with error "Unexpected kernel error in configuration update" on RHEL 7.5.
* 3966379 (3966378) SLES 11 SP4 support.
* 3970407 (3970406) The "vxclustadm nidmap" command fails with an error on RHEL 6.10.
* 3977787 (3921994) Disk group backup fails, and temporary files such as .bslist, .cslist, and .perm are seen in the /var/temp directory.
* 3977848 (3949954) Dumpstack messages are printed when the vxio module is loaded for the first time and blk_register_queue is called.
* 3977852 (3900463) vxassist may fail to create a volume on disks having size in terabytes when the '-o ordered' clause is used along with the 'col_switch' attribute while creating the volume.
* 3977854 (3895950) vxconfigd hang observed due to accessing a stale/un-initialized lock.
* 3977992 (3878153) The VVR 'vradmind' daemon dumps core.
* 3977993 (3932356) vxconfigd dumps core while importing a disk group.
* 3977994 (3889759) Panic observed with the xfs-vxvm-nvme stack configuration on RHEL 7.x.
* 3977995 (3879234) dd read on the Veritas Volume Manager (VxVM) character device fails with an Input/Output error while accessing the end of the device.
* 3977999 (3868154) When DMP Native Support is set to ON, a dmpnode with multiple VGs cannot be listed properly in the 'vxdmpadm native ls' command.
* 3978001 (3919902) The vxdmpadm iopolicy switch command can fail, and standby paths are not honored by some iopolicies.
* 3978010 (3886199) Allow VxVM package installation on EFI-enabled Linux machines.
* 3978012 (3827395) VxVM (Veritas Volume Manager) package upgrade may inconsistently fail to unload the kernel modules of the previous installation in a VVR (Veritas Volume Replicator) configuration.
* 3978186 (3898069) A system panic may happen in the dmp_process_stats routine.
* 3978188 (3878030) Enhance the VxVM DR tool to clean up OS and VxDMP device trees without user interaction.
* 3978199 (3891252) vxconfigd segmentation fault affecting the cluster-related processes.
* 3978200 (3935974) When a client process shuts down abruptly or resets the connection during communication with the vxrsyncd daemon, it may terminate the vxrsyncd daemon.
* 3978217 (3873123) If a disk with a CDS EFI label is used as a remote disk on a cluster node, restarting the vxconfigd daemon on that particular node causes vxconfigd to go into the disabled state.
* 3978219 (3919559) IO hangs after pulling out all cables, when VVR is reconfigured.
* 3978222 (3907034) The mediatype is not shown as ssd in the 'vxdisk -e list' command for SSD (solid state devices) devices.
* 3978223 (3958878) vxrecover dumps core on system reboot.
* 3978225 (3948140) A system panic can occur if the size of the RTPG (Report Target Port Groups) data returned by the underlying array is greater than 255.
* 3978233 (3879324) The VxVM DR tool fails to handle the busy device problem while LUNs are removed from the OS.
* 3978234 (3795622) With Dynamic Multipathing (DMP) Native Support enabled, the Logical Volume Manager (LVM) global_filter is not updated properly in the lvm.conf file.
* 3978307 (3890415) Support growing a mounted ext3/ext4 filesystem on top of a VxVM volume using vxresize.
* 3978309 (3906119) Failback did not happen when the optimal path came back in a cluster environment.
* 3978310 (3921668) The vxrecover command with the -m option fails when executed on slave nodes.
* 3978324 (3754715) When dmp_native_support is enabled, kdump functionality does not work properly.
* 3978325 (3922159) Thin reclamation may fail on XtremIO SSD disks.
* 3978328 (3931936) VxVM (Veritas Volume Manager) commands hang on the master node after restarting a slave node.
* 3978331 (3930312) Short CPU spike while running vxpath_links.
* 3978334 (3932246) The vxrelayout operation fails to complete.
* 3978603 (3925377) Not all disks could be discovered by DMP after first startup.
* 3978612 (3956732) systemd-udevd messages can be seen in journalctl logs.
* 3978613 (3917636) Filesystems from the /etc/fstab file are not mounted automatically on boot through systemd on RHEL7 and SLES12.
* 3978708 (3897047) Filesystems are not mounted automatically on boot through systemd on RHEL7 and SLES12.
* 3978722 (3947265) The delay added in the vxvm-startup script to wait for infiniband devices to be discovered leads to various issues.
* 3980027 (3921572) vxconfigd dumps core during a cable pull test.
* 3980112 (3980535) Warning messages observed while uninstalling the VxVM package.
* 3980561 (3968915) VxVM support on RHEL 7.6.

Patch ID: VRTSvxvm-7.1.0.300

* 3927513 (3925398) VxVM modules failed to load on RHEL7.4.

Patch ID: VRTSvxvm-7.1.0.100

* 3902841 (3902851) VxVM module failed to load on RHEL7.3.

Patch ID: VRTSaslapm-7.1.0.520

* 3980566 (3967122) Retpoline support for the ASLAPM rpm on the RHEL 7.6 retpoline kernel.

Patch ID: VRTSveki-7.1.0.2100

* 3980619 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.

Patch ID: VRTSdbac-7.1.0.4100

* 3980438 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.

Patch ID: VRTSdbac-7.1.0.300

* 3929519 (3925832) The vcsmm module does not load with RHEL7.4.

Patch ID: VRTSdbac-7.1.0.100

* 3901580 (3896877) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

Patch ID: VRTSamf-7.1.0.3100

* 3980437 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.

Patch ID: VRTSamf-7.1.0.200

* 3929360 (3923100) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

Patch ID: VRTSamf-7.1.0.100

* 3899850 (3896877) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

Patch ID: VRTSvxfen-7.1.0.6100

* 3980436 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.
Patch ID: VRTSvxfen-7.1.0.500

* 3928663 (3923100) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

Patch ID: VRTSvxfen-7.1.0.300

* 3901438 (3896877) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

Patch ID: VRTSvxfen-7.1.0.100

* 3879189 (3878333) In SCSI-3 mode, I/O fencing fails to automatically refresh registrations on coordination disks.

Patch ID: VRTSgab-7.1.0.6100

* 3980435 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.

Patch ID: VRTSgab-7.1.0.500

* 3929359 (3923100) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

Patch ID: VRTSgab-7.1.0.300

* 3901437 (3896877) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

Patch ID: VRTSgab-7.1.0.100

* 3893802 (3893801) In some cases, when a node joins a large cluster, a few GAB directed messages are delivered in the wrong order.

Patch ID: VRTSllt-7.1.0.8100

* 3980434 (3967265) Support for RHEL 7.6 and RHEL 7.x RETPOLINE kernels.

Patch ID: VRTSllt-7.1.0.700

* 3929358 (3923100) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

Patch ID: VRTSllt-7.1.0.400

* 3906299 (3905430) Application IO hangs in the case of FSS with LLT over RDMA during heavy data transfer.

Patch ID: VRTSllt-7.1.0.300

* 3898106 (3896877) Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

Patch ID: VRTSgms-7.1.0.2200

* 3977608 (3947812) GMS support for RHEL7.5.

Patch ID: VRTSglm-7.1.0.2200

* 3977609 (3947815) GLM support for RHEL7.5.
* 3978606 (3909553) Use the GAB backenable flow control interface.
* 3978608 (3898145) GLM has a bottleneck on Linux.

Patch ID: VRTSodm-7.1.0.3200

* 3977421 (3958865) ODM module failed to load on RHEL7.6.
* 3977610 (3938546) ODM module failed to load on RHEL7.5.
* 3978714 (3978713) System panic with odm_tsk_exit.

Patch ID: VRTSodm-7.1.0.300

* 3926164 (3923310) ODM module failed to load on RHEL7.4.

Patch ID: VRTSvxfs-7.1.0.4200

* 3977313 (3973440) VxFS mount failed with the error "no security xattr handler" on RHEL 7.6 with SELinux enabled (both permissive and enforcing).
* 3977420 (3958853) VxFS module failed to load on RHEL7.6.
* 3977424 (3959305) Fix a bug in security attribute initialisation of files with named attributes.
* 3977425 (3959299) Improve file creation time on systems with SELinux enabled.
* 3977611 (3938544) VxFS module failed to load on RHEL7.5.
* 3978167 (3922986) Deadlock issue with the buffer cache iodone routine in CFS.
* 3978197 (3894712) ACL permissions are not inherited correctly on a cluster file system.
* 3978228 (3928046) VxFS kernel panic BAD TRAP: type=34 in vx_assemble_hdoffset().
* 3978248 (3927419) System panic due to a race between freeze and unmount.
* 3978340 (3898565) Solaris no longer supports F_SOFTLOCK.
* 3978347 (3896920) Secondary mount could get into a deadlock.
* 3978355 (3914782) Performance drop caused by too many VxFS worker threads.
* 3978357 (3911048) LDH corruption and filesystem hang.
* 3978361 (3905576) CFS hang during a cluster-wide freeze.
* 3978370 (3941942) Unable to handle a kernel NULL pointer dereference while freeing fiostats.
* 3978382 (3812330) Slow 'ls -l' across cluster nodes.
* 3978384 (3909583) Disable partitioning of a directory if the directory size is greater than the upper threshold value.
* 3978484 (3931026) umount of a CFS hangs on RHEL6.6.
* 3978602 (3908954) Some writes could be missed, causing data loss.
* 3978604 (3909553) Use the GAB backenable flow control interface.
* 3978609 (3921152) Performance drop caused by vx_dalloc_flush().

Patch ID: VRTSvxfs-7.1.0.400

* 3926290 (3923307) VxFS module failed to load on RHEL7.4.

Patch ID: VRTSvxfs-7.1.0.100

* 3901320 (3901318) VxFS module failed to load on RHEL7.3.

DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSvxvm-7.1.0.4100

* 3880839 (Tracking ID: 3880936)

SYMPTOM:
vxvmconvert hangs while analysing the LVM Volume Groups for further conversion.

DESCRIPTION:
The issue occurs due to additional messages in the duplicate warnings printed when executing the "pvdisplay" command. The PV UUIDs are not properly extracted because the keyword used to extract them is incorrect.

RESOLUTION:
The code has been modified to extract the PV UUIDs correctly.

* 3916822 (Tracking ID: 3895415)

SYMPTOM:
A panic can occur on the nodes in the cluster all of a sudden. The following stack will be seen as part of the thread list:

bcopy()
cvm_dc_hashtable_clear_udidentry()
vol_test_connectivity()
volconfig_ioctl()
volsioctl_real()
spec_ioctl()
fop_ioctl()
ioctl()
_syscall_no_proc_exit32()

DESCRIPTION:
The code accesses memory that is not allocated to the pointer and copies it to another pointer. Accessing this illegal memory can lead to a sudden panic. The issue occurs when the UDID of the device is less than 377.

RESOLUTION:
The fix is to avoid accessing illegal memory through the pointer in the code.

* 3921280 (Tracking ID: 3904538)

SYMPTOM:
An RV (Replicate Volume) IO hang happens during slave node leave or master node switch.

DESCRIPTION:
The RV IO hang happens because the SRL (Storage Replicator Log) header is updated by the RV recovery SIO. After a slave node leave or master node switch, RV recovery can be initiated. During RV recovery, all new incoming IOs should be quiesced by setting the NEED RECOVERY flag on the RV to avoid racing. Due to a code defect, this flag is removed by transaction commit, resulting in a conflict between new IOs and the RV recovery SIO.

RESOLUTION:
Code changes have been made to fix this issue.

* 3966363 (Tracking ID: 3938549)

SYMPTOM:
The command 'vxassist -g make vol' fails with the error "Unexpected kernel error in configuration update" on RHEL 7.5.

DESCRIPTION:
Due to changes in the RHEL 7.5 source code, the vxassist make volume command failed to create a volume and returned the error "Unexpected kernel error in configuration update".

RESOLUTION:
Changes are done in the VxVM code to solve the issue for volume creation.

* 3966379 (Tracking ID: 3966378)

SYMPTOM:
Volume creation fails on SLES 11 SP4.

DESCRIPTION:
The elevator definition is needed to support SLES 11 SP4.

RESOLUTION:
The code is modified to support SLES 11 SP4.

* 3970407 (Tracking ID: 3970406)

SYMPTOM:
During CVM testing, the below error is observed:

[root@vmr720-34-vm9 ~]# vxclustadm nidmap
VxVM vxclustadm ERROR V-5-1-9675 Error doing ioctl: clusterinfo

DESCRIPTION:
The error occurs because the command structure sent to the underlying IOCTL was not initialized, which leads to a garbage value in the IOCTL argument and a failure while allocating memory for the IOCTL argument.

RESOLUTION:
The command structure sent to the underlying IOCTL is now initialized.
Without the fix:

[root@vmr720-34-vm9 ~]# vxclustadm nidmap
VxVM vxclustadm ERROR V-5-1-9675 Error doing ioctl: clusterinfo

With the fix:

[root@vmr720-34-vm9 ~]# /root/vxclustadm nidmap
Name              CVM Nid    CM Nid    State
vmr720-34-vm10    0          1         Joined: Master
vmr720-34-vm9     1          0         Joined:

* 3977787 (Tracking ID: 3921994)

SYMPTOM:
Temporary files such as .bslist, .cslist, and .perm are seen in the /var/temp directory.

DESCRIPTION:
When ADD and REMOVE operations on the disks of a disk group are done in the interval between two backups, the next backup of the same disk group fails, which is why the files are left behind in the /var/temp directory.

RESOLUTION:
Corrected the syntax errors in the code to handle the vxconfigbackup issue.

* 3977848 (Tracking ID: 3949954)

SYMPTOM:
Dumpstack messages are printed when the vxio module is loaded for the first time and blk_register_queue is called.

DESCRIPTION:
In RHEL 7.5, a new check was added in the kernel code in blk_register_queue: if QUEUE_FLAG_REGISTERED is already set on the queue, a dumpstack warning message is printed. In VxVM the flag was already set, because the flag got copied from the device queue that had earlier been registered by the OS.

RESOLUTION:
Changes are done in the VxVM code to avoid copying QUEUE_FLAG_REGISTERED, which fixes the dumpstack warnings.

* 3977852 (Tracking ID: 3900463)

SYMPTOM:
vxassist may fail to create a volume on disks having size in terabytes when the '-o ordered' clause is used along with the 'col_switch' attribute while creating the volume. The following error may be reported:

VxVM vxvol ERROR V-5-1-1195 Volume has more than one associated sparse plex but no complete plex

DESCRIPTION:
The problem is seen especially when the user attempts to create a volume on large-sized disks using the '-o ordered' option along with the 'col_switch' attribute. The error reports the plex to be sparse because the plex length is calculated incorrectly in the code, due to an integer overflow of the variable which handles the col_switch attribute.

RESOLUTION:
The code is fixed to avoid the integer overflow.

* 3977854 (Tracking ID: 3895950)

SYMPTOM:
A vxconfigd hang may be observed all of a sudden. The following stack will be seen as part of the thread list:

slock()
.disable_lock()
volopen_isopen_cluster()
vol_get_dev()
volconfig_ioctl()
volsioctl_real()
volsioctl()
vols_ioctl()
rdevioctl()
spec_ioctl()
vnop_ioctl()
vno_ioctl()
common_ioctl(??, ??, ??, ??)

DESCRIPTION:
Some of the critical structures in the code are protected with a lock to avoid simultaneous modification. A particular lock structure gets copied to local stack memory. The structure may contain information about the state of the lock, and at the time of the copy the lock structure might be in an intermediate state. When a function tries to access such a lock structure, the result can be a panic or hang, since the lock structure might be in an unknown state.

RESOLUTION:
Since nothing can modify the local copy of the structure once it has been made, acquiring the lock is not required while accessing this copy.

* 3977992 (Tracking ID: 3878153)

SYMPTOM:
The VVR (Veritas Volume Replicator) 'vradmind' daemon dumps core with the following stack:
#0  __kernel_vsyscall ()
#1  raise () from /lib/libc.so.6
#2  abort () from /lib/libc.so.6
#3  __libc_message () from /lib/libc.so.6
#4  malloc_printerr () from /lib/libc.so.6
#5  _int_free () from /lib/libc.so.6
#6  free () from /lib/libc.so.6
#7  operator delete(void*) () from /usr/lib/libstdc++.so.6
#8  operator delete[](void*) () from /usr/lib/libstdc++.so.6
#9  in IpmHandle::~IpmHandle (this=0x838a1d8, __in_chrg=) at Ipm.C:2946
#10 IpmHandle::events (handlesp=0x838ee80, vlistsp=0x838e5b0, ms=100) at Ipm.C:644
#11 main (argc=1, argv=0xffffd3d4) at srvmd.C:703

DESCRIPTION:
Under certain circumstances the 'vradmind' daemon may dump core while freeing a variable allocated on the stack.

RESOLUTION:
A code change has been made to address the issue.

* 3977993 (Tracking ID: 3932356)

SYMPTOM:
In a two-node cluster, vxconfigd dumps core while importing the DG:

dapriv_da_alloc ()
in setup_remote_disks ()
in volasym_remote_getrecs ()
req_dg_import ()
vold_process_request ()
start_thread () from /lib64/libpthread.so.0
from /lib64/libc.so.6

DESCRIPTION:
vxconfigd dumps core due to an address alignment issue.

RESOLUTION:
The alignment issue is fixed.

* 3977994 (Tracking ID: 3889759)

SYMPTOM:
On a hypervisor with an xfs-vxvm-nvme stack on RHEL 7.x, that is, an XFS file system on a VxVM volume carved out of NVMe disks, performing any operation on that virtual machine may lead to a panic with the stack shown below:

nvme_queue_rq+0x5f8/0x7e0
blk_mq_make_request+0x213/0x400
generic_make_request+0xe2/0x130
vxvm_submit_diskio+0x3fc/0x7a0
vxvm_get_bio_vec_from_req+0xbb/0xf0
voldmp_strategy+0x2b/0x30
vol_dev_strategy+0x1b/0x30
volfpdiskiostart+0x135/0x290
dequeue_entity+0x106/0x510
volkiostart+0x521/0x14b0
vol_cachealloc+0x4c/0x70
vxvm_process_request_queue+0xfb/0x1c0
voliod_loop+0x67a/0x7f0
finish_task_switch+0x108/0x170
__schedule+0x2d8/0x900
voliod_iohandle+0xd0/0xd0
voliod_iohandle+0xd0/0xd0
kthread+0xcf/0xe0
kthread_create_on_node+0x140/0x140
ret_from_fork+0x58/0x90
kthread_create_on_node+0x140/0x140

DESCRIPTION:
A new flag, QUEUE_FLAG_SG_GAPS, was added in the RHEL 7.1 and 7.2 kernels. This flag is not set in the VxVM driver request queue, so XFS sends VxVM buffers with such I/O requests. The nvme driver, on the other hand, sets the flag to indicate that it cannot handle such I/O requests, but VxVM passes down the I/O requests it receives from XFS. So the panic is seen with the xfs-vxvm-nvme stack.

RESOLUTION:
Code changes are done to set this flag in the VxVM request queue, to indicate that VxVM cannot handle such I/O requests.

* 3977995 (Tracking ID: 3879234)

SYMPTOM:
A dd read on the Veritas Volume Manager (VxVM) character device fails with an Input/Output error while accessing the end of the device, like below:

[root@dn pmansukh_debug]# dd if=/dev/vx/rdsk/hfdg/vol1 of=/dev/null bs=65K
dd: reading `/dev/vx/rdsk/hfdg/vol1': Input/output error
15801+0 records in
15801+0 records out
1051714560 bytes (1.1 GB) copied, 3.96065 s, 266 MB/s

DESCRIPTION:
The issue occurs because of changes in the Linux API generic_file_aio_read, which no longer properly handles end-of-device reads/writes. The Linux code has moved to blkdev_aio_read, which is a GPL symbol and hence cannot be used.

RESOLUTION:
Made changes in the code to handle end-of-device reads/writes properly.

* 3977999 (Tracking ID: 3868154)

SYMPTOM:
When DMP Native Support is set to ON and a dmpnode has multiple VGs, 'vxdmpadm native ls' shows incorrect VG entries for dmpnodes.
DESCRIPTION:
When DMP Native Support is set to ON, multiple VGs can be created on a disk, as Linux supports creating a VG on a whole disk as well as on a partition of a disk. This possibility was not handled in the code, hence the 'vxdmpadm native ls' display was incorrect.

RESOLUTION:
The code now handles the situation of multiple VGs on a single disk.

* 3978001 (Tracking ID: 3919902)

SYMPTOM:
The VxVM (Veritas Volume Manager) vxdmpadm iopolicy switch command may not work. When the issue happens, vxdmpadm setattr iopolicy finishes without any error, but a subsequent vxdmpadm getattr command shows that the iopolicy has not been updated:

# vxdmpadm getattr enclosure emc-vplex0 iopolicy
ENCLR_NAME     DEFAULT        CURRENT
============================================
emc-vplex0     Balanced       Balanced
# vxdmpadm setattr arraytype VPLEX-A/A iopolicy=minimumq
# vxdmpadm getattr enclosure emc-vplex0 iopolicy
ENCLR_NAME     DEFAULT        CURRENT
============================================
emc-vplex0     Balanced       Balanced

Also, standby paths are not honored by some iopolicies (for example, the balanced iopolicy). Read/write IOs are seen against standby paths in the vxdmpadm iostat command.

DESCRIPTION:
The array's iopolicy field becomes stale when the 'vxdmpadm setattr arraytype iopolicy' command is used; hence, when the iopolicy is later set back to the stale value, the change does not actually take effect. Also, when paths are evaluated for issuing IOs, the standby flag is not taken into consideration, hence standby paths are used for read/write IOs.

RESOLUTION:
Code changes have been done to address these issues.

* 3978010 (Tracking ID: 3886199)

SYMPTOM:
VxVM package installation fails when the Linux server has EFI support enabled.

DESCRIPTION:
On Linux, the VxVM install scripts assume a GRUB bootloader in BIOS mode and try to locate the corresponding grub config file. When the system has a GRUB bootloader in EFI mode, VxVM fails to locate the required grub config file, so the installation is aborted.

RESOLUTION:
Code changes are added to allow VxVM installation on Linux machines where EFI support is enabled.

* 3978012 (Tracking ID: 3827395)

SYMPTOM:
A VxVM (Veritas Volume Manager) package upgrade may inconsistently fail to unload the kernel modules of the previous installation in a VVR (Veritas Volume Replicator) configuration, with the following error:

modprobe: FATAL: Module vxspec is in use.

DESCRIPTION:
During the rpm upgrade, the kernel modules of the previously installed packages are unloaded. Since the vxvvrstatd daemon was not being stopped during the upgrade, the unload of the vxspec module could fail.

RESOLUTION:
Code changes are done to stop the vxvvrstatd daemon during the rpm upgrade.

* 3978186 (Tracking ID: 3898069)

SYMPTOM:
A system panic may happen in the dmp_process_stats routine with the following stack:

dmp_process_stats+0x471/0x7b0
dmp_daemons_loop+0x247/0x550
kthread+0xb4/0xc0
ret_from_fork+0x58/0x90

DESCRIPTION:
While aggregating the pending IOs per DMP path over all CPUs, an out-of-bounds access happened due to a wrong index into the statistics table, which could cause a system panic.

RESOLUTION:
Code changes have been done to correct the wrong index.

* 3978188 (Tracking ID: 3878030)

SYMPTOM:
Enhance the VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool to clean up the OS and VxDMP (Veritas Dynamic Multi-Pathing) device trees without user interaction.

DESCRIPTION:
When users add or remove LUNs, stale entries in the OS or VxDMP device trees can prevent VxVM from discovering the changed LUNs correctly. It can even cause the VxVM vxconfigd process to dump core under certain conditions, and users then have to reboot the system to let vxconfigd restart.
VxVM provides a DR tool to help users add or remove LUNs properly, but it requires user input during operations.

RESOLUTION:
An enhancement has been made to the VxVM DR tool. It accepts an '-o refresh' option to clean up the OS and VxDMP device trees without user interaction.

* 3978199 (Tracking ID: 3891252)

SYMPTOM:
vxconfigd dumps core, and the below message is seen in syslog:

vxconfigd[8121]: segfault at 10 ip 00000000004b85c1 sp 00007ffe98d64c50 error 4 in vxconfigd[400000+337000]

The vxconfigd core dump is seen with the following stack:

#0 0x00000000004b85c1 in dbcopy_open ()
#1 0x00000000004cad6f in dg_config_read ()
#2 0x000000000052cf15 in req_dg_get_dalist ()
#3 0x00000000005305f0 in request_loop ()
#4 0x000000000044f606 in main ()

DESCRIPTION:
The issue is seen because a NULL pointer for the configuration copy was not handled while opening the configuration copy, leading to the vxconfigd segmentation fault.

RESOLUTION:
The code has been modified to handle the scenario where a NULL pointer for the configuration copy is used for opening the copy.

* 3978200 (Tracking ID: 3935974)

SYMPTOM:
While communicating with a client process, the vxrsyncd daemon terminates; after some time it gets restarted, or may require a reboot to start.

DESCRIPTION:
When the client process shuts down abruptly and the vxrsyncd daemon attempts to write on the client socket, a SIGPIPE signal is generated. The default action for this signal is to terminate the process; hence vxrsyncd gets terminated.

RESOLUTION:
The SIGPIPE signal is now handled in order to prevent the termination of vxrsyncd.

* 3978217 (Tracking ID: 3873123)

SYMPTOM:
When a remote disk on a node is an EFI disk, vold enable fails, the following message gets logged, and eventually vxconfigd goes into the disabled state:

Kernel and on-disk configurations don't match; transactions are disabled.

DESCRIPTION:
This happens because one of the cases of an EFI remote disk is not properly handled in the disk recovery part when vxconfigd is enabled.

RESOLUTION:
Code changes have been done to set the EFI flag on darec in the recovery code.

* 3978219 (Tracking ID: 3919559)

SYMPTOM:
IO hangs after pulling out all cables, when VVR (Veritas Volume Replicator) is reconfigured.

DESCRIPTION:
When VVR is configured and the SRL (Storage Replicator Log) batch feature is enabled, then after pulling out all cables, if more than one IO gets queued in VVR before a header error, due to a bug in VVR at least one IO is not handled; hence the issue.

RESOLUTION:
The code has been modified so that every queued IO in VVR is handled properly.

* 3978222 (Tracking ID: 3907034)

SYMPTOM:
The mediatype is not shown as ssd in the 'vxdisk -e list' command for SSD (solid state devices) devices.

DESCRIPTION:
Some SSD devices do not have an ASL (Array Support Library) to claim them and are claimed as JBOD (Just a Bunch Of Disks). In this case, since there is no ASL, the attributes of the device, such as mediatype, are not known. This is the reason the mediatype is not shown in the 'vxdisk -e list' output.

RESOLUTION:
The code now checks the value stored in the file /sys/block/<device>/queue/rotational, which signifies whether the device is an SSD, to detect the mediatype. A sketch of this check is shown after the next incident.

* 3978223 (Tracking ID: 3958878)

SYMPTOM:
vxrecover dumps core on system reboot with the following stack:

strncpy()
dg_set_current_id()
dgreconnect()
main()

DESCRIPTION:
While booting the system, vxrecover dumps core due to accessing an uninitialized disk group record.

RESOLUTION:
Changes are done in vxrecover to fix the core dump.
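The mediatype detection described in incident 3978222 above relies on the Linux sysfs "rotational" attribute. The following is an illustrative command-line check only, not the VxVM code itself; the device name sda is a placeholder:

# cat /sys/block/sda/queue/rotational
0

A value of 0 means the kernel considers the device non-rotating (an SSD), and 1 means a rotating disk; this is how a JBOD-claimed device with no ASL can still be reported with mediatype ssd.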
* 3978225 (Tracking ID: 3948140)

SYMPTOM:
The system may panic if the RTPG (Report Target Port Groups) data returned by the array is greater than 255 bytes, with the below stack:

dmp_alua_get_owner_state()
dmp_alua_get_path_state()
dmp_get_path_state()
dmp_check_path_state()
dmp_restore_callback()
dmp_process_scsireq()
dmp_daemons_loop()

DESCRIPTION:
The size of the buffer given to the RTPG SCSI command is currently 255 bytes, but the size of the data returned by the underlying array for RTPG can be greater than 255 bytes. As a result, incomplete data is retrieved (only the first 255 bytes), and trying to read the RTPG data causes invalid access of memory, resulting in an error while claiming the devices. This invalid access of memory may lead to a system panic.

RESOLUTION:
The RTPG buffer size has been increased to 1024 bytes to handle this.

* 3978233 (Tracking ID: 3879324)

SYMPTOM:
The VxVM (Veritas Volume Manager) DR (Dynamic Reconfiguration) tool fails to handle the busy device problem while LUNs are removed from the OS.

DESCRIPTION:
OS devices may still be busy after removing them from the OS; this fails the 'luxadm -e offline' operation and leaves stale entries in the 'vxdisk list' output, like:

emc0_65535   auto   -   -   error
emc0_65536   auto   -   -   error

RESOLUTION:
Code changes have been done to address the busy device issue.

* 3978234 (Tracking ID: 3795622)

SYMPTOM:
With Dynamic Multi-Pathing (DMP) Native Support enabled, the LVM global_filter is not updated properly in the lvm.conf file to reject newly added paths.

DESCRIPTION:
With DMP Native Support enabled, when new paths are added to existing LUNs, the LVM global_filter is not updated properly in the lvm.conf file to reject the newly added paths. This can lead to "duplicate PV (physical volume) found" errors reported by LVM commands.

RESOLUTION:
The code is modified to properly update the global_filter field in the lvm.conf file when new paths are added to existing disks.

* 3978307 (Tracking ID: 3890415)

SYMPTOM:
Support growing a mounted ext3/ext4 filesystem on top of a VxVM volume using vxresize.

DESCRIPTION:
Currently, 'vxresize' allows the user to grow an ext file system on top of a VxVM volume only if the FS is unmounted. A 'resize2fs' utility is provided by the operating system which supports growing a mounted ext3/ext4 FS online for some kernel versions. Hence, code changes are implemented to grow the mounted FS using the resize2fs utility. Note that online resize of file systems other than VxFS relies upon third-party software; the OS guide for performing the operations should be consulted, and VRTS will NOT be held responsible for any corruption or denial of service that could occur due to the use of third-party software.

RESOLUTION:
Implemented code changes to allow growing a mounted ext3/ext4 file system on top of a VxVM volume.

* 3978309 (Tracking ID: 3906119)

SYMPTOM:
In a CVM (Cluster Volume Manager) environment, failback did not happen when the optimal path came back.

DESCRIPTION:
An ALUA (Asymmetric Logical Unit Access) array supports both implicit and explicit asymmetric logical unit access management methods. In a CVM environment, DMP (Dynamic Multi-Pathing) failed to start failback for an implicit-ALUA-only mode array; hence the issue.

RESOLUTION:
Code changes are added to handle this case for implicit-ALUA-only mode arrays.

* 3978310 (Tracking ID: 3921668)

SYMPTOM:
Running the vxrecover command with the -m option fails when run on a slave node, with the message "The command can be executed only on the master."
DESCRIPTION:
The issue occurs because the vxrecover -g -m command on shared disk groups is currently not shipped from the CVM (Cluster Volume Manager) slave node to the master node using the command shipping framework.

RESOLUTION:
Implemented a code change to ship the vxrecover -m command to the master node when it is triggered from a slave node.

* 3978324 (Tracking ID: 3754715)

SYMPTOM:
When dmp_native_support is enabled and kdump is triggered, the system hangs while collecting the crash dump.

DESCRIPTION:
When kdump is triggered with native support enabled, the issue occurs while booting into the kdump kernel using the kdump initrd. Since the kdump kernel has limited memory, loading the VxVM modules into the kdump kernel causes the system to hang because of memory allocation failures.

RESOLUTION:
The VxVM modules are added to the blacklist in the kdump.conf file, which prevents them from loading when kdump is triggered.

* 3978325 (Tracking ID: 3922159)

SYMPTOM:
Thin reclamation may fail on XtremIO SSD disks with the following error:

Reclaiming storage on:
Disk : Failed.
Failed to reclaim .

DESCRIPTION:
VxVM (Veritas Volume Manager) uses the thin-reclamation method to reclaim space on XtremIO SSD disks. Some SSD arrays use the TRIM method for reclamation instead of thin reclamation. A condition in the code which checks whether TRIM is supported was incorrect, and it was leading to reclaim failures on XtremIO disks.

RESOLUTION:
Corrected the condition in the code which checks whether the TRIM method is supported for reclamation. (An illustrative sysfs check appears after incident 3978334 below.)

* 3978328 (Tracking ID: 3931936)

SYMPTOM:
In an FSS (Flexible Storage Sharing) environment, after restarting a slave node, a VxVM command hang on the master node results in failed disks on the slave node being unable to rejoin the disk group.

DESCRIPTION:
When lost remote disks on the slave node come back, the operations to online these disks and add them to the disk group are performed on the master node. Disk online includes operations on both the master and the slave node. On the slave node these disks should be offlined and then re-onlined, but due to a code defect the re-online of the disks is missed, leaving the disks stuck in the re-onlining state. The subsequent add-disk-to-disk-group operation needs to issue private region IOs on the disk. These IOs are shipped to the slave node to complete. As the disks are in the re-onlining state, a busy error is returned and the remote IOs keep retrying; hence the VxVM command hang on the master node.

RESOLUTION:
Code changes have been made to fix the issue.

* 3978331 (Tracking ID: 3930312)

SYMPTOM:
Short CPU spike while running vxpath_links.

DESCRIPTION:
A UDEV event triggers "vxpath_links", which uses "ls -l" to find a specific SCSI target. The CPU time consumed by "ls -l" can be high, depending on the number of paths. Hence the issue.

RESOLUTION:
The code is modified to reduce the time consumed in "vxpath_links".

* 3978334 (Tracking ID: 3932246)

SYMPTOM:
The vxrelayout operation fails to complete.

DESCRIPTION:
If connectivity to the underlying storage is lost while a volume relayout is in progress, some intermediate volumes for the relayout could be left in a disabled or undesirable state due to I/O errors. Once the storage connectivity is back, such intermediate volumes should be recovered by the vxrecover utility and the vxrelayout operation should resume automatically. But due to a bug in the vxrecover utility, the volumes remained in the disabled state, due to which the vxrelayout operation did not complete.

RESOLUTION:
Changes are done in the vxrecover utility to enable the intermediate volumes.
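For incident 3978325 above, whether the Linux block layer reports TRIM (discard) capability for a device can be inspected from sysfs. This is an illustrative check only, not the exact condition VxVM evaluates internally; sdc is a placeholder device name:

# cat /sys/block/sdc/queue/discard_max_bytes
8388608

A non-zero value means the kernel accepts discard (TRIM) requests for the device; 0 means it does not, in which case only array-side thin reclamation applies.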
* 3978603 (Tracking ID: 3925377)

SYMPTOM:
Not all disks could be discovered by Dynamic Multi-Pathing (DMP) after first startup.

DESCRIPTION:
DMP is started too early in the boot process if iSCSI and raw have not been installed. At that point the FC devices are not yet recognized by the OS, hence DMP misses the FC devices.

RESOLUTION:
The code is modified to make sure DMP gets started after OS disk discovery.

* 3978612 (Tracking ID: 3956732)

SYMPTOM:
systemd-udevd messages like the below can be seen in journalctl logs:

systemd-udevd[7506]: inotify_add_watch(7, /dev/VxDMP8, 10) failed: No such file or directory
systemd-udevd[7511]: inotify_add_watch(7, /dev/VxDMP9, 10) failed: No such file or directory

DESCRIPTION:
When changes are made to the underlying VxDMP device, these messages get displayed in the journalctl logs. The reason for the messages is that the change event of the VxDMP device was not handled in our UDEV rule.

RESOLUTION:
Code changes have been done to handle the change event of the VxDMP device in our UDEV rule.

* 3978613 (Tracking ID: 3917636)

SYMPTOM:
Filesystems from the /etc/fstab file are not mounted automatically on boot through systemd on RHEL7 and SLES12.

DESCRIPTION:
During boot, when systemd tries to mount using the devices mentioned in the /etc/fstab file, the devices are not accessible, leading to the failure of the mount operation. As device discovery happens through the udev infrastructure, the udev rules for those devices need to be run when volumes are created, so that the devices get registered with systemd. In this case, the udev rules are executed even before the devices in the /dev/vx/dsk directory are created. Since the devices are not yet created, they are not registered with systemd, leading to the failure of the mount operation.

RESOLUTION:
Run "udevadm trigger" to execute all the udev rules once all volumes are created, so that the devices are registered.

* 3978708 (Tracking ID: 3897047)

SYMPTOM:
Filesystems are not mounted automatically on boot through systemd on RHEL7 and SLES12.

DESCRIPTION:
When the systemd service tries to start all the filesystems in /etc/fstab, the Veritas Volume Manager (VxVM) volumes are not yet started, since vxconfigd is still not up. The VxVM volumes are started a little later in the boot process. Since the volumes are not available, the filesystems are not mounted automatically at boot.

RESOLUTION:
Registered the VxVM volumes with the UDEV daemon of Linux, so that the filesystems are mounted when the VxVM volumes are started and discovered by udev.

* 3978722 (Tracking ID: 3947265)

SYMPTOM:
vxfen tends to fail and creates split-brain issues.

DESCRIPTION:
Currently, to check whether infiniband devices are present, we check for certain modules, which on RHEL 7.4 are present by default.

RESOLUTION:
To check for infiniband devices, we now check for the /sys/class/infiniband directory, in which the device information is populated if infiniband devices are present.

* 3980027 (Tracking ID: 3921572)

SYMPTOM:
vxconfigd dumps core when an array is disconnected.

DESCRIPTION:
In a configuration where a disk group has disks from more than one array, vxconfigd dumps core when an array is disconnected and is followed by a command which attempts to get the details of all the disks of the disk group. Once the array is disconnected, vxconfigd removes all the Disk Access (DA) records. While servicing the command which needs the details of the disks in the DG, vxconfigd goes through the DA list. The code which services the command has a defect that causes the core.
RESOLUTION:
The code is rectified to ignore the NULL records, to avoid the core.

* 3980112 (Tracking ID: 3980535)

SYMPTOM:
While uninstalling the VxVM package, the below warning messages are observed:

Failed to execute operation: No such file or directory
Failed to execute operation: No such file or directory

DESCRIPTION:
During VxVM (Veritas Volume Manager) package uninstallation, the code was trying to stop some services whose service files had already been removed earlier, which was causing these warning messages.

RESOLUTION:
The package uninstallation code is corrected to avoid these warning messages.

* 3980561 (Tracking ID: 3968915)

SYMPTOM:
VxVM support on RHEL 7.6.

DESCRIPTION:
RHEL 7.6 is a new release, hence the VxVM module is compiled with the RHEL 7.6 kernel.

RESOLUTION:
Compiled VxVM with the RHEL 7.6 kernel bits.

Patch ID: VRTSvxvm-7.1.0.300

* 3927513 (Tracking ID: 3925398)

SYMPTOM:
VxVM modules failed to load on RHEL7.4.

DESCRIPTION:
Since RHEL7.4 is a new release, the VxVM modules failed to load on it.

RESOLUTION:
VxVM has been re-compiled with the RHEL 7.4 build environment.

Patch ID: VRTSvxvm-7.1.0.100

* 3902841 (Tracking ID: 3902851)

SYMPTOM:
VxVM module failed to load on RHEL7.3.

DESCRIPTION:
Since RHEL7.3 is a new release, the VxVM module failed to load on it.

RESOLUTION:
Added VxVM support for RHEL7.3.

Patch ID: VRTSaslapm-7.1.0.520

* 3980566 (Tracking ID: 3967122)

SYMPTOM:
Retpoline support for ASLAPM on RHEL 7.6 kernels.

DESCRIPTION:
RHEL 7.6 is a new release and it has a retpoline kernel. The APM module should be recompiled with retpoline-aware GCC to support the retpoline kernel.

RESOLUTION:
Compiled the APM with retpoline GCC.

Patch ID: VRTSveki-7.1.0.2100

* 3980619 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSdbac-7.1.0.4100

* 3980438 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSdbac-7.1.0.300

* 3929519 (Tracking ID: 3925832)

SYMPTOM:
The vcsmm module does not load with RHEL7.4.

DESCRIPTION:
Since RHEL7.4 is a new release, the vcsmm module failed to load on it.

RESOLUTION:
The VRTSdbac package is re-compiled with the RHEL7.4 kernel (3.10.0-693.el7.x86_64) in the build environment to mitigate the failure.

Patch ID: VRTSdbac-7.1.0.100

* 3901580 (Tracking ID: 3896877)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 2.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 3 (RHEL7.3) is now introduced.
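Several entries in this patch (Tracking ID 3967265) describe recompiling kernel modules with RETPOLINE-aware GCC. Whether a given RHEL 7.x kernel is a RETPOLINE kernel can be verified from its build configuration; this is an illustrative check, not part of the patch itself:

# grep RETPOLINE /boot/config-$(uname -r)
CONFIG_RETPOLINE=y

If CONFIG_RETPOLINE=y is set, modules for that kernel must be built with a compiler that supports the retpoline options (for example, GCC's -mindirect-branch=thunk-extern), which is what the recompiled packages in this patch provide.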
Patch ID: VRTSamf-7.1.0.3100

* 3980437 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSamf-7.1.0.200

* 3929360 (Tracking ID: 3923100)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 3.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 4 (RHEL7.4) is now introduced.

Patch ID: VRTSamf-7.1.0.100

* 3899850 (Tracking ID: 3896877)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

DESCRIPTION:
Veritas Infoscale Availability does not support RHEL versions later than RHEL7 Update 2.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 3 (RHEL7.3) is now introduced.

Patch ID: VRTSvxfen-7.1.0.6100

* 3980436 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSvxfen-7.1.0.500

* 3928663 (Tracking ID: 3923100)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 3.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 4 (RHEL7.4) is now introduced.

Patch ID: VRTSvxfen-7.1.0.300

* 3901438 (Tracking ID: 3896877)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

DESCRIPTION:
Veritas Infoscale Availability does not support RHEL versions later than RHEL7 Update 2.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 3 (RHEL7.3) is now introduced.

Patch ID: VRTSvxfen-7.1.0.100

* 3879189 (Tracking ID: 3878333)

SYMPTOM:
With I/O fencing in SCSI-3 mode, if you enable the auto-refresh feature on the CoordPoint agent, I/O fencing fails to automatically refresh the registrations on the coordination points. This results in the vxfen group remaining in a faulted state.

DESCRIPTION:
Even to refresh registrations on coordination disks, the vxfenswap script expects input about the coordination disk groups from the CoordPoint agent. The CoordPoint agent does not provide this information to the vxfenswap script. This causes the vxfenswap script to fail, and the vxfen group remains faulted until it is manually cleared.

RESOLUTION:
The I/O fencing program is modified such that, in SCSI-3 mode fencing, the vxfenswap script does not explicitly require the coordination disk group information for refreshing the registration keys on the disks.
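When verifying a registration refresh of the kind incident 3879189 describes, the SCSI-3 keys currently registered on the coordinator disks can be read back with vxfenadm. An illustrative check; the coordinator disk names are taken from /etc/vxfentab, and the output (omitted here) lists the keys per disk:

# vxfenadm -s all -f /etc/vxfentab

Each coordinator disk should show one registered key per cluster node; missing keys are the condition that the auto-refresh feature is intended to repair.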
Patch ID: VRTSgab-7.1.0.6100

* 3980435 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSgab-7.1.0.500

* 3929359 (Tracking ID: 3923100)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 3.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 4 (RHEL7.4) is now introduced.

Patch ID: VRTSgab-7.1.0.300

* 3901437 (Tracking ID: 3896877)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

DESCRIPTION:
Veritas Infoscale Availability does not support RHEL versions later than RHEL7 Update 2.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 3 (RHEL7.3) is now introduced.

Patch ID: VRTSgab-7.1.0.100

* 3893802 (Tracking ID: 3893801)

SYMPTOM:
In large clusters, a node join for a GAB port gets stuck.

DESCRIPTION:
When a node joins a cluster, a few GAB directed messages might be delivered to the GAB clients in the wrong order. This incorrect message order is due to a bug in the code where the messages are added to the GAB delivery queue in the incorrect order.

RESOLUTION:
The program is modified to ensure that, during a node join scenario, the GAB directed messages are inserted into the various GAB queues in the correct order.

Patch ID: VRTSllt-7.1.0.8100

* 3980434 (Tracking ID: 3967265)

SYMPTOM:
RHEL 7.x RETPOLINE kernels and RHEL 7.6 are not supported.

DESCRIPTION:
Red Hat has released RHEL 7.6, which has a RETPOLINE kernel, and has also released RETPOLINE kernels for older RHEL 7.x updates. Veritas Cluster Server kernel modules need to be recompiled with RETPOLINE-aware GCC to support RETPOLINE kernels.

RESOLUTION:
Support for RHEL 7.6 and RETPOLINE kernels on RHEL 7.x is now introduced.

Patch ID: VRTSllt-7.1.0.700

* 3929358 (Tracking ID: 3923100)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 4 (RHEL7.4).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 3.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 4 (RHEL7.4) is now introduced.

Patch ID: VRTSllt-7.1.0.400

* 3906299 (Tracking ID: 3905430)

SYMPTOM:
Application IO hangs in the case of FSS with LLT over RDMA during heavy data transfer.

DESCRIPTION:
In the case of FSS using LLT over RDMA, IO may sometimes hang because of race conditions in the LLT code.

RESOLUTION:
The LLT module is modified to fix the race conditions arising due to heavy load with multiple application threads.

Patch ID: VRTSllt-7.1.0.300

* 3898106 (Tracking ID: 3896877)

SYMPTOM:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux 7 Update 3 (RHEL7.3).

DESCRIPTION:
Veritas Infoscale Availability does not support Red Hat Enterprise Linux versions later than RHEL7 Update 2.

RESOLUTION:
Veritas Infoscale Availability support for Red Hat Enterprise Linux 7 Update 3 (RHEL7.3) is now introduced.
Patch ID: VRTSgms-7.1.0.2200

* 3977608 (Tracking ID: 3947812)

SYMPTOM:
GMS support for RHEL7.5.

DESCRIPTION:
RHEL7.5 is a new release and it has a retpoline kernel, so the GMS module should be recompiled with retpoline-aware GCC.

RESOLUTION:
Compiled GMS with retpoline GCC for RHEL7.5 support.

Patch ID: VRTSglm-7.1.0.2200

* 3977609 (Tracking ID: 3947815)

SYMPTOM:
GLM support for RHEL7.5.

DESCRIPTION:
RHEL7.5 is a new release and it has a retpoline kernel, so the GLM module should be recompiled with retpoline-aware GCC.

RESOLUTION:
Compiled GLM with retpoline GCC for RHEL7.5 support.

* 3978606 (Tracking ID: 3909553)

SYMPTOM:
Use the GAB backenable flow control interface.

DESCRIPTION:
VxFS/GLM used to sleep for a fixed time duration before retrying if sending a message over GAB failed. GAB provides an interface named backenable which notifies when it is okay to send messages again over GAB. This avoids unnecessary sleeping in VxFS/GLM.

RESOLUTION:
The code is modified to use the GAB backenable flow control interface.

* 3978608 (Tracking ID: 3898145)

SYMPTOM:
GLM on Linux enqueues the outgoing messages to a single thread.

DESCRIPTION:
This was done to prevent stack overflows. However, it becomes a bottleneck and aggravates scenarios which require GLM locks, like directory contention, FCL merge, etc. in CFS.

RESOLUTION:
Newer Linux kernels support a 16K stack size, so the outgoing messages need not be enqueued to a single thread.

Patch ID: VRTSodm-7.1.0.3200

* 3977421 (Tracking ID: 3958865)

SYMPTOM:
ODM module failed to load on RHEL7.6.

DESCRIPTION:
Since RHEL7.6 is a new release, the ODM module failed to load on it.

RESOLUTION:
Added ODM support for RHEL7.6.

* 3977610 (Tracking ID: 3938546)

SYMPTOM:
ODM module failed to load on RHEL7.5.

DESCRIPTION:
Since RHEL7.5 is a new release, the ODM module failed to load on it.

RESOLUTION:
Added ODM support for RHEL7.5.

* 3978714 (Tracking ID: 3978713)

SYMPTOM:
An RHEL 7.3 system panics with the RIP at odm_tsk_exit+0x20.

DESCRIPTION:
RHEL 7.3 changed the definition of vm_operations_struct, while the ODM module was built on an earlier RHEL version. Since ODM also uses vm_operations_struct, this mismatch of structure size caused the panic.

RESOLUTION:
To resolve this issue, ODM is recompiled for the RHEL7.3 kernel.

Patch ID: VRTSodm-7.1.0.300

* 3926164 (Tracking ID: 3923310)

SYMPTOM:
ODM module failed to load on RHEL7.4.

DESCRIPTION:
Since RHEL7.4 is a new release, the ODM module failed to load on it.

RESOLUTION:
Added ODM support for RHEL7.4.

Patch ID: VRTSvxfs-7.1.0.4200

* 3977313 (Tracking ID: 3973440)

SYMPTOM:
VxFS mount fails with the error "no security xattr handler" on RHEL 7.6 when SELinux is enabled (both permissive and enforcing):

# mount -t vxfs /dev/vx/dsk/mydg/myvol /my
UX:vxfs mount.vxfs: ERROR: V-3-23731: mount failed

/var/log/messages:
Jan 7 12:18:57 server102 kernel: SELinux: (dev VxVM10000, type vxfs) has no security xattr handler

DESCRIPTION:
On RHEL 7.6, the VxFS mount fails if SELinux is enabled (both permissive and enforcing), with the error shown above.

RESOLUTION:
Added code to allow VxFS mount when SELinux is enabled. (A note on checking the SELinux mode appears after the next incident.)

* 3977420 (Tracking ID: 3958853)

SYMPTOM:
VxFS module failed to load on RHEL7.6.

DESCRIPTION:
Since RHEL7.6 is a new release, the VxFS module failed to load on it.

RESOLUTION:
Added VxFS support for RHEL7.6.
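The SELinux mode referenced in incident 3977313 above can be confirmed before attempting the mount. This is an illustrative check; on an unpatched RHEL 7.6 system the mount failure occurs in both of the enabled modes:

# getenforce
Enforcing

getenforce prints Enforcing, Permissive, or Disabled. Since the failure occurs in both Enforcing and Permissive modes, switching between them is not a workaround; the updated VxFS module is required.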
* 3977424 (Tracking ID: 3959305)

SYMPTOM:
When a large number of files with named attributes are being created, written to, or deleted in a loop, along with other operations on an SELinux-enabled system, some files may end up without security attributes. This may lead to access to such files being denied later.

DESCRIPTION:
On an SELinux-enabled system, security initialisation happens during file creation and the security attributes are stored. However, when there are parallel create/write/delete operations on multiple files, or on the same files multiple times, which have named attributes, then due to a race condition it is possible that the security attribute initialisation gets skipped for some files. Since these files don't have security attributes set, the SELinux security module will later prevent access to such files for other operations. These operations will fail with an access denied error.

RESOLUTION:
If this is a file creation context, then while writing named attributes, also attempt the security initialisation of the file by explicitly calling the security initialisation routine. This is an additional provision (in addition to the security initialisation during the default file create code) to ensure that security initialisation always happens (notwithstanding race conditions) in the named attribute write codepath.

* 3977425 (Tracking ID: 3959299)

SYMPTOM:
When a large number of files are created at once on a system with SELinux enabled, the file creation may take longer than on a system with SELinux disabled.

DESCRIPTION:
On an SELinux-enabled system, SELinux security labels need to be stored as extended attributes during file creation. This requires allocation of an attribute inode and its data extent. The contents of the extent are read synchronously into the buffer. If this is a newly allocated extent, its contents are garbage anyway, and they will be overwritten with the attribute data containing the SELinux security labels. Thus it was found that, for newly allocated attribute extents, the read operation is redundant.

RESOLUTION:
As a fix, for a newly allocated attribute extent, the reading of the data from that extent is skipped. However, if the allocated extent gets merged with a previously allocated extent, then the extent returned by the allocator could be a combined extent. In such cases, a read of the entire extent is allowed, to ensure that previously written data is correctly loaded in-core.

* 3977611 (Tracking ID: 3938544)

SYMPTOM:
VxFS module failed to load on RHEL7.5.

DESCRIPTION:
Since RHEL7.5 is a new release, the VxFS module failed to load on it.

RESOLUTION:
Added VxFS support for RHEL7.5.

* 3978167 (Tracking ID: 3922986)

SYMPTOM:
System panic because the Linux NMI watchdog detected a LOCKUP in CFS.

DESCRIPTION:
The VxFS buffer cache iodone routine interrupted the inode flush thread, which was trying to acquire the CFS buffer hash lock while releasing the CFS buffer, and the iodone routine was blocked by other threads on acquiring the free list lock. In that cycle, the other threads were contending with the inode flush thread for the CFS buffer hash lock. On Linux, the spinlock is a FIFO ticket lock, so once the inode flush thread had set its ticket on the spinlock, the other threads could not acquire the lock. This caused a deadlock.

RESOLUTION:
Code changes are made to ensure the CFS buffer hash lock is acquired with IRQs disabled.

* 3978197 (Tracking ID: 3894712)

SYMPTOM:
ACL permissions are not inherited correctly on a cluster file system.
DESCRIPTION:
The ACL counts stored on a directory inode get reset every time the directory inode's ownership is switched between the nodes. When ownership of the directory inode comes back to the node which previously abdicated it, the ACL permissions were not getting inherited correctly for newly created files.

RESOLUTION:
Modified the source such that the ACLs are inherited correctly.

* 3978228 (Tracking ID: 3928046)

SYMPTOM:
VxFS panics with a stack like the below, due to a memory address not being aligned:

void vxfs:vx_assemble_hdoffset+0x18
void vxfs:vx_assemble_opts+0x8c
void vxfs:vx_assemble_rwdata+0xf4
void vxfs:vx_gather_rwdata+0x58
void vxfs:vx_rwlock_putdata+0x2f8
void vxfs:vx_glm_cbfunc+0xe4
void vxfs:vx_glmlist_thread+0x164
unix:thread_start+4

DESCRIPTION:
The panic happened while copying piggyback data from the inode to the data buffer for the rwlock under revoke processing. After some data had been copied to the data buffer, a 32-bit aligned address was reached, but the value (the large dir freespace offset), which is defined as a 64-bit data type, was accessed at that address. This caused a system panic due to the memory address not being aligned.

RESOLUTION:
The code is changed to copy the data to the 32-bit aligned address through bcopy() rather than accessing it directly.

* 3978248 (Tracking ID: 3927419)

SYMPTOM:
The system panicked with this stack:

- vxg_api_range_unlockwf
- vx_glm_range_unlock
- vx_imntunlock
- vx_do_unmount
- vx_unmount
- generic_shutdown_super
- kill_block_super
- vx_kill_sb
- amf_kill_sb
- deactivate_super
- mntput_no_expire
- sys_umount

DESCRIPTION:
This is a race between a freeze operation and an unmount on a disabled filesystem. The freeze converts the GLM locks of the mount point directory inode. At the same time, the unmount thread may be unlocking the same lock in the middle of the lock conversion, which causes the panic.

RESOLUTION:
The lock conversion on the mount point directory inode is unnecessary, so it is skipped.

* 3978340 (Tracking ID: 3898565)

SYMPTOM:
The system panicked with this stack:

- panicsys
- panic_common
- panic
- segshared_fault
- as_fault
- vx_memlock
- vx_dio_zero
- vx_dio_rdwri
- vx_dio_read
- vx_read_common_inline
- vx_read_common_noinline
- vx_read1
- vx_read
- fop_read

DESCRIPTION:
Solaris no longer supports F_SOFTLOCK, and vx_memlock() uses F_SOFTLOCK to fault in the page.

RESOLUTION:
Changed the VxFS code to avoid using F_SOFTLOCK.

* 3978347 (Tracking ID: 3896920)

SYMPTOM:
A secondary mount could hang if a VX_GETEMAPMSG is sent around the same time.

DESCRIPTION:
This fixes a deadlock between a secondary mount and VX_GETEMAPMSG. The secondary mount messages (VX_FSADM_QUERY_MSG, VX_DEVICE_QUERY_MSG, VX_FDD_ADV_TBL_MSG) stay on the same priority level as VX_GETEMAPMSG, which can cause a deadlock: the secondary mount messages are sent while the FS is frozen, so any message dependent on them (VX_GETEMAPMSG) can hang waiting for them, while these mount-time messages in turn hang waiting for the primary to respond, since the primary has sent a VX_GETEMAPMSG broadcast and is waiting for the node mounting the FS to respond.

RESOLUTION:
Move the secondary mount messages to the priority-13 queue. Since the messages there take the mayfrz lock, they won't collide with these messages. The only exception is VX_EDELEMODEMSG, but it doesn't wait for a response, so it's safe.

* 3978355 (Tracking ID: 3914782)

SYMPTOM:
VxFS 7.1 performance drops. Many VxFS threads occupy much CPU. There are lots of vx_worklist_* threads even when there is no activity on the file system at all.
* 3978248 (Tracking ID: 3927419)

SYMPTOM: System panicked with this stack:
- vxg_api_range_unlockwf
- vx_glm_range_unlock
- vx_imntunlock
- vx_do_unmount
- vx_unmount
- generic_shutdown_super
- kill_block_super
- vx_kill_sb
- amf_kill_sb
- deactivate_super
- mntput_no_expire
- sys_umount

DESCRIPTION: This is a race between a freeze operation and an unmount of a disabled filesystem. The freeze converts the GLM locks of the mount point directory inode; at the same time, the unmount thread may unlock the same lock in the middle of the lock conversion, which causes the panic.

RESOLUTION: The lock conversion on the mount point directory inode is unnecessary, so it is skipped.

* 3978340 (Tracking ID: 3898565)

SYMPTOM: System panicked with this stack:
- panicsys
- panic_common
- panic
- segshared_fault
- as_fault
- vx_memlock
- vx_dio_zero
- vx_dio_rdwri
- vx_dio_read
- vx_read_common_inline
- vx_read_common_noinline
- vx_read1
- vx_read
- fop_read

DESCRIPTION: Solaris no longer supports F_SOFTLOCK, but vx_memlock() uses F_SOFTLOCK to fault in the page.

RESOLUTION: Changed the VxFS code to avoid using F_SOFTLOCK.

* 3978347 (Tracking ID: 3896920)

SYMPTOM: A secondary mount could hang if a VX_GETEMAPMSG is sent around the same time.

DESCRIPTION: This fixes a deadlock between the secondary mount and VX_GETEMAPMSG. The secondary mount messages (VX_FSADM_QUERY_MSG, VX_DEVICE_QUERY_MSG, VX_FDD_ADV_TBL_MSG) stay on the same priority level as VX_GETEMAPMSG, which can deadlock: the mount messages are sent while the file system is frozen, so any message dependent on them, such as VX_GETEMAPMSG, can hang waiting for them, while the mount-time messages in turn hang waiting for the primary to respond, the primary having sent a VX_GETEMAPMSG broadcast and being itself blocked waiting for the mounting node to respond.

RESOLUTION: Move the secondary mount messages to the priority-13 queue. Because the messages there take the mayfrz lock, they cannot collide with these messages. The only exception is VX_EDELEMODEMSG, but it does not wait for a response, so it is safe.

* 3978355 (Tracking ID: 3914782)

SYMPTOM: VxFS 7.1 performance drops, with many VxFS threads occupying much CPU. There are lots of vx_worklist_* threads even when there is no activity on the file system at all.

DESCRIPTION: In VxFS 7.1, when the NAIO load is high, vx_naio_load_check work items are added to the pending work list. But these work items put themselves back on the list when processed, so the total count of these items keeps increasing, which eventually results in a high count of worker threads.

RESOLUTION: Modifications are made to ensure the added items are not put back on the work list.
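The defect above is a work item that re-queues itself every time it runs, so the pending list can only grow. The self-contained C sketch below contrasts the buggy and fixed patterns against a toy work list; none of these names come from the VxFS source.

    #include <stdio.h>

    struct work_item {
            struct work_item *next;
            void (*fn)(struct work_item *);
    };

    static struct work_item *pending;               /* toy pending work list */

    static void worklist_add(struct work_item *w)
    {
            w->next = pending;
            pending = w;
    }

    /* Buggy load check: re-queues itself on every run, so the pending list
     * never drains and ever more worker threads are spawned to service it. */
    static void load_check_buggy(struct work_item *w)
    {
            /* ... sample the async-I/O load here ... */
            worklist_add(w);                        /* the bug */
    }

    /* Fixed load check: runs once and does not re-queue itself. */
    static void load_check_fixed(struct work_item *w)
    {
            (void)w;                                /* ... sample load, return ... */
    }

    int main(void)
    {
            struct work_item w = { 0, load_check_fixed };
            worklist_add(&w);

            /* With load_check_buggy installed instead, this loop would spin
             * forever because the item keeps putting itself back. */
            int processed = 0;
            while (pending && processed < 10) {
                    struct work_item *cur = pending;
                    pending = cur->next;
                    cur->fn(cur);
                    processed++;
            }
            printf("processed %d item(s); list %s\n",
                   processed, pending ? "still growing" : "drained");
            (void)load_check_buggy;                 /* keep the sketch warning-free */
            return 0;
    }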
* 3978357 (Tracking ID: 3911048)

SYMPTOM: An LDH bucket validation failure message is logged and the system hangs.

DESCRIPTION: When modifying a large directory, VxFS needs to find a new bucket in the LDH for that directory; once a bucket is full, it is split to obtain more buckets. When a bucket has been split the maximum number of times, an overflow bucket is allocated. Under some conditions, the lookup for an available bucket in the overflow bucket may return an incorrect result and overwrite an existing bucket entry, corrupting the LDH file. A second problem is that when bucket invalidation fails, the bucket buffer is released without checking whether the buffer is already part of an earlier transaction; this can cause the transaction flush thread to hang and eventually stall the whole filesystem.

RESOLUTION: Corrected the LDH bucket entry change code to avoid the corruption, and released the bucket buffer without throwing it out of memory, to avoid blocking the transaction flush.

* 3978361 (Tracking ID: 3905576)

SYMPTOM: Cluster file system hangs. On one node, all worker threads are blocked due to a file system freeze, and another thread is blocked with a stack like this:
- __schedule
- schedule
- vx_svar_sleep_unlock
- vx_event_wait
- vx_olt_iauinit
- vx_olt_iasinit
- vx_loadv23_fsetinit
- vx_loadv23_fset
- vx_fset_reget
- vx_fs_reinit
- vx_recv_cwfa
- vx_msg_process_thread
- vx_kthread_init

DESCRIPTION: The frozen CFS does not thaw because the thread above waits in vx_olt_iauinit() for a work item to be processed. Since all the worker threads are blocked, there is no free thread to process this work item.

RESOLUTION: Changed the code in vx_olt_iauinit() so that the work item is processed even with all worker threads blocked.

* 3978370 (Tracking ID: 3941942)

SYMPTOM: On a file system created with fiostats enabled, forcefully unmounting the file system while ODM writes are in progress can panic the system:
crash_kexec
oops_end
no_context
__bad_area_nosemaphore
bad_area_nosemaphore
__do_page_fault
do_page_fault
vx_fiostats_free
fdd_chain_inactive_common
fdd_chain_inactive
fdd_odm_close
odm_vx_close
odm_fcb_free
odm_fcb_rel
odm_ident_close
odm_exit
odm_tsk_daemon_deathlist
odm_tsk_daemon
odm_kthread_init
kernel_thread

DESCRIPTION: When freeing the fiostats assigned to an inode during a forced unmount, the fs field must be validated. Otherwise a NULL pointer can be dereferenced in the checks in this code path, which panics the system.

RESOLUTION: The code is modified to validate fs in such forced-unmount scenarios.

* 3978382 (Tracking ID: 3812330)

SYMPTOM: Slow "ls -l" across cluster nodes.

DESCRIPTION: When "ls -l" is issued, the getattr VOP is issued, through which VxFS updates the necessary stats of an inode whose owner is some other node in the cluster. Ideally this update should happen through an asynchronous message-passing mechanism, but that is not the case here. Instead the non-owner node where "ls -l" is issued tries to pull strong ownership of the inode towards itself to update the inode stats, so a lot of time is consumed in this ping-pong of ownership.

RESOLUTION: Avoid pulling strong ownership for the inode when "ls" is run from a node that is not the current owner, and update the inode stats through an asynchronous message-passing mechanism, controlled by the module parameter "vx_lazyisiz".

* 3978384 (Tracking ID: 3909583)

SYMPTOM: Partitioning of a directory should be disabled when the directory size exceeds an upper threshold value.

DESCRIPTION: If partitioned directories (PD) are enabled during mount, the mount may take a long time to complete, because it tries to partition all the directories and therefore appears hung. To avoid such hangs, a new upper threshold value for PD is added that disables partitioning of any directory whose size is above that value.

RESOLUTION: Code is modified to disable partitioning of a directory when the directory size is greater than the upper threshold value.

* 3978484 (Tracking ID: 3931026)

SYMPTOM: Unmounting a local-mount (LM) file system hangs on RHEL6.6.

DESCRIPTION: There is a race window between the Linux fsnotify infrastructure and the VxFS fsnotify watcher deletion process at umount time. This race may leave the umount process hung.

RESOLUTION: 1> VxFS processing of fsnotify/inotify is now done before calling into the VxFS unmount (vx_umount) handler. This ensures that watches on VxFS inodes are cleaned up before the OS fsnotify cleanup code runs, which expects all marks/watches on the superblock to be zero. 2> Minor tidy-up in the core fsnotify/inotify cleanup code of VxFS. 3> Removed the explicit hard-setting of "s_fsnotify_marks" to 0.

* 3978602 (Tracking ID: 3908954)

SYMPTOM: While performing vectored writes using writev(), where two iovec writes land at different offsets within the same 4 KB page-aligned range of a file, it is possible to find null data at the beginning of that 4 KB range when reading the data back.

DESCRIPTION: While multiple processes perform vectored writes to a file using writev(), the following situation can occur. Suppose there are two iovecs; the first is 448 bytes and the second is 30000 bytes. The first iovec of 448 bytes completes, but the second iovec finds that its source page is no longer in memory. As the page cannot be faulted in during uiomove, both iovecs have to be undone. The page is then faulted back in and only the second iovec is retried. However, because the undo operation also undid the first iovec, the first 448 bytes of the page are left populated with nulls. When the file is read back, it appears that no data was written for the first iovec; hence the nulls in the file.

RESOLUTION: The code has been changed to unwind multiple iovecs correctly in scenarios where some data has been written from one iovec and some from another.
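The access pattern from the description above can be expressed as a small user-space program: one writev() call with a 448-byte and a 30000-byte iovec landing in the same page range. The original defect also required the source pages to be faulted out mid-write under memory pressure, so this sketch only demonstrates the I/O pattern, not a deterministic reproducer; the mount point /mnt/vxfs/testfile is an assumed path.

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/uio.h>

    int main(void)
    {
            /* Sizes taken from the incident description: 448 + 30000 bytes. */
            static char a[448], b[30000];
            memset(a, 'A', sizeof(a));
            memset(b, 'B', sizeof(b));

            int fd = open("/mnt/vxfs/testfile", O_CREAT | O_TRUNC | O_RDWR, 0644);
            if (fd < 0) { perror("open"); return 1; }

            struct iovec iov[2] = {
                    { .iov_base = a, .iov_len = sizeof(a) },
                    { .iov_base = b, .iov_len = sizeof(b) },
            };
            if (writev(fd, iov, 2) != (ssize_t)(sizeof(a) + sizeof(b))) {
                    perror("writev");
                    return 1;
            }

            /* Read back the first 448 bytes; with the bug they could be nulls. */
            char check[448];
            if (pread(fd, check, sizeof(check), 0) != (ssize_t)sizeof(check)) {
                    perror("pread");
                    return 1;
            }
            printf("first byte: 0x%02x (expect 0x%02x)\n",
                   (unsigned char)check[0], (unsigned char)'A');
            close(fd);
            return 0;
    }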
* 3978604 (Tracking ID: 3909553)

SYMPTOM: VxFS/GLM should use the GAB backenable flow control interface.

DESCRIPTION: VxFS/GLM used to sleep for a fixed duration before retrying when sending a message over GAB failed. GAB provides an interface named backenable that notifies the client when it is OK to send a message over GAB again; using it avoids unnecessary sleeping in VxFS/GLM.

RESOLUTION: Code is modified to use the GAB backenable flow control interface.

* 3978609 (Tracking ID: 3921152)

SYMPTOM: Performance drop. A core dump shows threads in vx_dalloc_flush().

DESCRIPTION: An implicit typecast error in vx_dalloc_flush() can cause this performance issue.

RESOLUTION: The code is modified to do an explicit typecast.

Patch ID: VRTSvxfs-7.1.0.400

* 3926290 (Tracking ID: 3923307)

SYMPTOM: VxFS module failed to load on RHEL7.4.

DESCRIPTION: Since RHEL7.4 is a new release, the VxFS module failed to load on it.

RESOLUTION: Added VxFS support for RHEL7.4.

Patch ID: VRTSvxfs-7.1.0.100

* 3901320 (Tracking ID: 3901318)

SYMPTOM: VxFS module failed to load on RHEL7.3.

DESCRIPTION: Since RHEL7.3 is a new release, the VxFS module failed to load on it.

RESOLUTION: Added VxFS support for RHEL7.3.

INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Note that the installation of this P-Patch will cause downtime.

To install the patch, perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-rhel7_x86_64-Patch-7.1.0.300.tar.gz to /tmp
2. Untar infoscale-rhel7_x86_64-Patch-7.1.0.300.tar.gz to /tmp/hf
   # mkdir /tmp/hf
   # cd /tmp/hf
   # gunzip /tmp/infoscale-rhel7_x86_64-Patch-7.1.0.300.tar.gz
   # tar xf /tmp/infoscale-rhel7_x86_64-Patch-7.1.0.300.tar
3. Install the hotfix. (Note that the installation of this P-Patch will cause downtime.)
   # pwd
   /tmp/hf
   # ./installVRTSinfoscale710P300 [ ...]

You can also install this patch together with the 7.1 base release using Install Bundles:
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.1 directory and invoke the installer script
   with the -patch_path option, where -patch_path points to the patch directory
   # ./installer -patch_path [] [ ...]

Install the patch manually:
--------------------------
Manual installation is not recommended.

REMOVING THE PATCH
------------------
Manual uninstallation is not recommended.

SPECIAL INSTRUCTIONS
--------------------
NONE

OTHERS
------
NONE