* * * READ ME * * *
* * * InfoScale 7.4 * * *
* * * Patch 1200 * * *
Patch Date: 2018-11-23


This document provides the following information:

* PATCH NAME
* OPERATING SYSTEMS SUPPORTED BY THE PATCH
* PACKAGES AFFECTED BY THE PATCH
* BASE PRODUCT VERSIONS FOR THE PATCH
* SUMMARY OF INCIDENTS FIXED BY THE PATCH
* DETAILS OF INCIDENTS FIXED BY THE PATCH
* INSTALLATION PRE-REQUISITES
* INSTALLING THE PATCH
* REMOVING THE PATCH


PATCH NAME
----------
InfoScale 7.4 Patch 1200


OPERATING SYSTEMS SUPPORTED BY THE PATCH
----------------------------------------
SLES11 x86-64


PACKAGES AFFECTED BY THE PATCH
------------------------------
VRTSamf
VRTSaslapm
VRTSdbac
VRTSgab
VRTSglm
VRTSgms
VRTSllt
VRTSodm
VRTSvxfen
VRTSvxfs
VRTSvxvm


BASE PRODUCT VERSIONS FOR THE PATCH
-----------------------------------
* InfoScale Availability 7.4
* InfoScale Enterprise 7.4
* InfoScale Foundation 7.4
* InfoScale Storage 7.4


SUMMARY OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
Patch ID: VRTSodm-7.4.0.1300
* 3951757 (3951754) ODM support for RHEL6.10 and RHEL/SUSE retpoline kernels
* 3953235 (3953233) After installing the 7.4.0.1100 (on AIX and Solaris) and 7.4.0.1200 (on Linux) patch, the Oracle Disk Management (ODM) module fails to load.
Patch ID: VRTSdbac-7.4.0.1200
* 3965306 (3965298) Support for SLES11.4 RETPOLINE kernels.
Patch ID: VRTSamf-7.4.0.1200
* 3965305 (3965298) Support for SLES11.4 RETPOLINE kernels.
Patch ID: VRTSgab-7.4.0.1200
* 3965303 (3965298) Support for SLES11.4 RETPOLINE kernels.
Patch ID: VRTSllt-7.4.0.1300
* 3965302 (3965298) Support for SLES11.4 RETPOLINE kernels.
* 3948761 (3948507) If RDMA dependencies are not fulfilled by the setup, the LLT init/systemctl script should load the non-RDMA module.
Patch ID: VRTSvxvm-7.4.0.1300
* 3964346 (3954972) Retpoline support for VxVM on SLES11 retpoline kernels
* 3949322 (3944259) The vradmin verifydata and vradmin ibc commands fail on private diskgroups with a Lost connection error.
* 3950578 (3953241) Messages in syslog are seen with message string "0000" for the VxVM module.
* 3950760 (3946217) In a scenario where encryption over wire is enabled and secondary logging is disabled, vxconfigd hangs and replication does not progress.
* 3950799 (3950384) In a scenario where volume encryption at rest is enabled, data corruption may occur if the file system size exceeds 1TB.
* 3951488 (3950759) Application I/Os hang if volume-level I/O shipping is enabled and the volume layout is mirror-concat or mirror-stripe.
* 3964916 (3965026) VRTSaslapm package installation for the SLES11 retpoline kernel.
Patch ID: VRTSvxfs-7.4.0.1300
* 3951753 (3951752) VxFS support for RHEL6.10 and RHEL/SUSE retpoline kernels
* 3959065 (3957285) Job promote operation from the replication target node fails.
* 3959996 (3938256) When the file size is checked through seek_hole, an incorrect offset/size is returned if delayed allocation is enabled on the file.
* 3949500 (3949308) In a scenario where FEL caching is enabled, application I/O on a file does not proceed when file system utilization is 100%.
* 3949506 (3949501) When SmartIO FEL-based writeback caching is enabled, the "sfcache offline" command hangs during Inode deinit.
* 3949507 (3949502) When SmartIO FEL-based writeback caching is enabled, a memory leak of a few bytes may occur during node reconfiguration.
* 3949508 (3949503) When FEL is enabled in a CFS environment, data corruption may occur after node recovery.
* 3949509 (3949504) When SmartIO FEL-based writeback caching is enabled, a memory leak of a few bytes may occur during node reconfiguration.
* 3949510 (3949505) When SmartIO FEL-based writeback caching is enabled, I/O operations on a file in the filesystem can result in a panic after cluster reconfiguration.
* 3950740 (3953165) Messages in syslog are seen with message string "0000" for the VxFS module.
* 3952340 (3953148) Due to the space reservation mechanism, an extent delegation deficit may be seen on a node, even though the node may not be using any feature involving the space reservation mechanism, such as CFS delayed allocation or FEL.
Patch ID: VRTSgms-7.4.0.1200
* 3951762 (3951761) GMS support for RHEL6.10 and RHEL/SUSE retpoline kernels
Patch ID: VRTSglm-7.4.0.1200
* 3951760 (3951759) GLM support for RHEL6.10 and RHEL/SUSE retpoline kernels


DETAILS OF INCIDENTS FIXED BY THE PATCH
---------------------------------------
This patch fixes the following incidents:

Patch ID: VRTSodm-7.4.0.1300

* 3951757 (Tracking ID: 3951754)

SYMPTOM:
ODM support for RHEL6.10 and RHEL/SUSE retpoline kernels

DESCRIPTION:
RHEL6.10 is a new release and ships with a retpoline kernel. Red Hat and SUSE have also released retpoline kernels for older RHEL/SUSE releases. The ODM module must be recompiled with a retpoline-aware GCC to support retpoline kernels.

RESOLUTION:
ODM is now compiled with a retpoline-aware GCC.

* 3953235 (Tracking ID: 3953233)

SYMPTOM:
After installing the 7.4.0.1100 (on AIX and Solaris) and 7.4.0.1200 (on Linux) patch, the Oracle Disk Management (ODM) module fails to load.

DESCRIPTION:
As part of the 7.4.0.1100 (on AIX and Solaris) and 7.4.0.1200 (on Linux) patch, the VxFS version was updated to 7.4.0.1200. Because of this VxFS version update, the ODM module needs to be repackaged due to an internal dependency on the VxFS version.

RESOLUTION:
As part of this fix, the ODM module has been repackaged to support the updated VxFS version.

Patch ID: VRTSdbac-7.4.0.1200

* 3965306 (Tracking ID: 3965298)

SYMPTOM:
SLES11.4 RETPOLINE kernels are not supported.

DESCRIPTION:
SUSE has released a RETPOLINE kernel on SLES11.4, but the Veritas Cluster Server kernel modules need to be recompiled with a RETPOLINE-aware GCC to support the RETPOLINE kernel.

RESOLUTION:
Support for SLES11.4 RETPOLINE kernels is now introduced.

Patch ID: VRTSamf-7.4.0.1200

* 3965305 (Tracking ID: 3965298)

SYMPTOM:
SLES11.4 RETPOLINE kernels are not supported.
DESCRIPTION:
SUSE has released a RETPOLINE kernel on SLES11.4, but the Veritas Cluster Server kernel modules need to be recompiled with a RETPOLINE-aware GCC to support the RETPOLINE kernel.

RESOLUTION:
Support for SLES11.4 RETPOLINE kernels is now introduced.

Patch ID: VRTSgab-7.4.0.1200

* 3965303 (Tracking ID: 3965298)

SYMPTOM:
SLES11.4 RETPOLINE kernels are not supported.

DESCRIPTION:
SUSE has released a RETPOLINE kernel on SLES11.4, but the Veritas Cluster Server kernel modules need to be recompiled with a RETPOLINE-aware GCC to support the RETPOLINE kernel.

RESOLUTION:
Support for SLES11.4 RETPOLINE kernels is now introduced.

Patch ID: VRTSllt-7.4.0.1300

* 3965302 (Tracking ID: 3965298)

SYMPTOM:
SLES11.4 RETPOLINE kernels are not supported.

DESCRIPTION:
SUSE has released a RETPOLINE kernel on SLES11.4, but the Veritas Cluster Server kernel modules need to be recompiled with a RETPOLINE-aware GCC to support the RETPOLINE kernel.

RESOLUTION:
Support for SLES11.4 RETPOLINE kernels is now introduced.

* 3948761 (Tracking ID: 3948507)

SYMPTOM:
LLT loads the RDMA module during its configuration, even if the RDMA dependencies are not fulfilled by the setup.

DESCRIPTION:
LLT loads the RDMA module during its configuration even when the setup does not fulfill the RDMA dependencies. Moreover, the user is unable to manually unload the IB modules. This issue occurs because the LLT_RDMA module holds a use count on the ib_core module even though LLT is not configured to work over RDMA.

RESOLUTION:
LLT now loads the non-RDMA module if the RDMA dependency check fails during configuration.

Patch ID: VRTSvxvm-7.4.0.1300

* 3964346 (Tracking ID: 3954972)

SYMPTOM:
Retpoline support for VxVM on SLES11 retpoline kernels

DESCRIPTION:
SUSE has released a retpoline kernel for SLES11. The VxVM module must be recompiled with a retpoline-aware GCC to support the retpoline kernel.

RESOLUTION:
VxVM is now compiled with a retpoline-aware GCC.
* 3949322 (Tracking ID: 3944259)

SYMPTOM:
The vradmin verifydata and vradmin ibc commands fail on private diskgroups with a Lost connection error.

DESCRIPTION:
This issue occurs because of a deadlock between the IBC mechanism and the ongoing I/Os on the secondary RVG. The IBC mechanism expects I/O transfer to the secondary in sequential order; however, to improve performance, I/Os are now written in parallel. This mismatch in IBC behavior causes a deadlock, and the vradmin verifydata and vradmin ibc commands fail with a timeout error.

RESOLUTION:
As a part of this fix, the IBC behavior is improved such that it now accounts for parallel and possibly out-of-sequence I/O writes to the secondary.

* 3950578 (Tracking ID: 3953241)

SYMPTOM:
Generic messages or warnings may appear in syslog with the string "vxvm:0000: " instead of a uniquely numbered message ID for the VxVM module.

DESCRIPTION:
A few syslog messages introduced in the InfoScale 7.4 release were not assigned unique message numbers that identify where in the product they originate. Instead, they are marked with the common message identification number "0000".

RESOLUTION:
This patch fixes the syslog messages generated by the VxVM module that contain "0000" as the message string and provides them with unique numbers.

* 3950760 (Tracking ID: 3946217)

SYMPTOM:
In a scenario where encryption over wire is enabled and secondary logging is disabled, vxconfigd hangs and replication does not progress.

DESCRIPTION:
In a scenario where encryption over wire is enabled and secondary logging is disabled, the application I/Os are encrypted in sequence, but are not written to the secondary in the same order. The out-of-sequence and in-sequence I/Os get stuck in a loop, waiting for each other to complete. Due to this, I/Os are left incomplete and eventually hang. As a result, vxconfigd hangs and replication does not progress.
RESOLUTION:
As a part of this fix, the I/O encryption and write sequence is improved such that all I/Os are first encrypted and then written sequentially to the secondary.

* 3950799 (Tracking ID: 3950384)

SYMPTOM:
In a scenario where volume data encryption at rest is enabled, data corruption may occur if the file system size exceeds 1TB and the data is located in a file extent larger than 256KB.

DESCRIPTION:
In a scenario where data encryption at rest is enabled, data corruption may occur when both of the following conditions are met:
- The file system size is over 1TB
- The data is located in a file extent larger than 256KB
This issue occurs due to a bug that causes an integer overflow for the offset.

RESOLUTION:
As a part of this fix, appropriate code changes have been made to improve the data encryption behavior such that the data corruption does not occur.

* 3951488 (Tracking ID: 3950759)

SYMPTOM:
Application I/Os hang if volume-level I/O shipping is enabled and the volume layout is mirror-concat or mirror-stripe.

DESCRIPTION:
In a scenario where an application I/O is issued over a volume that has volume-level I/O shipping enabled, the I/O is shipped to all target nodes. Typically, on the target nodes, the I/O must be sent only to the local disk. However, in the case of mirror-concat or mirror-stripe volumes, I/Os are sent to remote disks as well, which at times leads to an I/O hang.

RESOLUTION:
As a part of this fix, an I/O that has been shipped to the target node is restricted to locally connected disks, and remote disks are skipped.

* 3964916 (Tracking ID: 3965026)

SYMPTOM:
lsmod does not show the required APM modules as loaded.

DESCRIPTION:
The VRTSaslapm package needed to be rebuilt to support the SLES11 retpoline kernel.

RESOLUTION:
The ASLAPM package is now recompiled for the SLES11 retpoline kernel.
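As background for incident 3950799 above: the corruption is attributed to an integer overflow of the byte offset once the file system crosses 1TB. The sketch below illustrates that general class of bug only; the 32-bit width, variable names, and sizes are assumptions for illustration, not VxVM internals:

```python
# Illustrative only: a byte offset computed in 32-bit arithmetic wraps
# once it exceeds 4 GiB, so blocks past the 1 TB mark land at the wrong
# (much smaller) offset. Names and widths are assumptions, not VxVM code.
MASK32 = 0xFFFFFFFF

def offset_32bit(block: int, block_size: int) -> int:
    """Offset as a 32-bit computation would produce it (wraps around)."""
    return (block * block_size) & MASK32

def offset_64bit(block: int, block_size: int) -> int:
    """Correct offset computed in full 64-bit arithmetic."""
    return block * block_size

block_size = 512 * 1024                # an extent-sized chunk (assumed)
block = (1 << 40) // block_size + 7    # a block just past the 1 TB mark

good = offset_64bit(block, block_size)
bad = offset_32bit(block, block_size)
print(f"64-bit offset: {good}, 32-bit wrap: {bad}")
```

Below 1TB the two computations agree, which is consistent with the corruption appearing only on file systems larger than 1TB.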
Patch ID: VRTSvxfs-7.4.0.1300

* 3951753 (Tracking ID: 3951752)

SYMPTOM:
VxFS support for RHEL6.10 and RHEL/SUSE retpoline kernels

DESCRIPTION:
RHEL6.10 is a new release and ships with a retpoline kernel. Red Hat and SUSE have also released retpoline kernels for older RHEL/SUSE releases. The VxFS module must be recompiled with a retpoline-aware GCC to support retpoline kernels.

RESOLUTION:
VxFS is now compiled with a retpoline-aware GCC.

* 3959065 (Tracking ID: 3957285)

SYMPTOM:
A job promote operation executed on the replication target node fails with an error message like:

# /opt/VRTS/bin/vfradmin job promote myjob1 /mnt2
UX:vxfs vfradmin: INFO: V-3-28111: Current replication direction: :/mnt1 -> :/mnt2
UX:vxfs vfradmin: INFO: V-3-28112: If you continue this command, replication direction will change to: :/mnt2 -> :/mnt1
UX:vxfs vfradmin: QUESTION: V-3-28101: Do you want to continue? [ynq]y
UX:vxfs vfradmin: INFO: V-3-28090: Performing final sync for job myjob1 before promoting...
UX:vxfs vfradmin: INFO: V-3-28099: Job promotion failed. If you continue, replication will be stopped and the filesystem will be made available on this host for use. To resume replication when returns, use the vfradmin job recover command.
UX:vxfs vfradmin: INFO: V-3-28100: Continuing may result in data loss.
UX:vxfs vfradmin: QUESTION: V-3-28101: Do you want to continue? [ynq]y
UX:vxfs vfradmin: INFO: V-3-28227: Unable to unprotect filesystem.

DESCRIPTION:
A job promote from the target node sends a promote-related message to the source node. After this message is processed on the source side, the 'seqno' file is updated. However, the 'seqno' file is created on the target side and is not present on the source side; hence the 'seqno' file update returns an error and the promote fails.

RESOLUTION:
The 'seqno' file write is not required as part of the promote message. The SKIP_SEQNO_UPDATE flag is now passed in the promote message so that the 'seqno' file write is skipped on the source side during promote processing.
Note: The job should be stopped on the source node before doing a promote from the target node.
* 3959996 (Tracking ID: 3938256)

SYMPTOM:
When the file size is checked through seek_hole, an incorrect offset/size is returned if delayed allocation is enabled on the file.

DESCRIPTION:
From recent RHEL7 versions onwards, the grep command uses the seek_hole feature to determine the current file size and then reads data based on that size. In VxFS, when delayed allocation (dalloc) is enabled, the extent is allocated to the file later, but the file size is incremented as soon as the write completes. When determining the file size through seek_hole, VxFS did not fully consider the dalloc case and returned a stale size based on the extents already allocated to the file, instead of the actual file size, which resulted in less data being read than expected.

RESOLUTION:
The code is modified such that VxFS now returns the correct size when dalloc is enabled on a file and seek_hole is called on that file.

* 3949500 (Tracking ID: 3949308)

SYMPTOM:
In a scenario where FEL caching is enabled, application I/O on a file does not proceed when file system utilization is 100%.

DESCRIPTION:
When the file system capacity is 100% utilized, application I/O on a file does not proceed. This issue occurs because the ENOSPC error handling code path tries to take the Inode ownership lock, which is already held by the current thread. As a result, any operation on that file hangs.

RESOLUTION:
This fix releases the Inode ownership lock and reclaims it after the ENOSPC error handling is complete.

* 3949506 (Tracking ID: 3949501)

SYMPTOM:
The sfcache offline command hangs.

DESCRIPTION:
During Inode inactivation, an Inode with the FEL dirty flag set gets included in the cluster-wide inactive list instead of the local inactive list. This issue occurs due to an internal error. As a result, the FEL dirty flag does not get cleared and the "sfcache offline" command hangs.

RESOLUTION:
This fix now includes Inodes with the FEL dirty flag in the local inactive list.
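As background for incident 3959996 above: applications such as grep probe a file with SEEK_HOLE, and the offset of the first hole tells them how much real data precedes it. The sketch below shows that probe using the standard Linux lseek interface; this is generic OS behavior for illustration, not VxFS-specific code (on a fully written, non-sparse file the first "hole" is the implicit one at end-of-file, so a stale answer here makes readers stop short of real data):

```python
import os
import tempfile

# Write a fully populated (non-sparse) file, then probe it the way grep
# does: lseek(fd, 0, SEEK_HOLE) returns the offset of the first hole,
# which for a hole-free file is the end of file, i.e. its size. The
# VxFS bug returned a stale, smaller value here under dalloc, so the
# reader consumed less data than the file actually held.
data = b"x" * 8192
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    first_hole = os.lseek(fd, 0, os.SEEK_HOLE)
finally:
    os.close(fd)
os.unlink(path)

print(f"file size: {len(data)}, first hole reported at: {first_hole}")
```

A correct filesystem reports the first hole at offset 8192 here (the file size); returning anything smaller is exactly the stale-size symptom described above.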
* 3949507 (Tracking ID: 3949502)

SYMPTOM:
When SmartIO FEL-based writeback caching is enabled, a memory leak of a few bytes may occur during node reconfiguration.

DESCRIPTION:
FEL recovery is initiated during node reconfiguration, during which a hold on some internal data structures is retained. Due to this extra hold, the data structures are not freed, which leads to a small memory leak.

RESOLUTION:
This fix ensures that the hold on the data structures is handled correctly.

* 3949508 (Tracking ID: 3949503)

SYMPTOM:
When FEL is enabled in a CFS environment, stale data may be seen on a file after a node crash.

DESCRIPTION:
While revoking an RWlock after node recovery, the file system ensures that the FEL writes are flushed to the disks and a write completion record is written to the FEL log. In scenarios where a node crashes and the write completion record is not updated, the FEL writes get replayed during FEL recovery. This may overwrite writes that happened after the revoke on some other cluster node, resulting in data corruption.

RESOLUTION:
This fix now writes the completion record after the FEL write flush.

* 3949509 (Tracking ID: 3949504)

SYMPTOM:
When SmartIO FEL-based writeback caching is enabled, a memory leak of a few bytes may occur during node reconfiguration.

DESCRIPTION:
FEL recovery is initiated during node reconfiguration, during which a hold on some internal data structures is retained. Due to this extra hold, the data structures are not freed, which leads to a small memory leak.

RESOLUTION:
This fix ensures that the hold on the data structures is handled correctly.

* 3949510 (Tracking ID: 3949505)

SYMPTOM:
When SmartIO FEL-based writeback caching is enabled, I/O operations on a file in the filesystem can result in a panic after cluster reconfiguration.

DESCRIPTION:
In an FEL environment, the file system may get mounted after a node recovery while the FEL device remains offline.
In such a scenario, the FEL-related data structures remain inaccessible, and the node panics during an application I/O that attempts to access them.

RESOLUTION:
This fix checks whether the FEL device recovery has completed before accessing the FEL-related data structures.

* 3950740 (Tracking ID: 3953165)

SYMPTOM:
Generic messages or warnings may appear in syslog with the string "vxfs:0000:" instead of a uniquely numbered message ID for the VxFS module.

DESCRIPTION:
A few syslog messages introduced in the InfoScale 7.4 release were not assigned unique message numbers that identify where in the product they originate. Instead, they are marked with the common message identification number "0000".

RESOLUTION:
This patch fixes the syslog messages generated by the VxFS module that contain "0000" as the message string and provides them with unique numbers.

* 3952340 (Tracking ID: 3953148)

SYMPTOM:
Extra extent delegation message exchanges may be noticed between the delegation master and the delegation client.

DESCRIPTION:
If a node is not using delayed allocation or the SmartIO-based writeback cache (Front End Log), and an extent allocation unit is revoked while a non-dalloc/non-FEL extent allocation is in progress, the node may show a delegation deficit. This is not correct, as the node is not using delayed allocation or FEL.

RESOLUTION:
The fix is to ignore the delegation deficit if the deficit is on account of a non-delayed-allocation/non-FEL allocation.

Patch ID: VRTSgms-7.4.0.1200

* 3951762 (Tracking ID: 3951761)

SYMPTOM:
GMS support for RHEL6.10 and RHEL/SUSE retpoline kernels

DESCRIPTION:
RHEL6.10 is a new release and ships with a retpoline kernel. Red Hat and SUSE have also released retpoline kernels for older RHEL/SUSE releases. The GMS module must be recompiled with a retpoline-aware GCC to support retpoline kernels.

RESOLUTION:
GMS is now compiled with a retpoline-aware GCC.
Patch ID: VRTSglm-7.4.0.1200

* 3951760 (Tracking ID: 3951759)

SYMPTOM:
GLM support for RHEL6.10 and RHEL/SUSE retpoline kernels

DESCRIPTION:
RHEL6.10 is a new release and ships with a retpoline kernel. Red Hat and SUSE have also released retpoline kernels for older RHEL/SUSE releases. The GLM module must be recompiled with a retpoline-aware GCC to support retpoline kernels.

RESOLUTION:
GLM is now compiled with a retpoline-aware GCC.


INSTALLING THE PATCH
--------------------
Run the Installer script to automatically install the patch:
-----------------------------------------------------------
Note that installing this P-Patch causes downtime.

To install the patch, perform the following steps on at least one node in the cluster:
1. Copy the patch infoscale-sles11_x86_64-Patch-7.4.0.1200.tar.gz to /tmp
2. Untar infoscale-sles11_x86_64-Patch-7.4.0.1200.tar.gz to /tmp/hf
    # mkdir /tmp/hf
    # cd /tmp/hf
    # gunzip /tmp/infoscale-sles11_x86_64-Patch-7.4.0.1200.tar.gz
    # tar xf /tmp/infoscale-sles11_x86_64-Patch-7.4.0.1200.tar
3. Install the hotfix. (Note that installing this P-Patch causes downtime.)
    # pwd
    /tmp/hf
    # ./installVRTSinfoscale740P1200 [ ...]

You can also install this patch together with the 7.4 base release using Install Bundles:
1. Download this patch and extract it to a directory
2. Change to the Veritas InfoScale 7.4 directory and invoke the installer script with the -patch_path option, where -patch_path points to the patch directory
    # ./installer -patch_path [] [ ...]

Install the patch manually:
--------------------------
Manual installation is not supported


REMOVING THE PATCH
------------------
Manual uninstallation is not supported


SPECIAL INSTRUCTIONS
--------------------
NONE


OTHERS
------
NONE