access-rhel7_x86_64-Patch-7.3.1.001

 Basic information
Release type: Patch
Release date: 2018-04-09
OS update support: None
Technote: None
Documentation: None
Popularity: 3342 viewed
Download size: 2.5 GB
Checksum: 2110019151

 Applies to one or more of the following products:
Access 7.3.1VA On RHEL7 x86-64

 Obsolete patches, incompatibilities, superseded patches, or other requirements:
None.

 Fixes the following incidents:
9843

 Patch ID:
None.

Readme file
README VERSION               : 1.1
README CREATION DATE         : 2018-04-03
PATCH-ID                     : 7.3.1.001
PATCH NAME                   : VA-7.3.1.001
REQUIRED PATCHES             : NONE
INCOMPATIBLE PATCHES         : NONE
SUPPORTED PADV               : rhel7.3_x86_64, rhel7.4_x86_64, OL7.3_x86_64, OL7.4_x86_64
(P-PLATFORM, A-ARCHITECTURE, D-DISTRIBUTION, V-VERSION)
PATCH CRITICALITY            : Optional
HAS KERNEL COMPONENT         : YES
ID                           : NONE

PATCH INSTALLATION INSTRUCTIONS:
-----------------------------------------
For detailed installation instructions, refer to:
https://origin-www.veritas.com/content/support/en_US/doc/130196629-130196633-1

For detailed instructions on upgrading Veritas Access, refer to
"Chapter 10: Upgrading Veritas Access using a rolling upgrade".

SPECIAL INSTRUCTIONS:
-----------------------------------------

1. Extract the tarball
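	A sketch, assuming the downloaded file is named after the patch title and that the checksum listed above (2110019151) is cksum output (verify both against the download page):

		# cksum access-rhel7_x86_64-Patch-7.3.1.001.tar.gz
		# tar -xf access-rhel7_x86_64-Patch-7.3.1.001.tar.gz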

2. A rolling upgrade can be started using the command below:

	# ./installaccess -rolling_upgrade

3. This patch can be applied only on top of the VA-7.3.1 release.
	
4. Make sure that the upgrade is performed one node at a time, even though the installer offers to upgrade multiple nodes at a time.

For example,

	# ./installaccess -rolling_upgrade

											   Veritas Access 7.3.1.001 Rolling Upgrade Program

	Copyright (c) 2018 Veritas Technologies LLC.  All rights reserved.  Veritas and the Veritas Logo are trademarks or registered
	trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their
	respective owners.

	The Licensed Software and Documentation are deemed to be "commercial computer software" and "commercial computer software
	documentation" as defined in FAR Sections 12.212 and DFARS Section 227.7202.

	Logs are being written to /var/tmp/installaccess-201804030721cXT while installaccess is in progress.

	Enter the system name of the cluster on which you would like to perform rolling upgrade [q,?] (fss7310_01)

		Checking communication on fss7310_01 .................................................................................... Done
		Checking rolling upgrade prerequisites on fss7310_01 .................................................................... Done

											   Veritas Access 7.3.1.001 Rolling Upgrade Program

	Cluster information verification:

			Cluster Name: fss7310

			Cluster ID Number: 61886

			Systems: fss7310_01 fss7310_02 fss7310_03 fss7310_04

	Would you like to perform rolling upgrade on the cluster? [y, n, q] (y)

	Rolling upgrade phase 1 upgrades all VRTS product packages except non-kernel packages.
	Rolling upgrade phase 2 upgrades all non-kernel packages including: VRTSvcs VRTScavf VRTSvcsag VRTSvcsea VRTSvbs VRTSnas

		Checking communication on fss7310_02 .................................................................................... Done
		Checking rolling upgrade prerequisites on fss7310_02 .................................................................... Done
		Checking communication on fss7310_03 .................................................................................... Done
		Checking rolling upgrade prerequisites on fss7310_03 .................................................................... Done
		Checking communication on fss7310_04 .................................................................................... Done
		Checking rolling upgrade prerequisites on fss7310_04 .................................................................... Done

		Checking the product compatibility of the nodes in the cluster .......................................................... Done

	Rolling upgrade phase 1 is performed on the system(s) fss7310_03. It is recommended to perform rolling upgrade phase 1 on the
	remaining system(s) fss7310_01 fss7310_02 fss7310_04.

	Would you like to perform rolling upgrade phase 1 on the recommended system(s)? [y, n, q] (y) n

	Do you want to quit without phase 1 performed on all systems? [y, n, q] (n) n

	Enter the system names separated by spaces on which you want to perform rolling upgrade: [q,?] fss7310_02
	
5. If file systems are online during the upgrade, make sure that recovery has finished before starting the
   upgrade of the next node.

	To check the recovery progress, run the command below:
		# vxtask list

	Recovery is triggered roughly 3-5 minutes after a node joins the cluster.
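
	A simple way to wait for recovery to finish, assuming "vxtask list" prints a single header line when no tasks are active (a sketch, not a product command):
		# until [ -z "$(vxtask list | sed '1d')" ]; do sleep 60; done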

6. Before starting the upgrade, make sure that none of the services is in the "FAILED/FAULTED/W_ONLINE" state.
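
	One way to check is the standard VCS status summary (shown for illustration; empty output means no service is in those states):
		# hastatus -sum | egrep "FAILED|FAULTED|W_ONLINE"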
	
7. A fresh installation can also be performed using this patch. For more details, refer to:
		https://origin-www.veritas.com/content/support/en_US/doc/130196629-130196633-1


SUMMARY OF FIXED ISSUES:
-----------------------------------------
Patch ID: 7.3.1.001

IA-9843			"storage fs create" takes a long time to create file systems.
IA-9839			"storage fs list" takes a long time to list all the file systems.
IA-9838			vxprint/vxdisk commands run slowly.
IA-11243		"storage fs checkmirror" takes longer in large environments.
IA-10216		GUI discovery takes a long time, affecting other system operations.
IA-10973		CLISH commands hang when a private NIC fails.
IA-10942		Linux network OS tunables are not persistent across reboots.
IA-11338		File system creation fails if the number of volume objects being created is very high.
IA-9840			"Cluster reboot all" leaves the FSS cluster in an inconsistent state.
IA-11405		Some plexes in the volumes may remain in the IOFAIL state after a reboot.
IA-10946		NIC failure events were not recorded in event monitoring.
IA-11237		Inconsistent event monitoring for NODE offline/online events.
IA-10375		Unable to online an IP address on a newly added node if any file system has a quota set on it.
IA-11329		Add node fails if the existing cluster has VLAN and bond configured.
IA-11058		Recursive empty directories created in /shared/knfsv4 after a node reboots multiple times.
IA-11034		Striped-mirrored volumes are created with a DCO by default.
IA-11051		User is not able to set WORM retention.
IA-11072		Volume recoveries started after cluster stop operations.
IA-11307		User is not able to destroy a file system in an isolated pool.
IA-10379		sosreport is not collected in evidences.
IA-11402		Display disk/plex events in CLISH as in the GUI.
IA-9847			vxddladm addjbod was leading to random devices having udid_mismatch.
IA-11502		Fix corruption issue for erasure coded volumes after cluster restart.


DETAILS OF INCIDENTS FIXED BY THE PATCH
-----------------------------------------

Patch ID: 7.3.1.001
	

* TRACKING ID: IA-9843

ONE_LINE_ABSTRACT: "storage fs create" taking lot of time to create file systems.

SYMPTOM : "storage fs create" taking lot of time to create file systems.

DESCRIPTION :
The time taken for command “storage fs create” was increasing as the number of File systems increase
because of redundant code.

RESOLUTION:
Optimized the "storage fs create" operation to reduce time taken by storage fs create.


* TRACKING ID: IA-9839

ONE_LINE_ABSTRACT: "storage fs list" taking lot of time to list all the file systems.

SYMPTOM: "storage fs list" taking lot of time to list the file systems.

DESCRIPTION: 
There was lot of redundant code which was invoking lot of back-end commands to fetch the data.
This was causing "storage fs list" to take long time.

RESOLUTION:
Code optimized to make "storage fs list" command to run faster.

* TRACKING ID: IA-9838

ONE_LINE_ABSTRACT: vxprint/vxdisk commands run slowly.

SYMPTOM: vxprint/vxdisk commands run slowly.

DESCRIPTION:
These internal commands took longer to run because they fetched unnecessary records.

RESOLUTION:
Optimized the commands to fetch only the required records.

* TRACKING ID: IA-11243

SYMPTOM : "Storage fs checkmirror"  taking longer in large environments.

DESCRIPTION:
There was lot of redundant code which was invoking lot of back-end commands to fetch the data.
This was causing "storage fs checkmirror" to take long time.

RESOLUTION:
Code is optimized to run "storage fs checkmirror" faster.

* TRACKING ID: IA-10216

ONE_LINE_ABSTRACT: GUI discovery takes a long time, affecting other system operations.

SYMPTOM: GUI discovery takes a long time, affecting other system operations.

DESCRIPTION:
GUI operations ran for a very long time, causing other CLISH commands to run slowly.

RESOLUTION:
Improved GUI discovery performance and optimized GUI operations to reduce the time they take.


* TRACKING ID: IA-10973

ONE_LINE_ABSTRACT: CLISH commands hang when a private NIC fails.

SYMPTOM: CLISH commands hang when a private NIC fails.

DESCRIPTION:
CLISH commands checked node connectivity using the status of the nodes in the cluster. If the private NIC on which the IP address is plumbed goes down,
communication between the nodes is lost and the commands hang.

RESOLUTION:
Changed the logic to check the status of the private NIC rather than the node status.
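
For illustration, the link state of a NIC can be read directly from sysfs (the interface name below is a placeholder for the private NIC):

	# cat /sys/class/net/priveth0/operstate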


* TRACKING ID: IA-10942

ONE_LINE_ABSTRACT: Linux network OS tunables are not persistent across reboots.

SYMPTOM: Linux network OS tunables are not persistent across reboots.

DESCRIPTION:
The network tunables on the cluster need to be changed for better performance, but the changes were not persistent across reboots:
the Access init scripts reset the tunables to their default values.

RESOLUTION:
Modified the code so that the Access init scripts set the tunables to the values recommended for better performance in an FSS environment.
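
As general background, a runtime sysctl change is lost on reboot unless it is also persisted; a generic illustration (the tunable and value are examples, not the values Access applies):

	# sysctl -w net.core.rmem_max=16777216
	# echo "net.core.rmem_max = 16777216" > /etc/sysctl.d/98-example.conf
	# sysctl -p /etc/sysctl.d/98-example.conf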


* TRACKING ID: IA-11338

ONE_LINE_ABSTRACT: File system creation fails if the number of volume objects being created is very high.

SYMPTOM: File system creation failed with a "memory allocation" error.

DESCRIPTION:
If the number of volume objects created in the Access environment is very high, an internal memory limit may be reached, leading to a memory allocation failure for the file system being created.

RESOLUTION:
Modified the internal memory limit so that file system creation does not fail with a memory allocation error.



* TRACKING ID: IA-9840
ONE_LINE_ABSTRACT: "Cluster reboot all" leaves the FSS cluster in an inconsistent state.

SYMPTOM: "Cluster reboot all" leaves the cluster in an inconsistent state.

DESCRIPTION:
When all the nodes in the environment are rebooted, issues with the startup scripts and service group dependencies prevented
cluster services from coming online in the proper order, leaving the cluster in an inconsistent state. Because of this issue,
many services ended up in the W_ONLINE/FAILED/FAULTED state after "cluster reboot all".

RESOLUTION:
Fixed the service group dependencies and startup scripts to bring cluster services online in the proper order.

* TRACKING ID: IA-11405

ONE_LINE_ABSTRACT: Some plexes in the volumes may remain in the IOFAIL state after a reboot.

SYMPTOM: Some plexes in the volumes may remain in the IOFAIL state after a reboot.

DESCRIPTION:
When one of the nodes in an FSS cluster is rebooted, plexes need to be synced up when the node comes back.
Because of a bug in recovery, some plexes remained in the IOFAIL state.

RESOLUTION:
Fixed the issue by correctly triggering recovery for the failed plexes.


* TRACKING ID: IA-10946

ONE_LINE_ABSTRACT: NIC failure events were not recorded in event monitoring.

SYMPTOM: NIC failure events were not recorded in event monitoring.

DESCRIPTION: The NIC failure event was displayed neither in the GUI nor in CLISH.

RESOLUTION:
Made code changes to display the NIC failure event in both the GUI and CLISH.

* TRACKING ID: IA-11237
ONE_LINE_ABSTRACT: Inconsistent event monitoring for NODE offline/online events.

SYMPTOM: NODE offline/online events are not displayed in CLISH but are shown in the GUI.

DESCRIPTION:
Event reporting was missing from the CLISH event monitoring framework.

RESOLUTION:
Modified the code to make the event monitoring framework consistent across the GUI and CLISH.

* TRACKING ID: IA-10375

ONE_LINE_ABSTRACT: Unable to online an IP address on a newly added node if any file system has a quota set on it.

SYMPTOM:
Unable to online an IP address on a newly added node if any file system has a quota set on it.

DESCRIPTION:
The IP address did not come online on the newly added node if any file system had user and group quotas set before the node was added.

RESOLUTION:
Made code changes to update the VCS configuration file while adding the new node.

* TRACKING ID: IA-11329

ONE_LINE_ABSTRACT:
Add node fails if the existing node has VLAN and bond configured.

SYMPTOM:
Add node may fail if the existing cluster has bond and VLAN configured.

DESCRIPTION:
During the add-node operation, networking was not configured correctly; as a result, the operation may fail or
networking may not be configured correctly on the newly added node.

RESOLUTION:
Fixed the add-node operation to perform network configuration correctly.

* TRACKING ID: IA-11058 

ONE_LINE_ABSTRACT: Recursive empty directories created in /shared/knfsv4 after a node reboots multiple times.

SYMPTOM:
Recursive empty directories are created in /shared/knfsv4 after a node reboots multiple times.

DESCRIPTION:
Directories were force-copied without checking whether the destination existed.
If the destination directory already exists while force-copying (cp -rf src_dir dest_dir), the entire src_dir is
copied inside dest_dir; dest_dir then contains its original contents as well as src_dir and all of its subdirectories.
This results in a nested subdirectory structure.

RESOLUTION:
Modified the code to check whether the destination directory exists before copying.
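
A minimal shell sketch of this behaviour and one way to avoid the nesting (directory names are placeholders, not the product's actual code):

	# If dest_dir does not exist, cp -rf creates it as a copy of src_dir;
	# if dest_dir already exists, src_dir is copied inside it, nesting the tree.
	if [ -d "$dest_dir" ]; then
		cp -rf "$src_dir/." "$dest_dir"	# copy contents only, avoiding nesting
	else
		cp -rf "$src_dir" "$dest_dir"
	fi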


* TRACKING ID: IA-11034

ONE_LINE_ABSTRACT: Striped-mirrored volumes are created with a DCO by default.
SYMPTOM: Striped-mirrored volumes are created with a DCO by default.
DESCRIPTION: When creating a file system with a mirrored configuration, volumes were created with a DCO and the detach map activated.
RESOLUTION: Volumes are now created with logtype=none, so no DCO is created.
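
In VxVM terms, the change corresponds to passing logtype=none at volume creation; an illustrative vxassist invocation (disk group, volume name, and size are placeholders):

	# vxassist -g testdg make testvol 10g layout=stripe-mirror logtype=none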

* TRACKING ID: IA-11051

ONE_LINE_ABSTRACT: User is not able to set WORM retention.
SYMPTOM: User is not able to set WORM retention.
DESCRIPTION: Setting WORM retention for a particular directory via CLISH fails with a stack trace.
RESOLUTION: An undefined variable has been fixed so that WORM retention can be set.

* TRACKING ID: IA-11072

ONE_LINE_ABSTRACT: Volume recoveries started after cluster stop operations.
SYMPTOM: Volume recoveries started after cluster stop operations.
DESCRIPTION: In cases where kernel packages must be updated, a clean way is needed to bring the cluster to a stop and perform the maintenance activity.
RESOLUTION: Added a CLISH command that can stop either the entire cluster or just one node: "cluster stop nodename|all".
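
For example (the node name is taken from the sample transcript above):

	cluster stop all
	cluster stop fss7310_01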


* TRACKING ID: IA-11307

ONE_LINE_ABSTRACT: User is not able to destroy a file system in an isolated pool.
SYMPTOM: User is not able to destroy a file system in an isolated pool.
DESCRIPTION: An unknown error is displayed when trying to destroy a file system in an isolated pool.
RESOLUTION: Fixed the code so that a file system in an isolated pool can be destroyed.

* TRACKING ID: IA-10379
ONE_LINE_ABSTRACT: sosreport is not collected in evidences.
SYMPTOM: sosreport is not collected in evidences.
DESCRIPTION: When collecting debug information, the sosreport was not collected as part of the evidence.
RESOLUTION: Fixed the code to collect the sosreport.

* TRACKING ID: IA-11402

ONE_LINE_ABSTRACT: Display disk/plex events in CLISH as in the GUI.

SYMPTOM:
Disk/plex events were visible only in the GUI and not in CLISH.

DESCRIPTION:
CLISH did not report events for disk offline/online and plex failure, while similar events were visible in the GUI, which was inconsistent behaviour.

RESOLUTION:
Made code changes to display disk/plex events in CLISH as in the GUI.


* TRACKING ID: IA-9847

ONE_LINE_ABSTRACT: vxddladm addjbod was leading to random devices having udid_mismatch.

SYMPTOM:
After executing the vxddladm addjbod command, random devices had a false udid_mismatch flag.

DESCRIPTION:
Because of new changes added to support the "localdisks=yes" option, a garbage value was appended to the UDID of the device. This led to the on-disk UDID and the ASL UDID being inconsistent, resulting in the udid_mismatch flag.

RESOLUTION:
Made code changes to avoid appending the garbage value to the UDID.
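
To check whether any device currently carries the flag (illustrative; udid_mismatch appears in the STATUS column of vxdisk output):

	# vxdisk -o alldgs list | grep udid_mismatch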

* TRACKING ID: IA-11502	

ONE_LINE_ABSTRACT: Fix corruption issue for erasure coded volumes after cluster restart.

SYMPTOM: Data on an erasure coded volume may get corrupted after restarting the cluster.

DESCRIPTION:
After a cluster restart, invalid log entries might be replayed during the log replay operation, resulting in
data corruption.

RESOLUTION:
Made code changes to avoid flushing invalid log entries during log replay, preventing data corruption.


KNOWN ISSUES
-----------------------------------------
* TRACKING ID: IA-11385

SYMPTOM: Rolling upgrade may fail if the CIFS server is in the online state.

WORKAROUND:
Stop the CIFS server before starting the upgrade.
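
The CIFS server can be stopped from CLISH; the command below follows the Veritas Access CLISH convention and should be verified against your release's documentation:

	CIFS> server stop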

* TRACKING ID: IA-11427

SYMPTOM: The GUI does not display any data after the upgrade operation, with the error "License is not installed".

WORKAROUND:
To resolve this issue, execute the following command on the node where the ManagementConsole service group is online:

	# /opt/VRTSnas/pysnas/bin/isaconfig