Veritas™ Services and Operations Readiness Tools (SORT)


Custom Reports Using Data Collectors

Risk Assessment Checklist Sample Report for

Product: InfoScale Availability

Platform: Linux

Product version: 7.0

Product component: All

Check category: All

Summary
VCS HAFD apache configuration file
VCS HAFD apache owner
VCS HAFD application checksum
VCS HAFD application program
VCS HAFD application user
Verify I/O fencing for cluster
SFHA Kernel module consistency
License keys Consistency
LLT links full duplex setting
LLT links high priority and private link
LLT link count
Package Consistency
Fencing Configuration
Fencing CP server Configuration
LLT Link jumbo frame setting
LLT Link MTU check
LLT configuration- cluster ID
LLT links cross-connection
LLT Link speed autonegotiation and MAC address check
System architecture type
Time synchronization
Temporary license keys
VCS critical resources
VCS HAFD disks
VCS HAFD VxVM license
VCS HAFD disk UDID comparison
VCS HAFD VxVM components
VCS HAFD DNS Domain Information Groper (DIG)
VCS HAFD DNS keyfile
VCS HAFD DNS master
VCS faulted agents
VCS faulted resources
VCS HAFD IP device
VCS HAFD IP route
VCS HAFD VxFS FsckOpt
VCS HAFD mount point availability
VCS HAFD mount point configuration
VCS HAFD mount point existence
VCS HAFD VxFS license
VCS HAFD NFS lock directory
VCS HAFD NFS configuration
VCS HAFD NIC
VCS ToleranceLimit
VCS HAFD Oracle home
VCS HAFD Oracle owner
VCS HAFD Oracle PFile
VCS HAFD process program
VCS HAFD share directory exists
VCS HAFD triggers checksum
VCS HAFD triggers program
VCS disk connectivity
VCS duplicate disk group name
VCS free swap space
VCS GAB Startup Configuration Check
VCS LLT startup configuration check
VCS OS version and patch
VCS Cluster ID
VCS ClusterAddress
VCS ClusterService OnlineRetryLimit
VCS configuration
VCS GAB jeopardy
VCS Cluster read-only status
VCS SysName
VCS unique cluster name
VCS Disabled resources
VCS frozen groups
VCS virtual host attributes
Verify support package
Verify software patch level
VCS IfconfigTwice attribute
VCS NetworkHosts attribute
Input/output wait state

Check category: Availability


Check description: Checks whether HttpDir and ConfFile exist.


Check procedure:

  • Checks whether the specified HttpDir directory is a valid directory on the target cluster systems.
  • Checks whether the specified ConfFile file exists on the target cluster systems.


Check recommendation: Make sure that HttpDir is a valid directory and that the ConfFile file exists on the clustered system.
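
The two existence tests can be reproduced by hand on each node. The following is a minimal sketch, not the data collector's actual implementation; the function name check_apache_paths is hypothetical, and the two arguments stand for the values of the HttpDir and ConfFile attributes of the Apache resource.

```shell
# Hypothetical helper: succeeds only if the directory and the file both exist.
# $1 = value of HttpDir, $2 = value of ConfFile
check_apache_paths() {
    rc=0
    if [ ! -d "$1" ]; then
        echo "FAIL: $1 is not a directory"
        rc=1
    fi
    if [ ! -f "$2" ]; then
        echo "FAIL: $2 does not exist"
        rc=1
    fi
    return $rc
}
```

Run it with the attribute values from your own configuration, for example: check_apache_paths /path/to/httpd_dir /path/to/httpd.conf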


Learn More...

Action taken for Apache agent

Check category: Availability


Check description: Determines whether the resource owner defined by the User agent attribute exists.


Check procedure:

  • Check whether the user has valid UNIX login on the clustered system.


Check recommendation: Make sure that the User agent attribute specifies a valid UNIX account on the clustered system. To determine if the user name is valid, enter the following command:
# /usr/bin/id user_name
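
To script the same test, the exit status of id can be used directly. This is a small sketch under the assumption that id is on the PATH; the function name user_exists is hypothetical.

```shell
# Hypothetical helper: succeeds if the account named in $1 exists.
# id exits with a non-zero status when the user name is unknown.
user_exists() {
    id "$1" >/dev/null 2>&1
}
```

For example: user_exists user_name || echo "user_name is not a valid account"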


Learn More...

Action taken for Apache agent

Check category: Availability


Check description: Compares the checksums of the executable files on all cluster nodes. The node on which the application is currently running (ONLINE) is assumed to be the canonical copy. The check is skipped on the node where the application is currently running. On all other cluster nodes, if the checksum differs, the check fails. If the application is not running (ONLINE) on any node, the check is skipped on all nodes.


Check procedure:

  • Fetches the list of applications configured for monitoring on the target cluster systems.
  • Checks whether the binaries specified for the StartProgram, StopProgram, and MonitorProcess attributes of the application resource exist on the target systems.
  • Verifies that the checksums of the specified binaries are the same as the checksums on the system where the group is online.


Check recommendation: The checksum of each executable file should be the same on all nodes. Identify the definitive and correct executable file on one of the cluster nodes, then synchronize the files on the remaining failover nodes.
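
A manual comparison might look like the sketch below. The report does not say which checksum algorithm the check uses, so the POSIX cksum utility is an assumption here, and the function names file_crc and same_checksum are hypothetical.

```shell
# Hypothetical helper: print the POSIX cksum CRC of the file named in $1.
# Reading from stdin keeps the file name out of the output.
file_crc() {
    cksum < "$1" | awk '{ print $1 }'
}

# Succeeds if the local file $1 has the reference checksum $2, for example
# the value computed on the node where the application is ONLINE.
same_checksum() {
    [ "$(file_crc "$1")" = "$2" ]
}
```

Compute the reference value with file_crc on the online node, then run same_checksum on each failover node with that value as the second argument.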


Learn More...

Attributes required for configuring Application agent

Check category: Availability


Check description: Checks whether the application binaries exist and are executable.


Check procedure:

  • Fetches the list of application binaries that are specified for monitoring in the configuration on the target cluster systems.
  • Checks whether the application binaries exist on the target cluster systems.
  • Verifies whether the application binaries are executable on all the target cluster systems.


Check recommendation: Make sure that the scripts specified in the cluster configuration exist and that they are executable on all systems in the cluster.


Learn More...

Attributes required for configuring Application agent

Check category: Availability


Check description: Checks whether the application user account exists.


Check procedure:

  • Retrieves the user information for the applications configured on the target cluster systems.
  • Verifies that the user information is valid and that the user exists.


Check recommendation: Make sure that the application user has a valid UNIX login, and that the account is enabled for shell access.


Learn More...

Attributes required for configuring Application agent

Check category: Availability


Check description: Verifies whether I/O fencing is properly configured for the cluster.


Check procedure:

  • Determines the fencing mode using the 'vxfenadm' command and verifies whether the fencing mode is set to SCSI-3, SYBASE, CPS, or customized.


Check recommendation: If this check fails, either I/O fencing is not running on the system or the fencing mode is not configured properly. When using SFCFS, configure I/O fencing with the fencing mode that the product requires, to avoid VxFS file system corruption.


Learn More...

I/O Fencing for SFCFS

Check category: Availability


Check description: Checks that SFHA Kernel modules loaded across all the nodes in a cluster are consistent.


Check procedure:

  • Identifies the SFHA Kernel modules loaded on all the nodes in a cluster.
  • Verifies that the SFHA Kernel modules loaded are consistent across all the nodes in the cluster.


Check recommendation: Ensure that SFHA Kernel modules loaded on all the nodes in a cluster are consistent. Inconsistent SFHA Kernel modules can cause errors in application fail-over.



Check category: Availability


Check description: Compares the license key, license key type, and license key version across cluster nodes and highlights any inconsistencies. This check does not currently compare the individual feature bits in the license keys. This can result in the following conditions:
1. The check does not currently distinguish between Standard- and Enterprise-level license keys for Storage Foundation for Oracle and Storage Foundation HA for Oracle. This can result in a check that passes even when the license key enablement is not identical.
2. The check does not currently detect whether VVR is license key enabled across all cluster nodes. This can result in a check that passes even when the license key enablement is not identical.
3. The check does currently distinguish between Storage Foundation + VCS and Storage Foundation HA, even though the features enabled by the license keys are identical. This results in a check that fails even when the license key enablement is identical.


Check procedure:

  • Identifies the license keys, license key types and license key versions installed on all the nodes in a cluster.
  • Verifies that the license keys, license key types and license key versions installed are consistent across all the nodes in the cluster.


Check recommendation: Ensure that license keys, license key types and license key versions installed on all the nodes in a cluster are consistent. Inconsistent license keys, license key types and license key versions installed can cause errors in application fail-over.



Check category: Availability


Check description: Checks whether all the LLT links in the system are full-duplex. This check is skipped for all bonded interfaces.


Check procedure:

  • Identifies all the LLT links configured on the system.
  • Checks whether all the LLT links are full duplex on the system.


Check recommendation: All the LLT links configured in the system should be full duplex.



Check category: Availability


Check description: Checks for the availability of LLT high-priority links and verifies if they are in the private network.


Check procedure:

  • Verifies whether the LLT Links are configured on a private link and as high priority by using the lltstat command.


Check recommendation: Configure at least two high-priority LLT links in the private network, as LLT configuration requires.



Check category: Availability


Check description: Checks whether at least two links are configured as LLT links.


Check procedure:

  • Identifies all the LLT links configured on the system.
  • Checks whether the number of LLT links is greater than or equal to two.


Check recommendation: To ensure HA, you must have at least two links configured as LLT links.
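
One way to count the configured links by hand is to count the link directives in the LLT configuration file. The sketch below assumes the usual /etc/llttab layout, in which each high-priority link is declared on a line whose first word is link; lines that begin with link-lowpri are deliberately not counted, which may differ from what the check itself does. The function name count_llt_links is hypothetical.

```shell
# Hypothetical helper: count "link" directives in an llttab-style file.
# $1 = path to the file (defaults to /etc/llttab)
count_llt_links() {
    awk '$1 == "link" { n++ } END { print n + 0 }' "${1:-/etc/llttab}"
}
```

For example: [ "$(count_llt_links)" -ge 2 ] || echo "fewer than two LLT links configured"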



Check category: Availability


Check description: Checks that packages installed across all the nodes in a cluster are consistent.


Check procedure:

  • Identifies the packages installed on all the nodes in a cluster.
  • Verifies that the package installed and its version are consistent across all the nodes in the cluster.


Check recommendation: Ensure that packages installed on all the nodes in a cluster are consistent and package versions are identical. Inconsistent packages can cause errors in application fail-over.



Check category: Availability


Check description: Checks whether the fencing module is configured properly on all nodes in the cluster.


Check procedure:

  • Identifies the product installed - Storage Foundation for Oracle RAC (SF Oracle RAC) or Storage Foundation Cluster File System for Oracle RAC (SFCFSRAC).
  • In the case of SF Oracle RAC, checks whether I/O Fencing is enabled.
  • In the case of SFCFSRAC, checks whether I/O Fencing is disabled.


Check recommendation: You must configure fencing for SF Oracle RAC. It is recommended that you configure fencing in enabled mode for SF Oracle RAC and in disabled mode for SFCFSRAC.



Check category: Availability


Check description: Checks configuration for coordination point server based fencing on all nodes in the cluster.


Check procedure:

  • Checks ping status for coordination point server from all nodes in the cluster.


Check recommendation: You must configure fencing for Storage Foundation for Oracle RAC (SF Oracle RAC). It is recommended that you configure fencing in enabled mode for SF Oracle RAC.



Check category: Availability


Check description: Checks whether all the Low Latency Transport (LLT) links in the Storage Foundation for Oracle RAC (SF Oracle RAC) node have an MTU size between 1500 and 9000 bytes.


Check procedure:

  • Identifies all the LLT links configured in the system.
  • Checks whether all the LLT links have the same jumbo frame sizes.


Check recommendation: All the LLT links in the node should have an MTU size between 1500 and 9000 bytes.



Check category: Availability


Check description: Checks whether all the Low Latency Transport (LLT) links in the Storage Foundation for Oracle RAC (SF Oracle RAC) cluster have an identical MTU size.


Check procedure:

  • Identifies all the LLT links configured in the system.
  • Checks whether all the LLT links have the same MTU.


Check recommendation: All the nodes in the cluster should have the same MTU size.



Check category: Availability


Check description: Checks whether the cluster ID is identical on all nodes in VCS and SF Oracle RAC clusters.


Check procedure:

  • Determines the cluster ID in each system in the SF Oracle RAC cluster.
  • Checks whether the cluster ID is identical across all nodes in the SF Oracle RAC cluster.


Check recommendation: The cluster IDs should be identical on all nodes in the SF Oracle RAC cluster.



Check category: Availability


Check description: Checks whether the LLT links in the system are cross-connected.


Check procedure:

  • Identifies all the LLT links configured on the system.
  • Checks whether the LLT links are cross-connected (multiple links connected to a single switch or connected directly).


Check recommendation: It is recommended that the LLT links should not be cross-connected.



Check category: Availability


Check description: Checks speed autonegotiation and MAC address settings for all links.


Check procedure:

  • Verifies whether the speed settings for all LLT links in the system are the same.
  • Verifies whether the autonegotiation settings for all LLT links in the system are the same.
  • Verifies if the MAC addresses of all LLT links in the system are unique.


Check recommendation: All links should have the same speed and autonegotiation setting, and each link should have a unique MAC address.



Check category: Availability


Check description: Checks whether the system architecture type is identical across the SF Oracle RAC cluster.


Check procedure:

  • Determines the system architecture type of each system in the SF Oracle RAC cluster.
  • Verifies whether the system architecture type is the same across all nodes in the cluster.


Check recommendation: All nodes in a cluster must use identical system architectures.



Check category: Availability


Check description: Checks whether the date and time are synchronized across the cluster.


Check procedure:

  • Determines the date and time set on each system in the SF Oracle RAC cluster.
  • Verifies whether the date and time are synchronized across all nodes in the cluster.


Check recommendation: It is recommended that the date and time settings are identical on all cluster nodes.



Check category: Availability


Check description: Checks for temporary product license keys that are about to expire.


Check procedure:

  • Identifies products with temporary license keys.
  • Checks whether there is a permanent license key installed for that product.


Check recommendation: Ensure that valid license keys are installed for Storage Foundation / InfoScale products.



Check category: Availability


Check description: Checks whether any VCS resource is marked as non-critical (Critical=0).


Check procedure:

  • Retrieves the details of all the resources of the configured groups.
  • Verifies that at least one of the resources is marked as critical.


Check recommendation: A group cannot fail over to an alternate system unless it has at least one resource marked as critical. Therefore, to ensure maximum high availability, log in as root and execute the following command to set the affected resources to critical:
# hares -modify resource_name Critical 1
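
To find the affected resources first, the output of hares -display can be filtered. The sketch below assumes the tabular output of hares -display -attribute Critical, with a header line that begins with # and the columns Resource, Attribute, System, and Value; verify the layout on your release before relying on it. The function name noncritical_resources is hypothetical.

```shell
# Hypothetical helper: read "hares -display -attribute Critical" style output
# on stdin and print the name of every resource whose Critical value is 0.
noncritical_resources() {
    awk '$1 !~ /^#/ && $2 == "Critical" && $NF == "0" { print $1 }'
}
```

For example: hares -display -attribute Critical | noncritical_resources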


Learn More...

About critical and non-critical resources

Check category: Availability


Check description: Checks whether all the disks in the VxVM disk group are visible on the cluster node.


Check procedure:

  • Fetches all the disk groups configured on the target cluster nodes.
  • Discovers all the disks in the disk group.


Check recommendation: Make sure that all VxVM disks have been discovered. Do the following:
1. Run an operating system-specific disk discovery command such as lsdev (AIX), ioscan (HP-UX), fdisk (Linux), or format or devfsadm (Solaris).
2. Run the following command:
# vxdctl enable


Learn More...

Verifying the disk visibility using vxfenadm utility
Disk Group agent notes

Check category: Availability


Check description: Checks for valid Volume Manager (VxVM) licenses on the cluster systems.


Check procedure:

  • Uses the vxlicrep command to verify whether a valid Volume Manager (VxVM) license exists on the target cluster system.


Check recommendation: Use the /opt/VRTS/bin/vxlicinst utility to install a valid VxVM license key.


Learn More...

Installing a VCS license using vxlicinst utility
Troubleshooting for validating license keys
Disk Group agent notes

Check category: Availability


Check description: On the local system where the DiskGroup resource is offline, it checks whether the unique disk identifiers (UDIDs) for the disks match those on the online systems.


Check procedure:

  • Determines the UDID of the disks in the disk group on the local cluster system and system where the disk group is online.
  • Checks whether the discovered UDIDs of the disks match.


Check recommendation: Make sure that the UDIDs for the disks on the cluster nodes match. To find the UDID for a disk, enter the following command:
# vxdisk -s list disk_name
Note: The check does not handle SRDF replication. For SRDF replication, use the 'clearclone=1' attribute (SFHA 6.0.5 onwards), which clears the clone flag and updates the disk UDID.


Learn More...

Disk Group agent notes

Check category: Availability


Check description: Verifies that all the disks in the disk group in a campus cluster have site names. Also verifies that all volumes on the disk group have the same number of plexes on each site in the campus cluster.


Check procedure:

  • Verifies whether the disk group has the proper campus cluster configuration.
  • Verifies whether the disk group is online anywhere in the cluster.
  • Fetches the plex and volume information of all the disk group resources that are configured on target cluster systems.
  • Verifies that the plexes and volumes are the same on all sites of the campus cluster.


Check recommendation: Make sure that the site name is added to each disk in a disk group. To verify the site name, enter the following command:
# vxdisk -s list disk_name
On each site in the campus cluster, make sure that all volumes on the disk group have the same number of plexes. To verify the plex and subdisk information of a volume created on a disk group, enter the following command:
# vxprint -g disk_group


Learn More...

Setting up a campus cluster configuration

Check category: Availability


Check description: Checks if the dig binary is present and is executable on the system.


Check procedure:

  • Fetches the details of the DNS name server using the dig tool.
  • Dynamic zone updates are done using the nsupdate command as per the configuration of the DNS resource on the target cluster systems.


Check recommendation: Make sure that the dig binary is present in at least one of the following locations:
* /usr/bin/dig
* /bin/dig
* /usr/sbin/dig

To make the dig binary executable, enter the following command:
# chmod +x dig_binary_path


Learn More...

Action taken for DNS agent

Check category: Availability


Check description: Checks whether the Transaction Signature (TSIG) key file that is specified in the cluster configuration exists, is readable, and has a non-zero size.


Check procedure:

  • Retrieves the keyfile details from the DNS resource configuration on the target cluster systems.
  • Verifies that the specified keyfile exists and is not a zero-byte file.


Check recommendation: Make sure that the TSIG key file exists and is a non-zero sized file. To make the file readable, enter the following command:
# chmod +r absolute_key_file_path
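
The three conditions in the check description (exists, is readable, has a non-zero size) map onto standard shell file tests. A minimal sketch, with the hypothetical function name keyfile_ok:

```shell
# Hypothetical helper: succeeds if $1 is a regular file that is
# readable by the current user and is not empty.
keyfile_ok() {
    [ -f "$1" ] && [ -r "$1" ] && [ -s "$1" ]
}
```

For example: keyfile_ok absolute_key_file_path || echo "TSIG key file is missing, unreadable, or empty"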


Learn More...

Action taken for DNS agent

Check category: Availability


Check description: Checks if stealth masters can reply to a Start of Authority (SOA) query for the configured domain.


Check procedure:

  • Retrieves the details about the DNS master server from the DNS resource configuration on the target cluster systems.
  • Verifies that the DNS master server is configured properly and is reachable from target cluster systems.


Check recommendation: Make sure that you configure the StealthMasters and Domain attributes with the correct values, and that the following SOA query for the domain works properly:
# dig @stealth_master -t SOA domain_name


Learn More...

Action taken for DNS agent

Check category: Availability


Check description: Checks if any VCS agents have faulted and are not running.


Check procedure:

  • Retrieves the state of the agents on the target system nodes.
  • Verifies that agents are running and not faulted.


Check recommendation: VCS resources that belong to a type whose agent has faulted are not monitored. To restart the agent, do the following as root:
1. Start the agent:
# haagent -start Agent -sys node
2. Confirm that the agent has restarted by:
i. Checking the engine log: /var/VRTSvcs/log/engine_A.log
ii. Running the following command:
# ps -ef | grep Agent


Learn More...

Troubleshooting resources

Check category: Availability


Check description: Checks whether any VCS resources are in a FAULTED state.


Check procedure:

  • Fetches the resources from all the systems specified for executing the check.
  • Determines whether the resources are in a FAULTED state.


Check recommendation: A group cannot fail over to a system where the VCS resource has faulted. Fix the problem and use the following command to clear the FAULTED resource state:

# hares -clear resource -sys node


Learn More...

Managing resource faults
Configure Restart Limit attribute for resource
State transitions for a resource

Check category: Availability


Check description: Checks whether the network interface that is specified in the cluster configuration exists on the system.


Check procedure:

  • Retrieves the information for the device on which the IP resources are configured.
  • Verifies that the network device that is specified exists on the target cluster systems.


Check recommendation: In the cluster configuration, make sure you specify the correct network device.


Learn More...

Attributes required for configuring IP agent

Check category: Availability


Check description: Checks whether the route to the IP address exists on the network interface specified in the cluster configuration.


Check procedure:

  • Fetches the route details from the configured IP resources.
  • Verifies that the route exists for the specified IP address on the associated network device.


Check recommendation: On the associated network device, add the route to the specified IP address.


Learn More...

Attributes required for configuring IP agent

Check category: Availability


Check description: Checks whether a valid fsck policy has been specified for all the Mount resources that are in the offline state to automatically recover the file systems.


Check procedure:

  • Retrieves the details of FsckOpt attribute for all the Mount resources.
  • Verifies that the value is set to either -Y or -N.


Check recommendation: Set the FsckOpt attribute for the affected Mount resource to either -Y (fix errors during fsck) or -N (do not fix errors during fsck).


Learn More...

Mount attributes

Check category: Availability


Check description: Checks whether the specified mount point is available for mounting after failover happens.


Check procedure:

  • Fetches the mount point location specified in the mount resource configuration.
  • Checks whether the mount point is already mounted.


Check recommendation: If the mount point is mounted, unmount it. Enter the following command:
# umount mount_point
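
On Linux, whether a mount point is currently in use can be read from the kernel mount table. The sketch below is an illustration only; the function name is_mounted is hypothetical, and it assumes the /proc/mounts layout in which the second field of each line is the mount point. Mount points that contain spaces are escaped in that file and are not handled here.

```shell
# Hypothetical helper: succeeds if $1 appears as a mount point.
# $2 = mount table to read (defaults to /proc/mounts on Linux)
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example: is_mounted mount_point && echo "mount_point is already mounted"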


Learn More...

VxFS file system lock

Check category: Availability


Check description: Verifies that the available mount point is not configured to mount a file system when the system starts.


Check procedure:

  • Fetches the mount point location that is specified in the mount resource configuration.
  • Checks whether there is an entry in the fstab file for the specified mount point.


Check recommendation: On a cluster node, make sure that the operating system-specific file system table file does not contain an entry for the mount point. These files are /etc/filesystems (AIX), /etc/fstab (HP-UX and Linux), and /etc/vfstab (Solaris).
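
On Linux, the file system table can be searched for the mount point while ignoring commented-out lines. This is a sketch under the assumption of the standard /etc/fstab layout (mount point in the second field); the function name fstab_has_entry is hypothetical.

```shell
# Hypothetical helper: succeeds if an active (uncommented) entry for the
# mount point $1 exists. $2 = table to read (defaults to /etc/fstab)
fstab_has_entry() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "${2:-/etc/fstab}"
}
```

For example: fstab_has_entry mount_point && echo "remove the mount_point entry from /etc/fstab"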


Learn More...

Samples of Configuration
Mount agent notes

Check category: Availability


Check description: Checks whether the specified mount point existing on a cluster node is available for mounting.


Check procedure:

  • Fetches the mount point location specified in the mount resource configuration.
  • Verifies that the mount point location exists on the target cluster node.


Check recommendation: Create the specified mount point, and make sure that it is not in use.


Learn More...

Offlining mount resource
Mount agent notes

Check category: Availability


Check description: Checks whether the File System (VxFS) installed on the cluster system where the Mount resource is currently offline has a valid license.


Check procedure:

  • Retrieves the list of target cluster systems that require a VxFS license.
  • Verifies whether a valid license exists on the target cluster systems.


Check recommendation: Use the /opt/VRTS/bin/vxlicinst utility to install a valid VxFS license on the target cluster systems.


Learn More...

Action taken for Mount agent

Check category: Availability


Check description: Checks whether the lock directory specified in the cluster configuration is on shared storage.


Check procedure:

  • Retrieves the lock directory location from the LocksPathName attribute of the NFSRESTART resource configured on the target cluster systems.
  • Verifies that the lock directory exists on shared storage.


Check recommendation: Make sure that the directory specified in the LocksPathName attribute is on shared storage.


Learn More...

Action taken for NFSRestart agent

Check category: Availability


Check description: Verifies that the NFS server does not start automatically when the system starts.


Check procedure:

  • Retrieves the NFS server details from the NFSRESTART resource that is configured on the target cluster systems.
  • Verifies that the NFS server is disabled in the rc.config.d init file.


Check recommendation: In the system configuration file, disable the NFS server so the NFS daemons do not start when the system boots. On Solaris 10 and later, make sure the svcadm command does not start the NFS daemon when the system boots.


Learn More...

Action taken for NFSRestart agent

Check category: Availability


Check description: Checks whether the UP flag is set for the network interface specified in the cluster configuration.


Check procedure:

  • Fetches the network interface details that are specified in the configuration of the target cluster.
  • Verifies that the device has been flagged as ONLINE.


Check recommendation: Make sure that you configure the Device attribute of the NIC resource type to a network interface that is configured on the system with the UP flag set. To set the UP flag on a configured device, use the following command:
Linux:
# ip link set device_name up
Solaris/AIX/HP:
For IPv4:
# ifconfig device_name inet up
For IPv6:
# ifconfig device_name inet6 up
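
On Linux, the UP flag can also be read without parsing command output: the kernel exposes the interface flags as a hexadecimal number in /sys/class/net/device_name/flags, and the UP flag (IFF_UP) is the lowest bit, 0x1. The sketch below is illustrative and is not how the check itself is implemented; the function name nic_is_up is hypothetical.

```shell
# Hypothetical helper: succeeds if the UP flag (IFF_UP, bit 0x1) is set.
# $1 = interface name, $2 = sysfs net directory (defaults to /sys/class/net)
nic_is_up() {
    flags_file="${2:-/sys/class/net}/$1/flags"
    [ -r "$flags_file" ] || return 1
    flags=$(cat "$flags_file")
    # The file holds a hex value such as 0x1003; test the lowest bit.
    [ $(( $flags & 1 )) -ne 0 ]
}
```

For example: nic_is_up device_name || echo "device_name is missing or does not have the UP flag set"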


Learn More...

Attributes required for configuring NIC agent

Check category: Availability


Check description: Checks whether the ToleranceLimit attribute has been set for the VCS NIC resource type.


Check procedure:

  • Retrieves the value of the ToleranceLimit attribute for the VCS NIC, MultiNICA, and MultiNICB resource types.
  • Verifies that the ToleranceLimit attribute is set to a non-zero number.


Check recommendation: Setting the ToleranceLimit to a non-zero value prevents false failover in the case of a spurious network outage. To set the ToleranceLimit for the NIC resource type, log in as root and enter the following command:
# hatype -modify NIC ToleranceLimit n
where n > 0.
Because this command prevents an immediate failover and may compromise the high availability of the affected resource groups, only use this command on transient networks.


Learn More...

About the ToleranceLimit attribute

Check category: Availability


Check description: Checks whether the ORACLE_HOME directory location specified in the cluster configuration exists on the system.


Check procedure:

  • Retrieves the ORACLE_HOME directory location specified in the cluster configuration.
  • Verifies that ORACLE_HOME directory exists on the target cluster systems and is mounted properly.


Check recommendation: Ensure that the target cluster system is configured to mount ORACLE_HOME.


Learn More...

Virtual Firedrill actions for Oracle agent

Check category: Availability


Check description: Checks whether the user ID (UID) and group ID (GID) of the owner specified in the Oracle owner attribute match the UID and GID of the owner on the VCS node.


Check procedure:

  • Verifies that the Oracle home directory exists on the target cluster systems.
  • Verifies that the UID/GID for owner is the same on the target cluster systems and on the system where the group is online.


Check recommendation: Make sure that the UID and GID of the Oracle owner match those specified for the owner on the VCS node.


Learn More...

Virtual Firedrill actions for Oracle agent

Check category: Availability


Check description: Verifies that the parameter file that is specified in the Oracle agent PFile or SPFile attribute exists.


Check procedure:

  • Retrieves the location of the parameter file from the Oracle resource configuration on the target cluster systems.
  • Verifies that the parameter file exists on the target cluster systems.


Check recommendation: Make sure that the parameter file (PFile or SPFile) specified in the cluster configuration exists.


Learn More...

Virtual Firedrill actions for Oracle agent

Check category: Availability


Check description: Identifies and logs the application's checksum. The application is defined in the PathName attribute.


Check procedure:

  • Compares the checksum of the specified program with the checksum of the program on the online system.
  • Verifies that the program has executable permissions set.


Check recommendation: Make sure that the script specified in the PathName attribute value exists, and that it is executable on all systems in the cluster.


Learn More...

Attributes required for configuring Process agent

Check category: Availability


Check description: Checks if the path specified by the PathName attribute exists on the cluster node. If the path does not exist locally, the check determines if a Mount resource with a corresponding mount point is available to ensure that the path is on shared storage.


Check procedure:

  • Retrieves the details of the shared directory specified in the Share resource configuration on the target cluster systems.
  • Verifies that the shared directory exists on the target cluster systems.


Check recommendation: Make sure that the shared directory specified in the Share resource configuration exists either locally or through a Mount resource with a corresponding mount point.


Learn More...

Share agent notes
Action taken for File Share agent

Check category: Availability


Check description: Checks if the checksums of the VCS triggers are the same on all the nodes of the cluster.


Check procedure:

  • Checks if the trigger location and triggers exist in /opt/VRTSvcs/bin/triggers on the target nodes.
  • Verifies that the binaries of the triggers are the same across all the target cluster nodes.


Check recommendation: Verify that the specified binaries in /opt/VRTSvcs/bin/triggers are identical on all nodes in the cluster.


Learn More...

About VCS event triggers

Check category: Availability


Check description: Checks if installed VCS triggers are executable.


Check procedure:

  • Checks if the triggers exist in /opt/VRTSvcs/bin/triggers on the target nodes.
  • Verifies that the binaries in /opt/VRTSvcs/bin/triggers are executable by the root user.


Check recommendation: Ensure that the triggers installed in /opt/VRTSvcs/bin/triggers are executable by the root user.
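
A quick manual version of this check is sketched below. list_nonexecutable is a hypothetical helper, not part of VCS or SORT. It relies on the fact that root can execute a file only if at least one execute bit (user, group, or other) is set, and it assumes a find that supports -maxdepth, as GNU find on Linux does.

```shell
#!/bin/bash
# Hypothetical helper: list regular files in the trigger directory that
# have no execute bit set at all, and so cannot be executed even by root.
list_nonexecutable() {
    local dir=${1:-/opt/VRTSvcs/bin/triggers}
    find "$dir" -maxdepth 1 -type f \
        ! \( -perm -100 -o -perm -010 -o -perm -001 \)
}
```

If the function prints nothing, every trigger file has an execute bit set. A listed file can typically be fixed with chmod.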


Learn More...

About VCS event triggers

Check category: Availability


Check description: Checks whether all the disks are visible to all the nodes in a cluster.


Check procedure:

  • Fetches the shared disks configured for the cluster systems.
  • Validates that the shared storage is visible for all the cluster systems.


Check recommendation: Make sure that all the disks are connected to all the nodes in a cluster. Run operating system-specific disk discovery commands such as lsdev (AIX), ioscan (HP-UX), fdisk (Linux) or devfsadm (Solaris).

If the disks are not visible, connect the disks to the nodes.


Learn More...

VCS behavior on loss of storage connectivity

Check category: Availability


Check description: Checks whether duplicate disk groups are configured on the specified nodes.


Check procedure:

  • Fetches the disk group names configured for the cluster systems.
  • Verifies that the disk groups on a particular target cluster system are unique and no duplicate disk group names are configured.


Check recommendation: To facilitate successful failover, make sure that each disk group name is configured only once on the specified node. To list the disk groups on a system, enter the following command:

# vxdg list
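
Duplicate names can be picked out of that listing with standard text tools. The sketch below assumes that the listing has one header row followed by one row per disk group, with the group name in the first column. duplicate_disk_groups is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read a disk group listing on stdin and print every
# name that appears more than once.
duplicate_disk_groups() {
    # Skip the header row, keep the first (name) column, report repeats.
    awk 'NR > 1 { print $1 }' | sort | uniq -d
}

# Usage on a cluster node (requires VxVM):
#   vxdg list | duplicate_disk_groups
```

Empty output means that no disk group name is repeated in the listing.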


Learn More...

Disk Group agent notes

Check category: Availability


Check description: Checks if free swap space is below the threshold value specified in the sortdc.conf file: !param!HC_VFD_CHK_FREE_SWAP_THRESHOLD!/param!.


Check procedure:

  • Fetches the free swap space present on the cluster system.
  • Checks whether the free swap space is less than the threshold value specified (in the HC_VFD_CHK_FREE_SWAP_THRESHOLD parameter in the sortdc.conf file).


Check recommendation: Increase the swap space by adding an additional swap device.
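
On Linux, the free swap figure can be read from /proc/meminfo. The sketch below is an illustration only: the function names are hypothetical, and the threshold is assumed to be in kilobytes, which may not match the unit that HC_VFD_CHK_FREE_SWAP_THRESHOLD uses in sortdc.conf.

```shell
#!/bin/bash
# Hypothetical helper: print the SwapFree value (in kB) from a meminfo file.
free_swap_kb() {
    local meminfo=${1:-/proc/meminfo}
    awk '$1 == "SwapFree:" { print $2 }' "$meminfo"
}

# Hypothetical helper: succeed (exit 0) when free swap is below the
# threshold, which is assumed here to be expressed in kB.
swap_below_threshold() {
    local threshold_kb=$1 meminfo=${2:-/proc/meminfo} free
    free=$(free_swap_kb "$meminfo")
    [ -n "$free" ] && [ "$free" -lt "$threshold_kb" ]
}
```

For example, swap_below_threshold 1048576 succeeds when less than 1 GB of swap is free on the local system.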


Learn More...

About the HostMonitor daemon

Check category: Availability


Check description: Checks if the GAB_START entry in the GAB configuration file is set to 1.


Check procedure:

  • Reports failure if the GAB_START entry in the GAB configuration file is set to 0.


Check recommendation: Make sure that the GAB_START entry in the GAB configuration file is set to 1 so that the GAB module is enabled and ready to start up after a system reboot.


Learn More...

VCS environment variables

Check category: Availability


Check description: Checks if the LLT_START entry in the LLT configuration file is set to 1.


Check procedure:

  • If the LLT_START entry in the LLT configuration file is set to 0, the check reports failure.


Check recommendation: Make sure that the LLT_START entry in the LLT configuration file is set to 1 so that the LLT module is enabled and ready to start up after a system reboot.
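
The same test applies to both the LLT_START and GAB_START entries. The sketch below assumes that each configuration file contains plain NAME=value lines without quotes or an export keyword. start_entry_enabled is a hypothetical helper, and the file paths in the usage comment are an assumption about a typical Linux installation, not something this report states.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the given variable is set to 1 in a
# file of NAME=value lines. The last assignment wins, as it would if the
# file were sourced by a shell.
start_entry_enabled() {
    local var=$1 file=$2 value
    value=$(awk -F= -v var="$var" '$1 == var { v = $2 } END { print v }' "$file")
    [ "$value" = "1" ]
}

# Usage (paths are assumed, verify them on your system):
#   start_entry_enabled LLT_START /etc/sysconfig/llt || echo "LLT will not start at boot"
#   start_entry_enabled GAB_START /etc/sysconfig/gab || echo "GAB will not start at boot"
```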



Check category: Availability


Check description: Checks whether the nodes in a cluster have the same operating system, operating system version, and operating system patch level. These attributes must be identical on all systems in a VCS cluster.


Check procedure:

  • Determines the OS, OS version, and the OS patches installed on the cluster systems.
  • Checks whether the OS, OS version, and OS patches on all cluster systems are the same.


Check recommendation: Use operating system-specific commands to verify that the nodes in a cluster have the same operating system, version, and patch level. For example, 'uname -a' or 'oslevel' (AIX).


Learn More...

Late Breaking News for Cluster Server Management Console 5.1
Format for engine version
VCS system requirements & Support Matrix

Check category: Availability


Check description: Checks if the VCS cluster ID is a non-zero value.


Check procedure:

  • Fetches the cluster ID N from the target VCS node by parsing the /etc/llttab file for the string set-cluster N.
  • Reports failure if the cluster ID is 0.


Check recommendation: In the /etc/llttab file, set the VCS cluster ID to a unique, non-zero integer less than or equal to 65535.
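
The parsing step described in the procedure can be sketched as follows. The function names are hypothetical. The sketch checks only the range stated in the recommendation (non-zero and at most 65535); it cannot tell whether the ID is unique among the clusters on your network.

```shell
#!/bin/bash
# Hypothetical helper: print the value that follows "set-cluster" in llttab.
get_cluster_id() {
    local llttab=${1:-/etc/llttab}
    awk '$1 == "set-cluster" { print $2; exit }' "$llttab"
}

# Hypothetical helper: succeed when the cluster ID is an integer from 1 to 65535.
cluster_id_valid() {
    local id
    id=$(get_cluster_id "$1")
    case $id in
        ''|*[!0-9]*) return 1 ;;     # missing or not a plain number
    esac
    [ "$id" -ge 1 ] && [ "$id" -le 65535 ]
}
```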


Learn More...

Configuring the basic cluster

Check category: Availability


Check description: The ClusterAddress attribute is a prerequisite for GCO. This check verifies if the ClusterAddress Cluster attribute is set, and it is the same as the virtual IP address in the ClusterService service group.


Check procedure:

  • Executes the command haclus -value ClusterAddress -localclus on the target VCS node to get the virtual address assigned to the cluster.
  • Compares this value against the value of the Address attribute of the webip IP resource of the cluster.
  • Reports failure if they are not identical.


Check recommendation: Set the value of the Address attribute of the webip resource to that of the ClusterAddress Cluster attribute. As root:
1. Get the value of the ClusterAddress Cluster attribute:
# haclus -value ClusterAddress -localclus
2. Modify the Address attribute of the webip resource, where address is the output of the first command:
# haconf -makerw
# hares -modify webip Address address
# haconf -dump -makero


Learn More...

Cluster setup

Check category: Availability


Check description: If the ClusterService service group is configured, this check verifies that its OnlineRetryLimit attribute is set.


Check procedure:

  • Retrieves the configuration details of the ClusterService service group.
  • Verifies that the OnlineRetryLimit is set for the ClusterService service group.


Check recommendation: Set the OnlineRetryLimit for the ClusterService service group. Enter:
# hagrp -modify ClusterService OnlineRetryLimit N
where N >= 1


Learn More...

About the OnlineRetryLimit attribute

Check category: Availability


Check description: Checks whether the existing cluster configuration at the !param!HC_VFD_CHK_VCS_CONFIG_DIR!/param! directory can be used to start VCS on a system.


Check procedure:

  • Fetches the VCS configuration located at the directory specified by the HC_VFD_CHK_VCS_CONFIG_DIR parameter in the sortdc.conf file on the cluster node.
  • Verifies whether the configuration is valid using the hacf -verify command on the specified configuration directory.


Check recommendation: Fix the VCS configuration located at the directory specified by the HC_VFD_CHK_VCS_CONFIG_DIR parameter in the sortdc.conf file on the cluster node.


Learn More...

Creating entry points in scripts
About configuring VCS

Check category: Availability


Check description: Checks if any of the private links in the cluster are in the jeopardy state.


Check procedure:

  • Checks the network connectivity of the target host with the other hosts in the cluster.
  • Checks if any of the private links are in JEOPARDY state.


Check recommendation: 1. Determine the connectivity of this node with the remaining nodes in the cluster. Enter:
# /sbin/lltstat -nvv
If the status is DOWN, this node cannot see that link to the other node(s).
2. Restore connectivity through this private link.
3. Verify that connectivity has been restored. Enter:
# /sbin/gabconfig -a | /bin/grep jeopardy
If this command does not have any output, the link has been restored.


Learn More...

About cluster membership

Check category: Availability


Check description: Checks if the VCS configuration is read-only.


Check procedure:

  • Executes the command 'haclus -value ReadOnly -localclus' on the target VCS node to verify if the configuration is closed (ReadOnly=1).
  • Reports failure if the configuration is not ReadOnly (ReadOnly=0).


Check recommendation: Close the cluster and save any configuration changes. As root, execute:
# haconf -dump -makero


Learn More...

Cluster Attributes

Check category: Availability


Check description: The VCS sysname defined in the /etc/VRTSvcs/conf/sysname file must be identical to the node name defined in the /etc/llttab file against the set-node attribute. This check also verifies that /etc/llthosts is consistent across all nodes of the cluster.


Check procedure:

  • Gets the system name defined in the /etc/VRTSvcs/conf/sysname file of the target VCS node.
  • Gets the node name nodename from the target VCS node, by parsing the /etc/llttab file for the string set-node nodename.
  • Reports failure if the system name and the node name are not identical.


Check recommendation: Make the contents of /etc/VRTSvcs/conf/sysname identical to the node name defined in the /etc/llttab file against the set-node attribute.
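
The comparison in the procedure can be sketched as follows. sysname_matches_llttab is a hypothetical helper. It assumes that set-node in /etc/llttab is followed by a literal node name, and it does not cover the /etc/llthosts consistency part of the check.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the name in the sysname file equals
# the name that follows "set-node" in llttab.
sysname_matches_llttab() {
    local sysname_file=${1:-/etc/VRTSvcs/conf/sysname}
    local llttab=${2:-/etc/llttab}
    local sysname node
    sysname=$(tr -d '[:space:]' < "$sysname_file")   # strip the trailing newline
    node=$(awk '$1 == "set-node" { print $2; exit }' "$llttab")
    [ -n "$sysname" ] && [ "$sysname" = "$node" ]
}
```

Called without arguments on a cluster node, the function reads the two default files named in this check.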


Learn More...

How VCS identifies the local system

Check category: Availability


Check description: Checks if each cluster that is discovered in the set of input nodes has a unique name.


Check procedure:

  • Fetches the cluster names of all the clusters that are discovered in the input cluster nodes.
  • Verifies that the cluster names are unique.


Check recommendation: Cluster names should be unique. Change the cluster names if you plan to set up the Global Cluster Option (GCO) between clusters with identical cluster names.


Learn More...

Setting up a global cluster

Check category: Availability


Check description: Checks whether any VCS resource has been disabled.


Check procedure:

  • Executes hares -display -localclus 2>/dev/null and parses the output.
  • Reports failure if a resource is not enabled (Enabled = 0).


Check recommendation: Enable the VCS resource. Log in as root and execute the following commands:
# haconf -makerw
# hares -modify resource_name Enabled 1
# haconf -dump -makero
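
To find which resources need this treatment, the output parsed in the procedure can be filtered by hand. The sketch below assumes four whitespace-separated columns (resource, attribute, system, value) and a header line that starts with #. disabled_resources is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: read resource attribute rows on stdin and print the
# name of every resource whose Enabled attribute is 0.
disabled_resources() {
    awk '$1 !~ /^#/ && $2 == "Enabled" && $4 == "0" { print $1 }'
}

# Usage on a VCS node:
#   hares -display -localclus 2>/dev/null | disabled_resources
```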


Learn More...

Adding, deleting, and modifying resource attributes

Check category: Availability


Check description: Checks whether any VCS service group with an enabled resource has been persistently frozen.


Check procedure:

  • Executes the command 'hagrp -list Frozen=1 -localclus' on the target VCS node to get a list of persistently frozen service groups.
  • Reports failure on these service groups because they cannot fail over.
  • Does not report on the service groups that have been temporarily frozen (TFrozen=1).


Check recommendation: Enable all VCS resources in the service group. As root:
1. Enable VCS resources in the service group:
# haconf -makerw
# hares -modify resource_name Enabled 1
2. Unfreeze the VCS service group:
# hagrp -unfreeze group_name -persistent
# haconf -dump -makero


Learn More...

Freezing and unfreezing service groups

Check category: Availability


Check description: Checks if the values of the VCS resource attributes for virtual host names or addresses exist locally in the /etc/hosts file of the system. Local entries allow name resolution to continue if network connectivity to the DNS server is lost.


Check procedure:

  • Checks if a particular VCS resource is of a type that has an attribute that accepts a virtual hostname or an IP address.
  • Reports failure if the attribute value cannot be found locally in the /etc/hosts file of that system.


Check recommendation: Add the value of specified VCS resource attributes to the system /etc/hosts file.
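
Whether a given host name or address already has a local entry can be tested with a short script. The sketch below is an illustration only: in_hosts_file is a hypothetical helper, the match is exact and case-sensitive, and text after a # is treated as a comment.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the value appears as a whole field
# (address, host name, or alias) on an uncommented line of a hosts file.
in_hosts_file() {
    local value=$1 hosts=${2:-/etc/hosts}
    awk -v want="$value" '
        { sub(/#.*/, "") }                                  # drop comments
        { for (i = 1; i <= NF; i++)
              if (($i "") == (want "")) found = 1 }         # force string comparison
        END { exit !found }
    ' "$hosts"
}
```

For example, in_hosts_file 10.0.0.5 succeeds only if that address appears in /etc/hosts. The address here is a placeholder, not a value taken from this report.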


Learn More...

Virtual IPs

Check category: Best practices


Check description: Verifies whether the VRTSspt package is present on the system.


Check procedure:

  • Checks whether the VRTSspt package is installed on the system.


Check recommendation: The VRTSspt package is not installed on the system. It is recommended to install the VRTSspt package, which provides time-saving troubleshooting tools. These tools do not run unless they are invoked by root.


Learn More...

How to download the VRTSspt package
How to use VRTSexplorer
How to collect a metasave from a mounted file system

Check category: Best practices


Check description: Checks whether the installed Storage Foundation / InfoScale products are at the latest software patch level.


Check procedure:

  • Identifies all the Storage Foundation / InfoScale products installed on the system.
  • Verifies whether the installed products have the latest software versions that are available for download.


Check recommendation: To avoid known risks or issues, it is recommended that you install the latest versions of the Storage Foundation / InfoScale products on the system.



Check category: Best practices


Check description: Checks whether the IfconfigTwice attribute for the VCS IP resource type is set to 1. Setting the attribute to 1 ensures that when the IP address is brought online or failed over, the system sends multiple Address Resolution Protocol (ARP) packets to the network clients. Sending multiple packets reduces the risk of connection problems after a failover event.


Check procedure:

  • Make sure that you set IfconfigTwice to 1. To set IfconfigTwice for an IP resource, log in as root and enter the following commands:
    # haconf -makerw
    # hares -modify res_name IfconfigTwice 1
    # haconf -dump -makero
  • Note: This attribute applies only to Solaris and HP-UX platforms.


Check recommendation: Make sure that the IfconfigTwice attribute is set to a value of 1 or larger.


Learn More...

IPMultiNICA attributes

Check category: Best practices


Check description: Checks whether the NetworkHosts attribute for the VCS NIC resource type has been configured. This attribute specifies the list of hosts that are pinged to determine if the network is active. If you do not specify this attribute, the agent must rely on the NIC broadcast address, which floods the network with traffic.


Check procedure:

  • Retrieves the value of the NetworkHosts attribute for the VCS NIC, MultiNICA, and MultiNICB resource types.
  • Verifies that a list of IP addresses which can be pinged quickly is set for the NetworkHosts attributes.


Check recommendation: Make sure you configure the NetworkHosts attribute with a list of IP addresses that can be pinged to determine if the network connection is active. To set the NetworkHosts attribute for the NIC resource, log in as root and enter the following command:
# hares -modify res_name NetworkHosts ip_address
where ip_address is a space-separated list of IP addresses.


Learn More...

NIC attributes

Check category: Performance


Check description: Checks whether the system is experiencing I/O waits (that is, blocked processes).


Check procedure:

  • Collects the vmstat command output and determines the total number of processes blocked for I/O.


Check recommendation: Ensure that this system does not have one or more kernel threads/processes that are blocked for I/O resources. Narrow down the root cause using iostat or similar tools.
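
The count described in the procedure can be reproduced from vmstat output on Linux, where the second column (b) reports processes blocked in uninterruptible sleep. The sketch below is not the collector's own logic, and blocked_from_vmstat is a hypothetical helper. It sums the b column over all sample rows and ignores header rows.

```shell
#!/bin/bash
# Hypothetical helper: read vmstat output on stdin and print the total of
# the "b" (blocked) column across all sample rows.
blocked_from_vmstat() {
    # Sample rows start with two numeric fields; header rows do not.
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { total += $2 }
         END { print total + 0 }'
}

# Usage: take five one-second samples and total the blocked processes.
#   vmstat 1 5 | blocked_from_vmstat
```

Note that the first sample row vmstat prints reports averages since boot, so a non-zero total is a prompt to look closer with iostat rather than proof of a current problem.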


