Error Code details
V-16-1-10600
Severity: Error 
Component: Cluster Server 
Message:
Cannot connect to VCS engine.
Description:

For UNIX, this error message is generated in response to an operation on objects such as clusters, systems, groups, and resources. The message may appear in the engine log when a Veritas Cluster Server (VCS) resource is configured for a Solaris local zone. It may also occur after a power outage, when one node fails to join the cluster after the hastart command is issued.

For Windows, the hacli command communicates with the high availability daemon (HAD) process for any operations on objects such as clusters, systems, service groups, and resources. If the hacli command cannot communicate with the HAD process, the error occurs.

 

Veritas solutions
Solution 1
Last Modified: 2014-07-17 09:43:04
Platform: Generic
Release: Generic
Content:

For UNIX

When a VCS resource is configured for a Solaris local zone

This instance of the error indicates that, when a resource is configured to run inside a Solaris local zone, the VCS commands cannot connect directly to the VCS engine (HAD). The most likely cause is running VCS commands (e.g. halog) when the halogin environment is not set up properly. For example, VCS entry point scripts commonly call halog (through the VCSAG_LOG_MSG or VCSAG_LOGDBG_MSG functions) to log a message to the engine log.

In order for the VCS commands to connect to the VCS engine, you should set up the halogin environment in the local zone. Therefore, Veritas recommends that you use the hazonesetup command to set up a proper halogin environment for the VCS commands to run successfully inside a Solaris local zone.

To set up VCS zones using the hazonesetup command, enter the following:

# hazonesetup -g servicegroup_name -r zoneres_name -z zone_name -p password -a -s system1,system2,...,systemN

Where:

    servicegroup_name  – the name of the service group
    zoneres_name – the name of the VCS Zone resource to be created
    zone_name – the name of the Solaris local zone
    password – a new password for the service group administrator

Note that after you run the hazonesetup command, you may find the .vcshost and .vcspwd files in the local zone. The /etc/VRTSvcs/.vcshost file contains the hostname of the global zone. The .vcspwd file contains the encrypted password for the VCS zone user. If the DeleteVCSZoneUser attribute of the Zone resource is set to 1, when VCS takes a zone resource offline, the .vcspwd file is removed.
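
As a quick sanity check after running hazonesetup, a minimal sketch along the following lines could confirm that both files exist and are not empty. The function name check_halogin_files is hypothetical. This article does not state where the .vcspwd file is located, so the sketch expects the caller to pass that path explicitly rather than guessing it.

```shell
#!/bin/sh
# Sketch: verify that the halogin files created by hazonesetup are present.
# check_halogin_files is a hypothetical helper, not part of VCS.
#   $1 - path to .vcshost (defaults to /etc/VRTSvcs/.vcshost, as named above)
#   $2 - path to .vcspwd  (ASSUMPTION: must be supplied by the caller)
check_halogin_files() {
    vcshost_file="${1:-/etc/VRTSvcs/.vcshost}"
    vcspwd_file="${2:-}"

    if [ -z "$vcspwd_file" ]; then
        echo "usage: check_halogin_files [vcshost_path] vcspwd_path" >&2
        return 2
    fi

    status=0
    for f in "$vcshost_file" "$vcspwd_file"; do
        # -s is true only when the file exists and has a size greater than zero
        if [ -s "$f" ]; then
            echo "found: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

The function returns 0 only when both files are present. A missing .vcspwd file is not necessarily a fault: as noted above, it is removed when the Zone resource goes offline and DeleteVCSZoneUser is set to 1.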

When one node fails to join a cluster after a power outage

This instance of the error often occurs together with the VCS WARNING V-16-1-11046 Local system not available error. It occurs when a power outage prevents VCS from shutting down properly.

If this error occurs, you must manually start the Global Atomic Broadcast (GAB) process and other processes using the following commands:

    # svcadm enable gab
    # svcadm enable llt
    # svcadm enable vcs
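
If these commands are run often, they could be wrapped in a small helper that stops at the first failure. The following is only a sketch: the function name enable_vcs_stack and the DRY_RUN switch are hypothetical, and the services are enabled in the same order as the commands listed above.

```shell
#!/bin/sh
# Sketch: enable the GAB, LLT, and VCS services in the order given above.
# enable_vcs_stack and DRY_RUN are hypothetical names used for illustration.
# With DRY_RUN=1 the commands are printed instead of executed.
enable_vcs_stack() {
    for svc in gab llt vcs; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "svcadm enable $svc"
        else
            # Stop immediately if a service cannot be enabled
            svcadm enable "$svc" || {
                echo "failed to enable $svc" >&2
                return 1
            }
        fi
    done
}
```

Running the function with DRY_RUN=1 first shows exactly which commands would be issued without changing any service state.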

 

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

For Windows

This message is displayed when:

  • The cluster cannot start, because GAB cannot seed the cluster. As a result, HAD cannot start on a node. Veritas Cluster High Availability Engine (Had.exe) hangs in the Starting state.

GAB requires a minimum number of nodes be online to successfully seed the cluster. The minimum number of nodes is defined in the C:\Program Files\VERITAS\comms\gab\gabtab.txt file. When all nodes go down and one or more nodes do not come online, it is possible that the minimum node requirement is not reached.

  • Veritas Product Authentication Service (vxat or vxeat, depending on the version) is having problems, even though HAD is in the Running state.

To determine the cause of the error:

  1. Check if the HAD error message is displayed in the Application Event Viewer in the Event Properties window.
  2. Check the Services Control Panel to see if the Veritas High Availability Engine Startup service is in the Starting state.
  3. Use one of the following commands to determine the state of the cluster:
    C:\>hastatus -sum
    or
    C:\>hasys -state

To fix the error in the case where HAD cannot start:

  1. Check the gabtab.txt file for information that defines the seeding requirement: the minimum number of nodes for this cluster that must be available before HAD can start. HAD will not start until the cluster is seeded, even on only a subset of nodes. HAD must be running on a node for cluster operations to commence on that node.
  2. If the Veritas High Availability Engine Startup service is in the Starting state, you must manually seed GAB. Enter the following commands:

C:\>gabconfig -c -x    (This command configures GAB, if needed, and seeds a node or nodes, regardless of the minimum specified in the gabtab.txt file.)
C:\>hasys -state       (This command displays the current status of all cluster nodes that are online.)

Note: Starting with version 6.0.2, the VCS Administrator’s Guide instructs users to use gabconfig -x to manually seed the cluster, while the VCS Administrator’s Guides for 6.0.1 and earlier versions specify using gabconfig -c -x instead.

The -x option forces a subset of the required minimum nodes to seed. In a two-node cluster that has only one node online, this seeds that one node and allows HAD to start on it. Once HAD has started on one node, bringing the other node online allows it to join the cluster, as long as the private network communication between the nodes is working.

If the configuration file is valid, the node should transition from LOCAL_BUILD to RUNNING. However, you must run the hasys -state command repeatedly to see whether the status changes, because the command shows only a snapshot of the current state.
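
Rather than rerunning the command by hand, the repeated check could be scripted as a polling loop. The sketch below is written for a POSIX-style shell to match the UNIX examples in this article; on Windows the same logic would need to be ported to a batch or PowerShell script. It assumes that hasys -state prints a header line beginning with '#' followed by rows of the form "system SysState value"; verify this layout against your own output before relying on it. The names system_is_running, wait_for_running, and HASYS_CMD are hypothetical.

```shell
#!/bin/sh
# Sketch: wait until one system reports RUNNING in `hasys -state` output.
# ASSUMPTION: rows look like "<system> SysState <value>" after a '#' header.

# Reads hasys -state output on stdin; succeeds if system $1 is RUNNING.
system_is_running() {
    awk -v sys="$1" '
        /^#/ { next }                                  # skip the header line
        $1 == sys && $3 == "RUNNING" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Polls up to $2 times (default 30), sleeping $3 seconds (default 10)
# between attempts. HASYS_CMD can override the command for testing.
wait_for_running() {
    system="$1"
    attempts="${2:-30}"
    interval="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ${HASYS_CMD:-hasys -state} 2>/dev/null | system_is_running "$system"; then
            echo "$system is RUNNING"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for $system to reach RUNNING" >&2
    return 1
}
```

For example, wait_for_running node1 12 5 would check a system named node1 (a placeholder name) every 5 seconds for up to one minute. A timeout suggests that the node is stuck in another state and that the configuration file or the seeding should be checked again.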

In the case where Veritas Product Authentication Service is having issues, you need to fix and reconfigure the Veritas Product Authentication Service to resolve the HAD connectivity issue.