Remote cluster states

In global clusters, the "health" of the remote clusters is monitored and maintained by the wide-area connector process. The connector process uses heartbeats, such as Icmp, to monitor the state of remote clusters. It communicates that state to HAD, which uses the information to take appropriate action when required. For example, when a cluster is shut down gracefully, the connector transitions its local cluster state to EXITING and notifies the remote clusters of the new state. When the cluster exits and the remote connectors lose their TCP/IP connection to it, each remote connector transitions its view of the cluster to EXITED.
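The graceful-shutdown sequence described above can be sketched as a small state model. This is an illustrative sketch only, not VCS code: the class RemoteView and its method names are hypothetical, and only the EXITING to EXITED path from the example is modeled.

```python
from enum import Enum


class ClusterState(Enum):
    RUNNING = "RUNNING"
    EXITING = "EXITING"
    EXITED = "EXITED"


class RemoteView:
    """One connector's view of a single remote cluster (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.state = ClusterState.RUNNING

    def on_state_notification(self, new_state):
        # The remote connector announced a new state, for example EXITING.
        self.state = new_state

    def on_connection_lost(self):
        # Only the graceful path is modeled: a lost TCP/IP connection
        # after an EXITING notification means the cluster has EXITED.
        # Other cases (such as LOST_CONN) are deliberately left out.
        if self.state is ClusterState.EXITING:
            self.state = ClusterState.EXITED


view = RemoteView("remote_cluster")
view.on_state_notification(ClusterState.EXITING)
view.on_connection_lost()
print(view.state.name)
```

A connection loss that is not preceded by an EXITING notification leaves the view unchanged in this sketch; in a real cluster the connector would move to one of the other states listed in the table.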

To enable wide-area network heartbeats, the wide-area connector process must be up and running. For wide-area connectors to connect to remote clusters, at least one heartbeat to the specified cluster must report the state as ALIVE.

There are three heartbeat states for remote clusters: HBUNKNOWN, HBALIVE, and HBDEAD.
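The connection precondition and the three heartbeat states can be expressed as a short check. This is a minimal sketch under stated assumptions: the function can_connect and the heartbeat names in the usage lines are hypothetical, and the sketch assumes that a heartbeat reporting "ALIVE" corresponds to the HBALIVE state.

```python
from enum import Enum


class HeartbeatState(Enum):
    HBUNKNOWN = "HBUNKNOWN"
    HBALIVE = "HBALIVE"
    HBDEAD = "HBDEAD"


def can_connect(heartbeats):
    """Return True if at least one heartbeat to the cluster is alive.

    `heartbeats` maps a heartbeat name to its current HeartbeatState
    for one remote cluster (hypothetical data shape).
    """
    return any(state is HeartbeatState.HBALIVE for state in heartbeats.values())


# One alive heartbeat is enough, even if another one is dead.
print(can_connect({"Icmp": HeartbeatState.HBALIVE, "other": HeartbeatState.HBDEAD}))
# No heartbeat reports alive, so the connector would not connect.
print(can_connect({"Icmp": HeartbeatState.HBUNKNOWN}))
```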

Table: VCS state definitions provides a list of VCS remote cluster states and their descriptions.

Table: VCS state definitions

State          Definition

INIT           The initial state of the cluster. This is the default state.
BUILD          The local cluster is receiving the initial snapshot from the remote cluster.
RUNNING        Indicates the remote cluster is running and connected to the local cluster.
LOST_HB        The connector process on the local cluster is not receiving heartbeats from the remote cluster.
LOST_CONN      The connector process on the local cluster has lost the TCP/IP connection to the remote cluster.
UNKNOWN        The connector process on the local cluster determines the remote cluster is down, but another remote cluster sends a response indicating otherwise.
FAULTED        The remote cluster is down.
EXITING        The remote cluster is exiting gracefully.
EXITED         The remote cluster exited gracefully.
INQUIRY        The connector process on the local cluster is querying other clusters on which heartbeats were lost.
TRANSITIONING  The connector process on the remote cluster is failing over to another node in the cluster.
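For scripts that handle these state names as text, it can help to validate them against the set listed in the table. The following is a hedged sketch, not part of VCS: the enum RemoteClusterState and the helper parse_state are hypothetical names, and the sketch only checks that a string is one of the eleven documented states.

```python
from enum import Enum


class RemoteClusterState(Enum):
    INIT = "INIT"
    BUILD = "BUILD"
    RUNNING = "RUNNING"
    LOST_HB = "LOST_HB"
    LOST_CONN = "LOST_CONN"
    UNKNOWN = "UNKNOWN"
    FAULTED = "FAULTED"
    EXITING = "EXITING"
    EXITED = "EXITED"
    INQUIRY = "INQUIRY"
    TRANSITIONING = "TRANSITIONING"


def parse_state(text):
    """Return the matching state; raise ValueError for an unlisted name."""
    return RemoteClusterState(text.strip().upper())


print(parse_state(" lost_hb ").name)
```

Passing a name that is not in the table raises ValueError, which keeps typos from being silently treated as a valid state.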

More Information

Examples of system state transitions