Cluster reconfiguration




Cluster reconfiguration occurs if a node leaves or joins a cluster. Each node's cluster monitor continuously watches the other cluster nodes. When the membership of the cluster changes, the cluster monitor informs VxVM so that it can take appropriate action.

During cluster reconfiguration, VxVM suspends I/O to shared disks. I/O resumes when the reconfiguration completes. Applications may appear to freeze for a short time during reconfiguration.

If other operations, such as VxVM operations or recoveries, are in progress, cluster reconfiguration can be delayed until those operations have completed. Volume reconfigurations do not take place at the same time as cluster reconfigurations. Depending on the circumstances, an operation may be held up and restarted later. In most cases, cluster reconfiguration takes precedence. However, if the volume reconfiguration is in the commit stage, it completes first.
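The precedence rule described above can be summarized as a small decision function. This is only an illustrative sketch in Python, not VxVM code: the function name `completes_first` and the stage labels are hypothetical, and the "in most cases" qualifier in the text is reduced to a simple default.

```python
def completes_first(volume_reconfig_stage):
    """Return which reconfiguration completes first.

    volume_reconfig_stage is None when no volume reconfiguration is in
    progress, or a hypothetical stage label such as "prepare" or "commit".
    """
    if volume_reconfig_stage == "commit":
        # A volume reconfiguration already in the commit stage completes first.
        return "volume"
    # Otherwise the cluster reconfiguration takes precedence (the usual case).
    return "cluster"
```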

See "Volume reconfiguration".

See "vxclust utility".

See "vxclustadm utility".

vxclust utility

vxclust is used when Sun Java System Cluster software acts as the cluster monitor.

Every time there is a cluster reconfiguration, every node currently in the cluster runs the vxclust utility at each of several well-orchestrated steps. The cluster monitor facilities ensure that the same step is executed on all nodes at the same time. A given step only starts when the previous one has completed on all nodes. At each step in the reconfiguration, the vxclust utility determines what the cluster functionality of VxVM should do next. After informing VxVM of its next action, the vxclust utility waits for the outcome (success, failure, or retry) and communicates that to the cluster monitor.
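The lockstep behavior described above can be modeled in a few lines of Python. This is a simplified, single-process sketch rather than the actual vxclust implementation: the names `run_reconfiguration` and `run_step`, the node names, and the step labels are all hypothetical, and the nodes are visited one after another here instead of running concurrently.

```python
def run_reconfiguration(nodes, steps, run_step):
    """Run each step on every node, in lockstep.

    A step starts only after the previous step has completed on all nodes.
    run_step(node, step) must return "success", "failure", or "retry".
    Returns a list of (step, {node: outcome}) pairs and stops after the
    first step in which any node did not report success.
    """
    history = []
    for step in steps:
        # Every node currently in the cluster executes the same step.
        outcomes = {node: run_step(node, step) for node in nodes}
        history.append((step, outcomes))
        if any(outcome != "success" for outcome in outcomes.values()):
            # The outcome goes back to the cluster monitor, which decides
            # what happens next; this sketch simply stops.
            break
    return history


# Hypothetical example: node1 fails during the second step.
def demo_step(node, step):
    return "failure" if (node, step) == ("node1", "step2") else "success"


history = run_reconfiguration(["node0", "node1"],
                              ["step1", "step2", "step3"],
                              demo_step)
```

In this example, `history` holds two entries: the third step never starts because the second step did not succeed on every node.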

If a node does not respond to a vxclust utility request within a specific timeout period, that node aborts. The vxclust utility then decides whether to restart the reconfiguration or give up, depending on the circumstances. If the cause of the reconfiguration is a local, uncorrectable error, vxclust gives up. If a node cannot complete an operation because another node has left, the surviving node times out. In this case, the vxclust utility requests a reconfiguration with the expectation that another node will leave. If no other node leaves, the vxclust utility causes the local node to leave.
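The decisions described in this paragraph can be sketched as two small Python functions. This is a hedged model of the prose only: the function names and return strings are hypothetical, and the "continue" outcome when another node does leave is an assumption, because the text covers only the case where no other node leaves.

```python
def on_timeout(local_uncorrectable_error):
    """Decide what to do when a reconfiguration step times out."""
    if local_uncorrectable_error:
        # A local, uncorrectable error: give up.
        return "give up"
    # Otherwise request a new reconfiguration, expecting another node to leave.
    return "request reconfiguration"


def after_requested_reconfiguration(other_node_left):
    """Decide what to do once the requested reconfiguration has run."""
    if other_node_left:
        # Assumption: the cluster carries on with the reduced membership.
        return "continue"
    # No other node left, so the local node is made to leave.
    return "local node leaves"
```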

If a reconfiguration step fails, the vxclust utility returns an error to the cluster monitor. The cluster monitor may decide to abort the node, causing its immediate departure from the cluster. Any I/O in progress to the shared disk fails and access to the shared disks is stopped.

vxclust decides what actions to take when it is informed of changes in the cluster. If a new master node is required (due to failure of the previous master), vxclust determines which node becomes the new master.

vxclustadm utility

The vxclustadm command provides an interface to the cluster functionality of VxVM when VCS is used as the cluster monitor. It is also called during cluster startup and shutdown. In the absence of a cluster monitor, vxclustadm can also be used to activate or deactivate the cluster functionality of VxVM on any node in a cluster.

The startnode keyword to vxclustadm starts cluster functionality on a cluster node by passing cluster configuration information to the VxVM kernel. In response to this command, the kernel and the VxVM configuration daemon, vxconfigd, perform initialization.

The stopnode keyword stops cluster functionality on a node. It waits for all outstanding I/O to complete and for all applications to close shared volumes.

The abortnode keyword terminates cluster activity on a node. It does not wait for outstanding I/O to complete nor for applications to close shared volumes.
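The difference between stopnode and abortnode can be expressed as a simple predicate. The following Python sketch only models the behavior described above and does not call vxclustadm: the function name `can_leave_now` and its parameters are hypothetical.

```python
def can_leave_now(keyword, outstanding_io, open_shared_volumes):
    """Return True if the node can leave the cluster immediately.

    outstanding_io and open_shared_volumes are counts of pending I/O
    requests and of shared volumes still held open by applications.
    """
    if keyword == "abortnode":
        # abortnode does not wait for I/O or for applications.
        return True
    if keyword == "stopnode":
        # stopnode waits until all I/O completes and all shared volumes close.
        return outstanding_io == 0 and open_shared_volumes == 0
    raise ValueError("unsupported keyword: " + keyword)
```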

The reinit keyword allows nodes to be added to or removed from a cluster without stopping the cluster. Before you run this command, update the cluster configuration file with information about the supported nodes in the cluster.

The nidmap keyword prints a table showing the mapping between node IDs in VxVM's cluster-support subsystem and node IDs in the cluster monitor. It also prints the state of the node in the cluster.

The nodestate keyword reports the state of a cluster node and the reason for the last abort of the node, as shown in this example:

# /etc/vx/bin/vxclustadm nodestate

state: out of cluster

reason: user initiated stop
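A script that needs to act on this output could split it into fields. The Python sketch below assumes that the output always consists of `key: value` lines exactly as in the example above; the real output may differ between releases, and the helper name `parse_nodestate` is hypothetical.

```python
def parse_nodestate(output):
    """Parse 'key: value' lines from nodestate output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and lines without a colon
            fields[key.strip()] = value.strip()
    return fields


# Sample text copied from the example above.
sample = "state: out of cluster\n\nreason: user initiated stop\n"
info = parse_nodestate(sample)
```

Here `info["state"]` is `"out of cluster"` and `info["reason"]` is `"user initiated stop"`. The reason string can then be looked up in the table of node abort messages.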

The "Node abort messages" table lists the reasons that may be given for a node abort.

Node abort messages

Reason	Description
`cannot find disk on slave node`	Missing disk or bad disk on the slave node.
`cannot obtain configuration data`	The node cannot read the configuration data due to an error such as disk failure.
`cluster device open failed`	Open of a cluster device failed.
`clustering license mismatch with master node`	Clustering license does not match that on the master node.
`clustering license not available`	Clustering license cannot be found.
`connection refused by master`	Join of a node refused by the master node.
`disk in use by another cluster`	A disk belongs to a cluster other than the one that a node is joining.
`join timed out during reconfiguration`	Join of a node has timed out due to reconfiguration taking place in the cluster.
`klog update failed`	Cannot update kernel log copies during the join of a node.
`master aborted during join`	Master node aborted while another node was joining the cluster.
`minor number conflict`	Minor number conflicts exist between private disk groups and shared disk groups that are being imported.
`protocol version out of range`	Cluster protocol version mismatch or unsupported version.
`recovery in progress`	Volumes that were opened by the node are still recovering.
`transition to role failed`	Changing the role of a node to be the master failed.
`user initiated abort`	Node is out of cluster due to an abort initiated by the cluster monitor.
`user initiated stop`	Node is out of cluster due to a stop initiated by the user or by the cluster monitor.
`vxconfigd is not enabled`	The VxVM configuration daemon is not enabled.

See the vxclustadm(1M) manual page.

