The components of membership arbitration are the fencing module and the coordination points.
Each system in the cluster runs a kernel module called vxfen, or the fencing module. This module is responsible for ensuring valid and current cluster membership on a membership change through the process of membership arbitration: vxfen registers the system with the coordination points at startup and races for control of the coordination points when the membership changes.
Coordination points provide a lock mechanism to determine which nodes get to fence off data drives from other nodes. A node must eject a peer from the coordination points before it can fence the peer from the data drives. Racing for control of the coordination points to fence data disks is the key to understanding how fencing prevents split brain.
The coordination points can be disks, servers, or both. Typically, a cluster must have three coordination points.
Disks that act as coordination points are called coordinator disks. Coordinator disks are three standard disks or LUNs set aside for I/O fencing during cluster reconfiguration. Coordinator disks do not serve any other storage purpose in the VCS configuration.
You can configure I/O fencing to use either the DMP devices or the underlying raw character devices. The Veritas Volume Manager Dynamic Multipathing (DMP) feature allows coordinator disks to take advantage of the path failover and dynamic addition and removal capabilities of DMP. Depending on the disk device type that you use, you must define the I/O fencing SCSI-3 disk policy as either raw or dmp. The disk policy is dmp by default.
See the Veritas Volume Manager Administrator's Guide for details on the DMP feature.
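For illustration, the disk policy is set in the /etc/vxfenmode file on each node. The following two-line excerpt is a sketch of one common configuration (SCSI-3 fencing with the DMP policy); treat the exact file contents as an assumption and confirm them against your product documentation:

    vxfen_mode=scsi3
    scsi3_disk_policy=dmp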
The coordination point server (CP server) is a software solution that runs on a remote system or cluster. The CP server provides arbitration functionality by allowing the VCS cluster nodes to register themselves, check which other nodes are registered, unregister themselves, and forcefully unregister (preempt) other nodes; a sketch of these operations appears below.
Note: With the CP server, the fencing arbitration logic still remains on the VCS cluster.
Multiple VCS clusters running different operating systems can simultaneously access the CP server. The CP server and the VCS clusters communicate over TCP/IP.
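As a purely hypothetical illustration of these arbitration tasks, the following Python sketch models a node-side client that registers, unregisters, and preempts over a TCP/IP connection. The class name, method names, and request strings are invented for this sketch; they are not the CP server's actual wire protocol or API.

    import socket

    class CPServerClient:
        """Toy client for CP server-style arbitration (illustrative only)."""

        def __init__(self, host: str, port: int):
            self.addr = (host, port)  # the CP server is reached over TCP/IP

        def _send(self, request: str) -> str:
            # One request/response exchange per connection (an assumption).
            with socket.create_connection(self.addr) as s:
                s.sendall(request.encode())
                return s.recv(4096).decode()

        def register(self, key: str) -> str:
            return self._send(f"REGISTER {key}")    # join the membership

        def list_members(self) -> str:
            return self._send("LIST")               # check who is registered

        def unregister(self, key: str) -> str:
            return self._send(f"UNREGISTER {key}")  # leave cleanly

        def preempt(self, my_key: str, victim_key: str) -> str:
            return self._send(f"PREEMPT {my_key} {victim_key}")  # eject a peer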
The fencing module starts up as follows:
The coordinator disks are placed in a disk group. This allows the fencing startup script to use Veritas Volume Manager (VxVM) commands to easily determine which disks are coordinator disks and what paths exist to those disks. This disk group is never imported and is not used for any other purpose.
The startup script then generates the /etc/vxfentab file, with one line for each path to each coordinator disk. For example, if the user has configured three coordinator disks with two paths to each disk, the /etc/vxfentab file contains six individual lines, one per path, such as:
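The following listing is illustrative only; actual device names depend on the operating system and the storage configuration (Solaris-style raw device paths are assumed here):

    /dev/rdsk/c1t1d0s2
    /dev/rdsk/c2t1d0s2
    /dev/rdsk/c1t1d1s2
    /dev/rdsk/c2t1d1s2
    /dev/rdsk/c1t1d2s2
    /dev/rdsk/c2t1d2s2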
The fencing driver examines GAB port B for membership information. If no other systems are up and running, it is the first system up, and its coordinator disk configuration is considered the correct one. When a new member joins, it requests the coordinator disk configuration. The system with the lowest LLT ID responds with the list of coordinator disk serial numbers. If the lists match, the new member joins the cluster. If they do not match, vxfen enters an error state and the new member is not allowed to join. This process ensures that all systems communicate with the same coordinator disks.
The fencing driver then checks for a possible preexisting split brain. It does this by verifying that any system that has keys on the coordinator disks can also be seen in the current GAB membership. If this verification fails, the fencing driver prints a warning to the console and system log and does not start.
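A minimal Python sketch of these two startup checks, assuming the coordinator disk serial numbers, the systems holding keys, and the GAB membership are already available as plain collections (function and variable names are invented for illustration):

    def config_matches(my_serials, cluster_serials):
        # A joining node may proceed only if it sees exactly the same
        # coordinator disks as the system with the lowest LLT ID reports.
        return sorted(my_serials) == sorted(cluster_serials)

    def no_preexisting_split_brain(systems_with_keys, gab_membership):
        # Every system holding keys on the coordinator disks must also be
        # visible in the current GAB membership; otherwise vxfen refuses
        # to start.
        return set(systems_with_keys) <= set(gab_membership)

    # Example: node 2 holds keys but is not in the current membership,
    # which indicates a possible preexisting split brain.
    print(config_matches(["S1", "S2", "S3"], ["S3", "S2", "S1"]))  # True
    print(no_preexisting_split_brain({0, 1, 2}, {0, 1}))           # False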
Topology of coordinator disks in the cluster
Upon startup of the cluster, all systems register a unique key on the coordinator disks. The key is unique to the cluster and the node, and is based on the LLT cluster ID and the LLT system ID.
See About the I/O fencing registration key format
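As a hypothetical illustration of how such a key could be composed from the LLT cluster ID and LLT system ID (the real format is defined in the section referenced above), consider:

    def make_registration_key(cluster_id: int, node_id: int) -> bytes:
        # SCSI-3 persistent reservation keys are 8 bytes. The "VF" prefix
        # and the hex layout below are assumptions for this sketch only.
        return f"VF{cluster_id:04X}{node_id:02X}".encode("ascii")

    # Two nodes in the same cluster get keys that share the cluster part
    # but differ in the node part.
    print(make_registration_key(0xDEED, 0))  # b'VFDEED00'
    print(make_registration_key(0xDEED, 1))  # b'VFDEED01'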
When there is a perceived change in membership, membership arbitration works as follows:
The preempt and abort command allows only a registered system with a valid key to eject the key of another system. This ensures that even when multiple systems attempt to eject each other, each race has only one winner. The first system to issue a preempt and abort command wins and ejects the key of the other system. When the second system issues a preempt and abort command, it cannot perform the key eject because it is no longer a registered system with a valid key.
Each system repeats this race for all of the coordinator disks. The race is won by, and control is attained by, the system that ejects the other system's registration keys from a majority of the coordinator disks.
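The following minimal Python sketch models the race, assuming a toy preempt-and-abort primitive with the SCSI-3 semantics described above (only a system whose key is still registered can eject another system's key); the class and function names are invented for illustration:

    class CoordinatorDisk:
        """Toy stand-in for a coordinator disk's key registrations."""

        def __init__(self, keys):
            self.keys = set(keys)              # currently registered keys

        def preempt_and_abort(self, my_key, victim_key):
            if my_key not in self.keys:        # loser: already ejected
                return False
            self.keys.discard(victim_key)      # winner ejects the victim
            return True

    def race(my_key, peer_key, disks):
        # Control is attained by ejecting the peer's registration from a
        # majority of the coordinator disks.
        ejected = sum(d.preempt_and_abort(my_key, peer_key) for d in disks)
        return ejected > len(disks) // 2

    disks = [CoordinatorDisk({"A", "B"}) for _ in range(3)]
    print(race("A", "B", disks))  # True: system A wins on all three disks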
Note: Forcing a manual seed at this point allows the cluster to seed. However, when the fencing module checks the GAB membership against the systems that have keys on the coordinator disks, a mismatch occurs. vxfen detects a possible split brain condition, prints a warning, and does not start. In turn, HAD does not start. Administrative intervention is required.