About the I/O fencing algorithm
To ensure appropriate behavior in both common events and rare corner cases, the fencing algorithm works as follows:
The fencing module is designed so that systems in at most one subcluster remain current, valid members of the cluster. In all cases, either one subcluster will survive or, in very rare cases, no systems will.
The system with the lowest LLT ID in any subcluster of the original cluster races for control of the coordinator disks on behalf of the other systems in that subcluster.
If a system wins the race for the first coordinator disk, that system is given priority to win the race for the other coordinator disks.
Any system that loses a race will delay a short period of time before racing for the next disk. Under normal circumstances, the winner of the race to the first coordinator disk will win all disks.
This ensures a clear winner when multiple systems race for the coordinator disks, preventing the case where three or more systems each win the race for one coordinator disk.
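The priority mechanism above can be illustrated with a toy model (a sketch only, not the actual vxfen implementation; node names, base times, and the delay value are hypothetical). Each racer has a base arrival time at a disk, and every loss adds a back-off delay before its next attempt, so the winner of the first coordinator disk tends to sweep all three:

```python
def race_for_disks(racers, num_disks=3, delay_penalty=1.0):
    """Toy model of the coordinator-disk race.

    racers: dict mapping node name -> base race time (lower is faster).
    Losing a disk adds delay_penalty to a node's handicap, so the
    winner of the first disk is favored to win the remaining disks.
    """
    wins = {r: 0 for r in racers}
    handicap = {r: 0.0 for r in racers}   # accumulated loser delay
    for _ in range(num_disks):
        # Effective arrival time at this disk = base time + handicap.
        times = {r: base + handicap[r] for r, base in racers.items()}
        winner = min(times, key=times.get)
        wins[winner] += 1
        for r in racers:
            if r != winner:
                handicap[r] += delay_penalty  # losers back off
    return wins
```

For example, `race_for_disks({"node0": 0.4, "node1": 0.5})` returns `{"node0": 3, "node1": 0}`: node0 edges out node1 on the first disk, and node1's back-off delay then keeps it from winning any later disk.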
If the cluster splits such that one of the subclusters has at least 51% of the members of the previous stable membership, that subcluster is given priority to win the race.
The systems in the smaller subcluster(s) delay a short period before beginning the race.
This ensures that as many systems as possible will remain running in the cluster.
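The majority rule above amounts to a simple check: a subcluster holding at least 51% of the previous stable membership races immediately, while smaller subclusters wait. A minimal sketch (the function name and delay units are hypothetical, not part of the vxfen interface):

```python
def race_delay(subcluster, previous_membership, base_delay=1.0):
    """Return the delay before a subcluster starts racing.

    A subcluster with at least 51% of the previous stable membership
    gets no delay; smaller subclusters wait base_delay, giving the
    larger side a head start so as many systems as possible survive.
    """
    # Integer comparison avoids floating-point rounding at the boundary.
    if len(subcluster) * 100 >= 51 * len(previous_membership):
        return 0.0
    return base_delay
```

With a previous membership of four nodes, a three-node subcluster races immediately (`race_delay({0, 1, 2}, {0, 1, 2, 3})` is `0.0`), while the lone remaining node waits.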
If the vxfen module discovers on startup that the system that has control of the coordinator disks is not in the current GAB membership, an error message indicating a possible split brain condition is printed to the console.
The administrator must clear this condition manually with the