When a node experiences a split brain condition and is ejected from the cluster, it panics and displays the following console message:
VXFEN:vxfen_plat_panic: Local cluster node ejected from cluster to prevent potential data corruption.
The vxfen driver prevents an ejected node from rejoining the cluster after the private network links fail and before they are repaired.
For example, suppose the cluster of system 1 and system 2 is functioning normally when the private network links are broken. Also suppose system 1 is the ejected system. When system 1 restarts before the private network links are restored, its membership configuration does not show system 2; however, when it attempts to register with the coordinator disks, it discovers system 2 is registered with them. Given this conflicting information about system 2, system 1 does not join the cluster and returns an error from vxfenconfig that resembles:
vxfenconfig: ERROR: There exists the potential for a preexisting split-brain. The coordinator disks list no nodes which are in the current membership. However, they also list nodes which are not in the current membership.
Note: During system boot, the HP-UX rc sequencer redirects the stderr of all rc scripts to the file /etc/rc.log, so the error messages are not printed on the console. They are logged in the /etc/rc.log file.
Also, the following information is displayed on the console:
<date> <system name> vxfen: WARNING: Potentially a preexisting
<date> <system name> split-brain.
<date> <system name> Dropping out of cluster.
<date> <system name> Refer to user documentation for steps
<date> <system name> required to clear preexisting split-brain.
<date> <system name> I/O Fencing DISABLED!
<date> <system name> gab: GAB:20032: Port b closed
Note: If syslogd is configured with the -D option, the informational messages are not printed on the console. They are logged in the system buffer, which can be read with the dmesg command.
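Because both notes above describe messages that may not reach the console, it can help to search the rc log directly. The following is a minimal sketch for a POSIX shell. The helper name find_vxfen_msgs is hypothetical and not part of the product. The only assumption taken from the text is that the rc log lives at /etc/rc.log.

```shell
# find_vxfen_msgs: hypothetical helper, not shipped with the product.
# Prints every line that mentions "vxfen" (case-insensitive) from the
# given log file; defaults to /etc/rc.log, where the HP-UX rc sequencer
# redirects the stderr of rc scripts.
find_vxfen_msgs() {
    rc_log="${1:-/etc/rc.log}"
    if [ -r "$rc_log" ]; then
        # grep exits non-zero when nothing matches; treat that as "no messages"
        grep -i 'vxfen' "$rc_log" || true
    fi
    return 0
}

# Typical use (commented out so the sketch has no side effects):
#   find_vxfen_msgs                 # search /etc/rc.log
#   dmesg | grep -i 'vxfen'         # search the system buffer
```

The dmesg line covers the case where syslogd is configured with the -D option and the informational messages go to the system buffer instead of the console.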
However, the same error can occur when the private network links are working and both systems go down, system 1 restarts, and system 2 fails to come back up. From system 1's view of the cluster, system 2 may still have its registrations on the coordinator disks.
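The condition reported in the vxfenconfig error can be modeled as a comparison between two sets of nodes. The sketch below is an illustration based only on the wording of the error message, not the vxfen driver's actual code. The function name preexisting_split_brain and the use of plain node IDs are assumptions made for the example.

```shell
# preexisting_split_brain: hypothetical model of the check, for illustration.
#   $1  space-separated node IDs in the current membership (local view)
#   $2  space-separated node IDs registered on the coordinator disks
# Succeeds (exit 0) when the coordinator disks list no node that is in the
# current membership, yet do list at least one node that is not in it.
preexisting_split_brain() {
    membership=" $1 "
    listed_in=0     # registered nodes that are current members
    listed_out=0    # registered nodes missing from the membership
    for node in $2; do
        case "$membership" in
            *" $node "*) listed_in=$((listed_in + 1)) ;;
            *)           listed_out=$((listed_out + 1)) ;;
        esac
    done
    [ "$listed_in" -eq 0 ] && [ "$listed_out" -gt 0 ]
}

# Example: system 1 restarts alone, but system 2 is still registered.
if preexisting_split_brain "1" "2"; then
    echo "potential preexisting split brain"
fi
```

In this model, both scenarios described above produce the same inputs (membership "1", registrations "2"), which is why system 1 cannot tell from the check alone whether system 2 is actually running.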
Determine whether system 1 is up. If it is up and running, shut it down and repair the private network links to remove the split brain condition. Then restart system 1.
# vxfenadm -g all -f /etc/vxfentab
The output of this command identifies the keys registered with the coordinator disks.
Clear the keys using the command /opt/VRTSvcs/rac/bin/vxfenclearpre.
See Clearing keys after split brain using vxfenclearpre command