System panics to prevent potential data corruption

When a node experiences a split brain condition and is ejected from the cluster, it panics and displays the following console message:

VXFEN:vxfen_plat_panic: Local cluster node ejected from cluster 
to prevent potential data corruption.
 
How vxfen driver checks for pre-existing split brain condition

The vxfen driver prevents an ejected node from rejoining the cluster after the private network links fail and before they are repaired.

For example, suppose the cluster of system 1 and system 2 is functioning normally when the private network links are broken. Also suppose system 1 is the ejected system. When system 1 restarts before the private network links are restored, its membership configuration does not show system 2; however, when it attempts to register with the coordinator disks, it discovers system 2 is registered with them. Given this conflicting information about system 2, system 1 does not join the cluster and returns an error from vxfenconfig that resembles:

vxfenconfig: ERROR: There exists the potential for a preexisting 
split-brain. The coordinator disks list no nodes which are in 
the current membership. However, they also list nodes which are 
not in the current membership.
 

I/O Fencing Disabled!

Also, the following information is displayed on the console:

<date> <system name> vxfen: WARNING: Potentially a preexisting
<date> <system name> split-brain.
<date> <system name> Dropping out of cluster.
<date> <system name> Refer to user documentation for steps
<date> <system name> required to clear preexisting split-brain.
<date> <system name>
<date> <system name> I/O Fencing DISABLED!
<date> <system name>
<date> <system name> gab: GAB:20032: Port b closed

However, the same error can occur when the private network links are working and both systems go down, system 1 restarts, and system 2 fails to come back up. From system 1's view of the cluster, system 2 may still have registrations on the coordinator disks.
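The condition stated in the vxfenconfig message can be modeled as a comparison of two sets: the nodes in the current membership and the nodes registered with the coordinator disks. The sketch below is illustrative only. The function name and the node names are hypothetical, and it follows the wording of the error message rather than the vxfen driver's actual implementation.

```python
def preexisting_split_brain(membership, registered):
    """Model of the condition described in the vxfenconfig error message.

    membership: nodes the starting system sees in the current membership
    registered: nodes whose keys are on the coordinator disks
    (Both arguments and this function are hypothetical, for illustration.)
    """
    membership = set(membership)
    registered = set(registered)
    # "The coordinator disks list no nodes which are in the current membership."
    lists_members = bool(registered & membership)
    # "However, they also list nodes which are not in the current membership."
    lists_outsiders = bool(registered - membership)
    return lists_outsiders and not lists_members


# System 1 restarts alone while system 2 is still registered.
# This covers both scenarios: links broken, or system 2 failed to come back up.
print(preexisting_split_brain({"system1"}, {"system2"}))

# No keys on the coordinator disks: nothing conflicts with the membership.
print(preexisting_split_brain({"system1"}, set()))

# System 1 sees system 2 in its membership and system 2 is registered.
print(preexisting_split_brain({"system1", "system2"}, {"system2"}))
```

In this model both scenarios look identical to system 1, which is why the cases that follow first establish whether system 2 is actually up.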

Case 1: system 2 up, system 1 ejected (actual potential split brain)

Determine whether system 1 is up. If it is up and running, shut it down and repair the private network links to remove the split brain condition. Restart system 1.

Case 2: system 2 down, system 1 ejected (apparent potential split brain)
  1. Physically verify that system 2 is down.
  2. Verify the systems currently registered with the coordinator disks. Use the following command:

    # vxfenadm -g all -f /etc/vxfentab

    The output of this command identifies the keys registered with the coordinator disks.

  3. Clear the keys on the coordinator disks as well as the data disks using the command /opt/VRTSvcs/rac/bin/vxfenclearpre.

    See Clearing keys after split brain using vxfenclearpre command

  4. Make any necessary repairs to system 2 and restart.
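The choice between the two cases depends on one observation: whether system 2 is up. As a rough sketch, the procedure can be written as a lookup that returns the ordered steps for each case. The function name and its argument are hypothetical, and the step text only paraphrases the cases above.

```python
def recovery_steps(system2_is_up):
    """Return the ordered recovery steps for the observed case.

    Hypothetical helper for illustration; it encodes the two cases
    described in this section and runs no commands itself.
    """
    if system2_is_up:
        # Case 1: actual potential split brain.
        return [
            "If system 1 is up and running, shut it down.",
            "Repair the private network links.",
            "Restart system 1.",
        ]
    # Case 2: apparent potential split brain.
    return [
        "Physically verify that system 2 is down.",
        "List the registered keys: vxfenadm -g all -f /etc/vxfentab",
        "Clear the keys: /opt/VRTSvcs/rac/bin/vxfenclearpre",
        "Make any necessary repairs to system 2 and restart.",
    ]


for number, step in enumerate(recovery_steps(system2_is_up=False), start=1):
    print(f"{number}. {step}")
```

The keys are cleared only in Case 2, after system 2 has been physically verified as down.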