System panic prevents potential data corruption

When a system experiences a split brain condition and is ejected from the cluster, it panics and displays the following console message:

VXFEN:vxfen_plat_panic: Local cluster node ejected from cluster to prevent potential data corruption.
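
To confirm after the fact that a node panicked for this reason, captured console output can be searched for the message above. The following is a minimal sketch, assuming the console text has already been collected into a string; where that text is stored depends on the platform and is not covered here. The names PANIC_MARKER and find_fencing_panics are hypothetical and are not part of any Symantec tool.

```python
# Marker taken from the console message shown above.
PANIC_MARKER = "VXFEN:vxfen_plat_panic"


def find_fencing_panics(console_text):
    """Return the console lines that contain the vxfen panic marker."""
    return [
        line.strip()
        for line in console_text.splitlines()
        if PANIC_MARKER in line
    ]


# Sample input built from the message in this section plus one unrelated line.
sample = (
    "unrelated boot message\n"
    "VXFEN:vxfen_plat_panic: Local cluster node ejected from cluster "
    "to prevent potential data corruption.\n"
)

for line in find_fencing_panics(sample):
    print(line)
```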

How the vxfen driver checks for a pre-existing split brain condition

The vxfen driver prevents an ejected node from rejoining the cluster after the private network links fail and before they are repaired.

For example, suppose the cluster of galaxy and nebula is functioning normally when the private network links are broken. Also suppose galaxy is the ejected system. When galaxy reboots before the private network links are restored, its membership configuration does not show nebula; however, when it attempts to register with the coordinator disks, it discovers nebula is registered with them. Given this conflicting information about nebula, galaxy does not join the cluster and returns an error from vxfenconfig that resembles:

vxfenconfig: ERROR: There exists the potential for a preexisting
split-brain. The coordinator disks list no nodes which are in the
current membership. However, they also list nodes which are not
in the current membership.

I/O Fencing Disabled!

Also, the following information is displayed on the console:

<date> <system name> vxfen: WARNING: Potentially a preexisting
<date> <system name> split-brain.
<date> <system name> Dropping out of cluster.
<date> <system name> Refer to user documentation for steps
<date> <system name> required to clear preexisting split-brain.
<date> <system name>
<date> <system name> I/O Fencing DISABLED!
<date> <system name>
<date> <system name> gab: GAB:20032: Port b closed

However, the same error can also occur when the private network links are working: both systems go down, galaxy reboots, and nebula fails to come back up. From galaxy's view of the cluster, nebula may still have registrations on the coordinator disks.
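
The condition worded in the vxfenconfig error can be read as a comparison of two sets: the nodes in the current membership and the nodes registered with the coordinator disks. The following is a minimal Python sketch of that comparison. It is not the vxfen driver's actual implementation; the function name and the use of plain node names in place of registration keys are assumptions made for illustration.

```python
def is_potential_preexisting_split_brain(current_membership,
                                         coordinator_registrations):
    """Return True when the condition worded in the vxfenconfig error holds.

    Illustrative only: the real check is performed by the vxfen driver.
    """
    members = set(current_membership)
    registered = set(coordinator_registrations)

    # "The coordinator disks list no nodes which are in the current membership."
    lists_no_member = members.isdisjoint(registered)
    # "However, they also list nodes which are not in the current membership."
    lists_non_member = bool(registered - members)

    return lists_no_member and lists_non_member


# galaxy reboots alone and finds nebula registered with the coordinator disks.
print(is_potential_preexisting_split_brain({"galaxy"}, {"nebula"}))

# No leftover registrations: nothing conflicts with galaxy's membership view.
print(is_potential_preexisting_split_brain({"galaxy"}, set()))
```

The inputs are identical whether nebula is still running or has failed to come back up, so a check of this kind cannot tell the two situations apart. That is why the two cases that follow begin by establishing the actual state of the systems.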

Case 1: nebula up, galaxy ejected (actual potential split brain)

 To respond to Case 1

  1. Determine whether galaxy is up.
  2. If it is up and running, shut it down and repair the private network links to remove the split brain condition.
  3. Restart galaxy.

Case 2: nebula down, galaxy ejected (apparent potential split brain)

 To respond to Case 2

  1. Physically verify that nebula is down.
  2. Verify the systems currently registered with the coordinator disks. Use the following command:

    # vxfenadm -g all -f /etc/vxfentab

    The output of this command identifies the keys registered with the coordinator disks.

  3. Clear the keys on the coordinator disks as well as the data disks using the command /opt/VRTSvcs/vxfen/bin/vxfenclearpre. See Clearing keys after split brain.
  4. Make any necessary repairs to nebula and reboot.