Overview of campus clustering with VCS

This overview focuses on recovery with a VCS campus cluster. A VCS campus cluster handles automated recovery differently than a VCS local cluster does.

The following table lists failure situations and the outcomes that occur with the two settings for the ForceImport attribute of the VMDg resource. This attribute can be set to 1 (automatically forcing the import of the disk groups to another node) or 0 (not forcing the import).

For information on how to set the ForceImport attribute:

See Setting the ForceImport attribute.
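
As a rough sketch of how the attribute might be changed from a script, the Python helper below only builds the command lines; it does not run them. It assumes the standard VCS commands haconf and hares are available on the cluster node, and the resource name CampusDG_VMDg is hypothetical. The procedure in Setting the ForceImport attribute remains the reference.

```python
from typing import List


def force_import_commands(resource: str, force: bool) -> List[List[str]]:
    """Build the VCS commands that would set ForceImport on a VMDg resource.

    Assumption: the configuration is opened read-write, the attribute is
    modified with hares, and the configuration is then saved and closed.
    """
    value = "1" if force else "0"
    return [
        ["haconf", "-makerw"],                                 # open config for writing
        ["hares", "-modify", resource, "ForceImport", value],  # change the attribute
        ["haconf", "-dump", "-makero"],                        # save and make read-only
    ]


if __name__ == "__main__":
    # "CampusDG_VMDg" is a made-up resource name used only for illustration.
    for cmd in force_import_commands("CampusDG_VMDg", force=False):
        print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch free of side effects, so the output can be reviewed before anything is executed on a live cluster.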

Table: Failure situations

Each entry gives the failure situation, followed by the outcome with ForceImport set to 0 (import not forced) and with ForceImport set to 1 (automatic force import).

1. Application fault
   May mean that the services stopped for an application, a NIC failed, or a database table went offline.
   ForceImport set to 0: Application automatically moves to the other site.
   ForceImport set to 1: Service Group failover is automatic on the standby or preferred system or node.

2. Server failure
   May mean that a power cord was unplugged, a system hang occurred, or another failure caused the system to stop responding.
   ForceImport set to 0: Application automatically moves to the other site. 100% of the disks are still available.
   ForceImport set to 1: Service Group failover is automatic on the standby or preferred system or node. 100% of the mirrored disks are still available.

3. Failure of disk array or all disks
   Remaining disks in the mirror are still accessible from the other site.
   ForceImport set to 0: No interruption of service. Remaining disks in the mirror are still accessible from the other site.
   ForceImport set to 1: The Service Group does not fail over. 50% of the mirrored disks are still available at the remaining site.

4. Site failure
   All access to the server and storage is lost.
   ForceImport set to 0: Manual intervention is required to move the application. The disk group cannot be imported with only 50% of the disks available.
   ForceImport set to 1: Application automatically moves to the other site.

5. Split-brain situation (loss of both heartbeats)
   If the public network link is used as a low-priority heartbeat, that link is assumed to be lost as well.
   ForceImport set to 0: No interruption of service. The disks cannot be imported because the original site still has the SCSI reservation.
   ForceImport set to 1: No interruption of service. Failover does not occur because the Service Group resources remain online on the original nodes. Example: the online node holds the SCSI reservation on its own disks.

6. Storage interconnect lost
   Fibre interconnect severed.
   ForceImport set to 0: No interruption of service. Disks on the same node are functioning. Mirroring is not working.
   ForceImport set to 1: No interruption of service. Service Group resources remain online, but 50% of the mirrored disks become detached.

7. Split-brain situation and storage interconnect lost
   This situation can occur if a single pipe between buildings carries both the Ethernet and the storage traffic.
   ForceImport set to 0: No interruption of service. The disk group cannot be imported with only 50% of the disks available. Disks on the same node are functioning. Mirroring is not working.
   ForceImport set to 1: Disks are automatically imported on the secondary site. Disks are now online in both locations, and data can be kept from only one.
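
The table can also be read as a lookup from (failure situation, ForceImport value) to an outcome. The Python sketch below encodes it that way to show where the two settings actually differ. The situation keys and the four outcome categories are simplifications made up for this illustration; they are not VCS terminology.

```python
from enum import Enum


class Outcome(Enum):
    # Coarse categories summarizing the table cells (an interpretation, not VCS terms).
    AUTO_FAILOVER = "application or service group fails over automatically"
    STAYS_ONLINE = "no failover; service stays online at the original site"
    MANUAL = "manual intervention required to move the application"
    DUAL_IMPORT = "disks imported at both sites; data can be kept from only one"


# situation -> (outcome with ForceImport = 0, outcome with ForceImport = 1)
TABLE = {
    "application_fault": (Outcome.AUTO_FAILOVER, Outcome.AUTO_FAILOVER),
    "server_failure": (Outcome.AUTO_FAILOVER, Outcome.AUTO_FAILOVER),
    "disk_array_failure": (Outcome.STAYS_ONLINE, Outcome.STAYS_ONLINE),
    "site_failure": (Outcome.MANUAL, Outcome.AUTO_FAILOVER),
    "split_brain": (Outcome.STAYS_ONLINE, Outcome.STAYS_ONLINE),
    "storage_interconnect_lost": (Outcome.STAYS_ONLINE, Outcome.STAYS_ONLINE),
    "split_brain_and_storage_interconnect_lost": (
        Outcome.STAYS_ONLINE,
        Outcome.DUAL_IMPORT,
    ),
}


def outcome(situation: str, force_import: int) -> Outcome:
    """Return the summarized outcome for a situation and ForceImport value."""
    if force_import not in (0, 1):
        raise ValueError("ForceImport must be 0 or 1")
    return TABLE[situation][force_import]


def situations_where_setting_matters() -> list:
    """List the situations whose outcome changes with the ForceImport value."""
    return [name for name, (off, on) in TABLE.items() if off is not on]


if __name__ == "__main__":
    for name in situations_where_setting_matters():
        print(name, "|", outcome(name, 0).value, "|", outcome(name, 1).value)
```

Under this summary, the setting changes the outcome in only two rows: site failure, where forcing the import removes the need for manual intervention, and split-brain combined with a lost storage interconnect, where forcing the import brings the disks online at both sites.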