Reinstating faulted hardware in a campus cluster

Once a failure occurs and an application is migrated to another node or site, it is important to know what will happen when the original hardware is reinstated.

Table: Behavior exhibited when hardware is reinstated lists the behavior when the hardware components that affect the configuration (array or disks, site hardware, networking cards or cabling, storage interconnect, and so on) are reinstated after a failure.

Table: Behavior exhibited when hardware is reinstated

| Failure | Situation, before Reinstating the Configuration | ForceImport set to 0 (import not forced) | ForceImport set to 1 (automatic force import) |
| --- | --- | --- | --- |
| 3) Failure of disk array or all disks | Remaining disks in mirror are still accessible from the other site. | No interruption of service. Resync the mirror from the remote site. | Same behavior. |
| 4) Site failure | All access to the server and storage is lost. | Inter-node heartbeat communication is restored and the original cluster node becomes aware that the application is online at the remote site. Resync the mirror from the remote site. | Same behavior. |
| 5) Split-brain situation (loss of both heartbeats) | | No interruption of service. | Same behavior. |
| 6) Storage interconnect lost | Fibre interconnect severed. | No interruption of service. Resync the mirror from the original site. | Same behavior. |
| 7) Split-brain situation and storage interconnect lost | | No interruption of service. Resync the mirror from the original site. | VCS alerts administrator that volumes are online at both sites. Resync the mirror from the copy with the latest data. |

The numbers 3 through 7 in Table: Behavior exhibited when hardware is reinstated refer to the scenarios in Campus cluster failover using the ForceImport attribute.

Situations 1 and 2 have no effect when reinstated. Keep in mind that the cluster has already responded to the initial failure.

Although both settings of the ForceImport attribute produce the same outcome in most scenarios, setting ForceImport to 1 provides automatic failover in the event of site failure. This advantage comes at the cost of potential data loss if all storage and network communication paths between the sites are severed. Choose the setting that suits your cluster infrastructure, uptime requirements, and administrative capabilities.
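For readers who want to check a planned configuration against the table, the following is a minimal sketch that encodes the table rows as a Python lookup. It is only an illustration: the names `BEHAVIOR`, `behavior_on_reinstatement`, and `scenarios_that_differ` are hypothetical and are not part of VCS or any of its tools. The behavior strings are copied from the table, and the entry for situations 1 and 2 reflects the note that they have no effect when reinstated.

```python
# Hypothetical lookup built from the table above; none of these names are VCS APIs.
# Each entry: scenario number -> (failure, behavior with ForceImport=0,
#                                 behavior with ForceImport=1, or None if it is the same).
BEHAVIOR = {
    3: ("Failure of disk array or all disks",
        "No interruption of service. Resync the mirror from the remote site.",
        None),
    4: ("Site failure",
        "Inter-node heartbeat communication is restored and the original "
        "cluster node becomes aware that the application is online at the "
        "remote site. Resync the mirror from the remote site.",
        None),
    5: ("Split-brain situation (loss of both heartbeats)",
        "No interruption of service.",
        None),
    6: ("Storage interconnect lost",
        "No interruption of service. Resync the mirror from the original site.",
        None),
    7: ("Split-brain situation and storage interconnect lost",
        "No interruption of service. Resync the mirror from the original site.",
        "VCS alerts administrator that volumes are online at both sites. "
        "Resync the mirror from the copy with the latest data."),
}


def behavior_on_reinstatement(scenario, force_import):
    """Return the table's behavior text for a scenario and ForceImport setting."""
    if force_import not in (0, 1):
        raise ValueError("ForceImport must be 0 or 1")
    if scenario in (1, 2):
        # Situations 1 and 2 are not in the table; reinstating has no effect.
        return "No effect when reinstated."
    _failure, with_0, with_1 = BEHAVIOR[scenario]
    # "Same behavior." in the table is stored as None, so fall back to the 0 column.
    if force_import == 1 and with_1 is not None:
        return with_1
    return with_0


def scenarios_that_differ():
    """List the scenarios whose reinstatement behavior depends on ForceImport."""
    return [n for n, (_f, _b0, with_1) in BEHAVIOR.items() if with_1 is not None]


if __name__ == "__main__":
    for number in scenarios_that_differ():
        print(number, BEHAVIOR[number][0])
        print("  ForceImport=0:", behavior_on_reinstatement(number, 0))
        print("  ForceImport=1:", behavior_on_reinstatement(number, 1))
```

In this sketch, only scenario 7 is reported as differing between the two settings, which matches the table: it is the one case where reinstatement requires the administrator to pick the copy with the latest data.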