How VCS attributes control behavior on loss of storage connectivity

If you use I/O fencing, you can configure VCS attributes to ensure that a node panics on losing connectivity to shared storage. The panic causes service groups to fail over to another node.

A system reboot or shutdown could leave the system in a hung state because the operating system cannot dump the buffer cache to the disk. The panic operation ensures that VCS does not wait for I/Os to complete before triggering the failover mechanism, thereby ensuring application availability. However, you might have to perform a file system check when you restart the node.

The following attributes define VCS behavior on loss of storage connectivity:

PanicSystemOnDGLoss

Applies to the DiskGroup resource and defines whether the agent panics the system when storage connectivity is lost.

FaultOnMonitorTimeouts

The number of consecutive monitor timeouts after which VCS calls the Clean function and then either marks the resource as FAULTED or restarts the resource.

If you set the attribute to 0, VCS does not treat a Monitor timeout as a resource fault. By default, the attribute is set to 4. This means that the Monitor function must time out four times in a row before VCS marks the resource as faulted.
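
The counting rule can be illustrated with a small model. This is only a sketch of the behavior described here, not VCS code: the function name first_fault_index and its parameters are hypothetical, and it assumes that a Monitor call that completes in time resets the count, which is how "in a row" is read in this example.

```python
from typing import Iterable, Optional


def first_fault_index(
    monitor_timed_out: Iterable[bool],
    fault_on_monitor_timeouts: int = 4,
) -> Optional[int]:
    """Model the FaultOnMonitorTimeouts counting rule (illustrative only).

    monitor_timed_out holds one flag per monitor cycle: True if that
    cycle timed out. Returns the 0-based index of the cycle at which the
    resource would be treated as faulted, or None if that never happens.
    """
    # A value of 0 means monitor timeouts are never treated as a fault.
    if fault_on_monitor_timeouts == 0:
        return None

    consecutive = 0
    for index, timed_out in enumerate(monitor_timed_out):
        if timed_out:
            consecutive += 1
            if consecutive >= fault_on_monitor_timeouts:
                return index
        else:
            # Assumption: a monitor cycle that completes resets the count.
            consecutive = 0
    return None
```

With the default value of 4, three timeouts followed by a successful monitor cycle do not count toward a fault; four more consecutive timeouts are needed afterward.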

When the Monitor function for the DiskGroup agent times out (case 3 in the table above), the FaultOnMonitorTimeouts attribute defines when VCS interprets the resource as faulted and invokes the Clean function. If the CleanReason is "monitor hung", the system panics.