Troubleshooting service groups

This section cites the most common problems associated with bringing service groups online and taking them offline. Bold text provides a description of the problem. Recommended action is also included, where applicable.


VCS does not automatically start service group.

VCS does not automatically start a failover service group if the VCS engine (HAD) in the cluster was restarted by the hashadow process.

This behavior prevents service groups from coming online automatically due to events such as GAB killing HAD because of high load, or HAD committing suicide to rectify unexpected error conditions.


System is not in RUNNING state.

Recommended Action: Type hasys -display system to verify the system is running.

See System states.


Service group not configured to run on the system.

The SystemList attribute of the group may not contain the name of the system.

Recommended Action: Use the output of the command hagrp -display service_group to verify the system name.


Service group not configured to autostart.

If the service group is not starting automatically on the system, the group may not be configured to AutoStart, or may not be configured to AutoStart on that particular system.

Recommended Action: Use the output of the command hagrp -display service_group to verify the values of the AutoStart and AutoStartList attributes.
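When hagrp -display prints many attributes, a small filter can make the relevant values easier to spot. The following is a minimal sketch, not part of VCS: attr_values is a hypothetical helper name, and the sketch assumes the output uses four whitespace-separated columns (group, attribute, system, value) with header lines that start with #. Verify the layout against the output on your own cluster before relying on it.

```shell
# attr_values ATTRIBUTE
# Reads `hagrp -display` style text on stdin and prints "system: value"
# for every row whose second column matches ATTRIBUTE.
# Assumption: columns are group, attribute, system, value; lines that
# begin with '#' are headers.
attr_values() {
    awk -v attr="$1" '
        /^#/ || NF < 3 { next }          # skip headers and short lines
        $2 == attr {
            val = ""
            # Rejoin the value, which may contain several words
            for (i = 4; i <= NF; i++) val = val (i > 4 ? " " : "") $i
            print $3 ": " val
        }'
}
```

For example, hagrp -display service_group | attr_values AutoStartList prints only the AutoStartList rows. The same filter can be used for the Frozen, TFrozen, State, AutoDisabled, and ProbesPending attributes mentioned elsewhere in this section.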


Service group is frozen.

Recommended Action: Use the output of the command hagrp -display service_group to verify the value of the Frozen and TFrozen attributes. Use the command hagrp -unfreeze to unfreeze the group. Note that VCS will not take a frozen service group offline.


Failover service group is online on another system.

The group is a failover group and is online or partially online on another system.

Recommended Action: Use the output of the command hagrp -display service_group to verify the value of the State attribute. Use the command hagrp -offline to take the group offline on the other system.


A critical resource faulted.

Output of the command hagrp -display service_group indicates that the service group has faulted.

Recommended Action: Use the command hares -clear to clear the fault.


Service group autodisabled.

When VCS does not know the status of a service group on a particular system, it autodisables the service group on that system. Autodisabling occurs under the following conditions:

Under these conditions, all service groups that include the system in their SystemList attribute are autodisabled. This does not apply to systems that are powered off.

Recommended Action: Use the output of the command hagrp -display service_group to verify the value of the AutoDisabled attribute.

Caution: To bring a group online manually after VCS has autodisabled the group, make sure that the group is not fully or partially active on any system that has the AutoDisabled attribute set to 1 by VCS. Specifically, verify that all resources that may be corrupted by being active on multiple systems are brought down on the designated systems. Then, clear the AutoDisabled attribute for each system:

# hagrp -autoenable service_group -sys system
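If the attribute must be cleared on several systems, the command above can be repeated in a loop. The following is a minimal sketch under that assumption; autoenable_group is a hypothetical helper name, not a VCS command. It performs no safety checks of its own, so run it only after you have verified manually, as described in the caution, that the group is not active on any of the listed systems.

```shell
# autoenable_group GROUP SYSTEM [SYSTEM ...]
# Runs `hagrp -autoenable GROUP -sys SYSTEM` once per listed system
# and stops at the first failure.
# This helper does NOT verify that the group is offline anywhere;
# that check remains a manual step.
autoenable_group() {
    group=$1
    shift
    for sys in "$@"; do
        hagrp -autoenable "$group" -sys "$sys" || return 1
    done
}
```

For example, autoenable_group service_group sysA sysB clears the attribute on the two placeholder systems sysA and sysB.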


Service group is waiting for the resource to be brought online/taken offline.

Recommended Action: Review the IState attribute of all resources in the service group to locate which resource is waiting to go online (or which is waiting to be taken offline). Use the hastatus command to help identify the resource. See the engine and agent logs in /var/VRTSvcs/log for information on why the resource cannot be brought online or taken offline.

To clear this state, make sure all resources waiting to go online/offline do not bring themselves online/offline. Use the command hagrp -flush to clear the internal state of VCS. You can now bring the service group online or take it offline on another system.


Service group is waiting for a dependency to be met.

Recommended Action: To see which dependencies have not been met, type hagrp -dep service_group to view service group dependencies, or hares -dep resource to view resource dependencies.


Service group not fully probed.

This occurs if the agent processes have not monitored each resource in the service group. When the VCS engine, HAD, starts, it immediately "probes" to find the initial state of all resources. (It cannot probe if the agent is not returning a value.) A service group must be probed on all systems included in the SystemList attribute before VCS attempts to bring the group online as part of AutoStart. This ensures that even if the service group was online prior to VCS being brought up, VCS will not inadvertently bring the service group online on another system.

Recommended Action: Use the output of hagrp -display service_group to see the value of the ProbesPending attribute for the system's service group. (It should be zero.) To determine which resources are not probed, verify the local Probed attribute for each resource on the specified system. Zero means waiting for probe result, 1 means probed, and 2 means VCS not booted. See the engine and agent logs for information.
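When checking many resources, it can help to translate the numeric Probed value into the meaning given above. The following is a minimal sketch; probed_meaning is a hypothetical helper name, and how you obtain the Probed value for each resource is left to you.

```shell
# probed_meaning VALUE
# Prints the meaning of a resource's local Probed attribute value,
# using the three values described in this section.
probed_meaning() {
    case "$1" in
        0) echo "waiting for probe result" ;;
        1) echo "probed" ;;
        2) echo "VCS not booted" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}
```

Any value other than 0, 1, or 2 is reported as unknown and the function returns a nonzero status, so a calling script can detect unexpected input.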