Symantec logo

Unstartable RAID-5 volumes

When a RAID-5 volume is started, it can be in one of many states. After a normal system shutdown, the volume should be clean and require no recovery. However, if the volume was not closed, or was not unmounted before a crash, it can require recovery when it is started, before it can be made available.

Under normal conditions, volumes are started automatically after a reboot and any recovery takes place automatically or is done through the vxrecover command.

A RAID-5 volume is unusable if some part of the RAID-5 plex does not map the volume length in the following circumstances:

When this occurs, the vxvol start command returns the following error message:

VxVM vxvol ERROR V-5-1-1236 Volume r5vol is not startable; RAID-5 plex does not map entire volume length.

At this point, the contents of the RAID-5 volume are unusable.

Another possible way that a RAID-5 volume can become unstartable is if the parity is stale and a subdisk becomes detached or stale. This occurs because within the stripes that contain the failed subdisk, the parity stripe unit is invalid (because the parity is stale) and the stripe unit on the bad subdisk is also invalid.

Invalid RAID-5 volume illustrates a RAID-5 volume that has become invalid due to stale parity and a failed subdisk.

Invalid RAID-5 volume

Invalid RAID-5 volume

Click the thumbnail above to view full-sized image.

There are four stripes in the RAID-5 array. All parity is stale and subdisk disk05-00 has failed. This makes stripes X and Y unusable because two failures have occurred within those stripes.

This qualifies as two failures within a stripe and prevents the use of the volume. In this case, the output display from the vxvol start command is as follows:

VxVM vxvol ERROR V-5-1-1237 Volume r5vol is not startable; some subdisks are unusable and the parity is stale.

This situation can be avoided by always using two or more RAID-5 log plexes in RAID-5 volumes. RAID-5 log plexes prevent the parity within the volume from becoming stale which prevents this situation.

See "System failures" on page 16.

Forcibly starting a RAID-5 volume with stale subdisks

You can start a volume even if subdisks are marked as stale: for example, if a stopped volume has stale parity and no RAID-5 logs, and a disk becomes detached and then reattached.

The subdisk is considered stale even though the data is not out of date (because the volume was in use when the subdisk was unavailable) and the RAID-5 volume is considered invalid. To prevent this case, always have multiple valid RAID-5 logs associated with the array whenever possible.

 To forcibly start a RAID-5 volume with stale subdisks

  1. Specify the -f option to the vxvol start command.

    # vxvol [-g diskgroup] -f start r5vol

    This causes all stale subdisks to be marked as non-stale. Marking takes place before the start operation evaluates the validity of the RAID-5 volume and what is needed to start it. You can mark individual subdisks as non-stale by using the following command:

# vxmend [-g diskgroup] fix unstale subdisk

If some subdisks are stale and need recovery, and if valid logs exist, the volume is enabled by placing it in the ENABLED kernel state and the volume is available for use during the subdisk recovery. Otherwise, the volume kernel state is set to DETACHED and it is not available during subdisk recovery. This is done because if the system were to crash or if the volume were ungracefully stopped while it was active, the parity becomes stale, making the volume unusable. If this is undesirable, the volume can be started with the -o unsafe start option.

Warning: The -o unsafe start option is considered dangerous, as it can make the contents of the volume unusable. It is therefore not recommended.

The volume state is set to RECOVER, and stale subdisks are restored. As the data on each subdisk becomes valid, the subdisk is marked as no longer stale. If the recovery of any subdisk fails, and if there are no valid logs, the volume start is aborted because the subdisk remains stale and a system crash makes the RAID-5 volume unusable. This can also be overridden by using the -o unsafe start option.

If the volume has valid logs, subdisk recovery failures are noted but they do not stop the start procedure.

When all subdisks have been recovered, the volume is placed in the ENABLED kernel state and marked as ACTIVE.