Disk failures

An uncorrectable I/O error occurs when a disk failure, a cabling fault, or another problem makes the data on a disk unavailable. For a RAID-5 volume, this means that a subdisk becomes unavailable. The subdisk cannot be used to hold data and is considered stale and detached. Even if the underlying disk later becomes available or is replaced, the subdisk is still considered stale and is not used.

If an attempt is made to read data contained on a stale subdisk, the data is reconstructed from the data on all the other stripe units in the stripe. This operation is called a reconstructing-read. It is more expensive than simply reading the data, because every other stripe unit in the stripe must be read, and it can result in degraded read performance. When a RAID-5 volume has stale subdisks, it is considered to be in degraded mode.
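The idea behind a reconstructing-read can be sketched in a few lines of Python. This is only an illustrative model, assuming conventional RAID-5 XOR parity; it is not the volume manager's own code, and the names xor_blocks, reconstructing_read, and the sample byte strings are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstructing_read(stripe, stale_index):
    """Rebuild the stripe unit at stale_index from every other unit in the stripe."""
    survivors = [unit for i, unit in enumerate(stripe) if i != stale_index]
    return xor_blocks(survivors)


# Hypothetical 3-column stripe: two data units plus the parity computed from them.
data0 = b"\x01\x02\x03\x04"
data1 = b"\x10\x20\x30\x40"
parity = xor_blocks([data0, data1])
stripe = [data0, data1, parity]

# Pretend column 1 is stale and rebuild its contents from columns 0 and 2.
rebuilt = reconstructing_read(stripe, stale_index=1)
```

In this model, rebuilding one unit requires reading all the surviving units in the stripe, which is why a reconstructing-read costs more than an ordinary read.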

A RAID-5 volume in degraded mode can be recognized from the output of the vxprint -ht command as shown in the following display:

V   NAME       RVG/VSET/CO  KSTATE   STATE    LENGTH   READPOL    PREFPLEX   UTYPE
PL  NAME       VOLUME       KSTATE   STATE    LENGTH   LAYOUT     NCOL/WID   MODE
SD  NAME       PLEX         DISK     DISKOFFS LENGTH   [COL/]OFF  DEVICE     MODE
SV  NAME       PLEX         VOLNAME  NVOLLAYR LENGTH   [COL/]OFF  AM/NM      MODE
...
v   r5vol      -            ENABLED  DEGRADED 204800   RAID       -          raid5
pl  r5vol-01   r5vol        ENABLED  ACTIVE   204800   RAID       3/16       RW
sd  disk01-01  r5vol-01     disk01   0        102400   0/0        sda        ENA
sd  disk02-01  r5vol-01     disk02   0        102400   1/0        sdb        dS
sd  disk03-01  r5vol-01     disk03   0        102400   2/0        sdc        ENA
pl  r5vol-02   r5vol        ENABLED  LOG      1440     CONCAT     -          RW
sd  disk04-01  r5vol-02     disk04   0        1440     0          sdd        ENA
pl  r5vol-03   r5vol        ENABLED  LOG      1440     CONCAT     -          RW
sd  disk05-01  r5vol-03     disk05   0        1440     0          sde        ENA

The volume r5vol is in degraded mode, as shown by the volume state, which is listed as DEGRADED. The failed subdisk is disk02-01, as shown by the MODE flags; d indicates that the subdisk is detached, and S indicates that the subdisk's contents are stale.
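The same check can be scripted. The following Python sketch scans captured vxprint -ht text for volumes in the DEGRADED state and for subdisks whose MODE carries both the d and S flags. It assumes whitespace-separated columns laid out as in the display above; the function names and the embedded SAMPLE text are hypothetical, and real output may contain additional record types, which this sketch simply ignores.

```python
# Sample rows taken from the display above (assumed whitespace-separated).
SAMPLE = """\
v   r5vol      -         ENABLED  DEGRADED 204800   RAID       -          raid5
pl  r5vol-01   r5vol     ENABLED  ACTIVE   204800   RAID       3/16       RW
sd  disk01-01  r5vol-01  disk01   0        102400   0/0        sda        ENA
sd  disk02-01  r5vol-01  disk02   0        102400   1/0        sdb        dS
sd  disk03-01  r5vol-01  disk03   0        102400   2/0        sdc        ENA
"""


def records(text, rectype):
    """Yield the whitespace-split fields of rows whose first field is rectype."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == rectype:
            yield fields


def degraded_volumes(text):
    """Names of volumes whose STATE column (fifth field) reads DEGRADED."""
    return [f[1] for f in records(text, "v") if len(f) > 4 and f[4] == "DEGRADED"]


def failed_subdisks(text):
    """Names of subdisks whose MODE (last field) has both d (detached) and S (stale)."""
    return [f[1] for f in records(text, "sd") if "d" in f[-1] and "S" in f[-1]]
```

Calling degraded_volumes(SAMPLE) and failed_subdisks(SAMPLE) on the rows above picks out r5vol and disk02-01 respectively. The header rows are skipped because their record types are uppercase.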

Warning:

Do not run the vxr5check command on a RAID-5 volume that is in degraded mode.

A disk containing a RAID-5 log plex can also fail. The failure of a single RAID-5 log plex has no direct effect on the operation of a volume provided that the RAID-5 log is mirrored. However, the loss of all RAID-5 log plexes in a volume makes the volume vulnerable to a complete failure. In the output of the vxprint -ht command, failure within a RAID-5 log plex is indicated by the plex state being shown as BADLOG rather than LOG.

In the following example, the RAID-5 log plex r5vol-02 has failed:

V   NAME       RVG/VSET/CO  KSTATE     STATE     LENGTH     READPOL      PREFPLEX    UTYPE
PL  NAME       VOLUME       KSTATE     STATE     LENGTH     LAYOUT       NCOL/WID    MODE
SD  NAME       PLEX         DISK       DISKOFFS  LENGTH     [COL/]OFF    DEVICE      MODE
SV  NAME       PLEX         VOLNAME    NVOLLAYR  LENGTH     [COL/]OFF    AM/NM       MODE
...
v   r5vol      -            ENABLED    ACTIVE    204800     RAID         -           raid5
pl  r5vol-01   r5vol        ENABLED    ACTIVE    204800     RAID         3/16        RW
sd  disk01-01  r5vol-01     disk01     0         102400     0/0          sda         ENA
sd  disk02-01  r5vol-01     disk02     0         102400     1/0          sdb         ENA
sd  disk03-01  r5vol-01     disk03     0         102400     2/0          sdc         ENA
pl  r5vol-02   r5vol        DISABLED   BADLOG    1440       CONCAT       -           RW
sd  disk04-01  r5vol-02     disk04     0         1440       0            sdd         ENA
pl  r5vol-03   r5vol        ENABLED    LOG       1440       CONCAT       -           RW
sd  disk05-01  r5vol-03     disk05     0         1440       0            sde         ENA
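
A script can also watch the log plexes. The following Python sketch groups the log plexes in captured vxprint -ht text by volume and reports any volume whose log plexes are all BADLOG. It assumes whitespace-separated columns with the plex STATE in the fifth field, as in the display above; the function names and SAMPLE text are hypothetical.

```python
from collections import defaultdict

# Plex rows taken from the example above (assumed whitespace-separated).
SAMPLE = """\
v   r5vol      -         ENABLED   ACTIVE  204800  RAID    -     raid5
pl  r5vol-01   r5vol     ENABLED   ACTIVE  204800  RAID    3/16  RW
pl  r5vol-02   r5vol     DISABLED  BADLOG  1440    CONCAT  -     RW
pl  r5vol-03   r5vol     ENABLED   LOG     1440    CONCAT  -     RW
"""


def log_plex_health(text):
    """Map each volume name to {'LOG': [...], 'BADLOG': [...]} lists of plex names."""
    health = defaultdict(lambda: {"LOG": [], "BADLOG": []})
    for line in text.splitlines():
        fields = line.split()
        # Plex rows: pl NAME VOLUME KSTATE STATE ...
        if len(fields) > 4 and fields[0] == "pl" and fields[4] in ("LOG", "BADLOG"):
            health[fields[2]][fields[4]].append(fields[1])
    return dict(health)


def volumes_without_healthy_log(text):
    """Volumes that have log plexes, none of which is in the LOG state."""
    return [vol for vol, plexes in log_plex_health(text).items() if not plexes["LOG"]]
```

For the rows above, r5vol still has one healthy log plex (r5vol-03), so volumes_without_healthy_log(SAMPLE) returns an empty list. A volume with no log plexes at all does not appear in the result, so that case would need a separate check.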