Recovering data after a disaster using VFR

When a site disaster completely destroys the source file system, a newly created file system can be populated from the last known good Storage Checkpoint on the target system.

To recover from a Storage Checkpoint

  1. Promote the last known good Storage Checkpoint on the target system. Promotion deletes all changes made after that Storage Checkpoint. If you need the intermediate, possibly inconsistent data, skip this step.
    • Unmount the file system:

      # umount target_mntpt

      where target_mntpt is the mount point on the target system.

    • Promote the last good Storage Checkpoint as displayed by the vfradmin getjobckpt command:

      # /opt/VRTS/bin/fsckpt_restore device_file checkpoint_name

      For example:

      # /opt/VRTS/bin/fsckpt_restore /dev/vx/dsk/replicatedg/target2 \
      vxfsrepl_ckpt_877167997_12Sep11_14_59
    • Mount the file system:

      # mount -t vxfs device_file target_mntpt
    • Rename the Storage Checkpoint to "filesystem_root" or any other name:

      # /opt/VRTS/bin/fsckptadm rename old_checkpoint_name \
      new_checkpoint_name target_mntpt

      For example:

      # /opt/VRTS/bin/fsckptadm rename \
      vxfsrepl_ckpt_877167997_12Sep11_14_59 filesystem_root /target2

    For more information, see the fsckpt_restore (1M) and fsckptadm (1M) manual pages.

  2. Delete the old job from the target machine using the vfradmin destroyjob command. Note that this removes the Storage Checkpoint created by the replication job, which is why it is important to rename the checkpoint as described in Step 1.
  3. Create a new replication job on the new source and target systems, where the old target system is the new source system.
  4. Start the replication job in one-time sync mode:
    # vfradmin syncjob name mntpt

    This command populates the new file system with the last known good image of the Storage Checkpoint.

  5. After the sync is complete, applications can begin using the new source file system. The newly created replication job can be started after changing the job direction.
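
The checkpoint promotion in Step 1 can be sketched as a small dry-run helper that only prints the commands in the order they would be run, so the sequence can be reviewed before anything is unmounted or restored. This is a minimal sketch, not part of VFR or VxFS: the function name promote_plan and its argument order are hypothetical, and the default new name filesystem_root is taken from the example in Step 1. The commands it prints are the ones shown in the procedure above.

```shell
#!/bin/sh
# Dry-run sketch: print the Step 1 commands in order instead of running them.
# promote_plan is a hypothetical helper name, not a VFR or VxFS command.
promote_plan() {
    device=$1                        # for example /dev/vx/dsk/replicatedg/target2
    mntpt=$2                         # mount point on the target system
    ckpt=$3                          # name reported by vfradmin getjobckpt
    newname=${4:-filesystem_root}    # name that survives vfradmin destroyjob

    if [ -z "$device" ] || [ -z "$mntpt" ] || [ -z "$ckpt" ]; then
        echo "usage: promote_plan device_file target_mntpt checkpoint_name [new_name]" >&2
        return 1
    fi

    # Unmount, promote the checkpoint, remount, then rename the checkpoint.
    echo "umount $mntpt"
    echo "/opt/VRTS/bin/fsckpt_restore $device $ckpt"
    echo "mount -t vxfs $device $mntpt"
    echo "/opt/VRTS/bin/fsckptadm rename $ckpt $newname $mntpt"
}

# Print the plan for the example values used in Step 1.
promote_plan /dev/vx/dsk/replicatedg/target2 /target2 \
    vxfsrepl_ckpt_877167997_12Sep11_14_59
```

Printing rather than executing is a deliberate choice here, because promoting a Storage Checkpoint discards later changes. Review the printed commands, then run them by hand as root.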

If a disaster leaves the source file system intact and the target file system was not used during the disaster, the replication job can be started again and it continues replicating changes to the target system. This is similar to a normal system restart.