
Minimal downtime upgrade example

In this example, you have four nodes: node01, node02, node03, and node04. You also have four service groups: sg1, sg2, sg3, and sg4. Each service group is running on one node.

In your system list, you have each service group failing over to one other node.

Four nodes, four service groups, and their failover paths


Minimal downtime example overview

This example presumes that you have at least one service group (in this case sg3) that cannot stay online during part of the upgrade. In this situation, it is best if sg3 is a low-priority service group. The cluster is split with node02 and node03 together for the first upgrade, and node01 and node04 together for the next upgrade.

You switch sg1 and sg2 to run on node01, and sg4 to run on node04. You then perform the upgrade on node02 and node03. When you finish the upgrade on node02 and node03, you upgrade node01 and node04.

Your cluster is down during the window after you stop HAD on node01 and node04 and before you start HAD on node02 and node03.

You have to take your service groups offline manually on node01 and node04. When you start node02 and node03, the service groups come online there. Reboot node01 and node04 when the upgrade completes. They then rejoin the cluster, and you can rebalance the load by switching service groups between nodes.

Performing the minimal downtime example upgrade

This upgrade uses four nodes with four service groups. Note that in this scenario the service groups cannot all stay online for the entire upgrade. Do not add, remove, or change resources or service groups on any node during the upgrade; any such changes are likely to be lost after the upgrade.

 To establish running service groups

  1. Establish where your service groups are online.

      # hagrp -state

      #Group  Attribute        System  Value
       sg1    State            node01  |OFFLINE|
       sg1    State            node02  |ONLINE|
       sg2    State            node01  |OFFLINE|
       sg2    State            node03  |ONLINE|
       sg3    State            node01  |OFFLINE|
       sg3    State            node04  |ONLINE|
       sg4    State            node03  |ONLINE|
       sg4    State            node04  |OFFLINE|

  2. Switch the service groups from all the nodes that you are first upgrading (node02 and node03) to the remaining nodes (node01 and node04).

      # hagrp -switch sg1 -to node01

      # hagrp -switch sg2 -to node01

      # hagrp -switch sg4 -to node04

  3. Verify that your service groups are offline on the nodes targeted for upgrade.

      # hagrp -state

      #Group  Attribute        System  Value
       sg1    State            node01  |ONLINE|
       sg1    State            node02  |OFFLINE|
       sg2    State            node01  |ONLINE|
       sg2    State            node03  |OFFLINE|
       sg3    State            node01  |OFFLINE|
       sg3    State            node04  |ONLINE|
       sg4    State            node03  |OFFLINE|
       sg4    State            node04  |ONLINE|
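The verification in step 3 can be scripted. The sketch below is illustrative: the sample transcript stands in for live `hagrp -state` output (the real command only runs on a VCS node), and awk flags any group still ONLINE on node02 or node03. Empty output means those nodes are safe to upgrade.

```shell
# Sample stand-in for `hagrp -state` output after the switch.
# On a live node you would pipe the real command instead.
state_output='sg1 State node01 |ONLINE|
sg1 State node02 |OFFLINE|
sg2 State node01 |ONLINE|
sg2 State node03 |OFFLINE|
sg3 State node01 |OFFLINE|
sg3 State node04 |ONLINE|
sg4 State node03 |OFFLINE|
sg4 State node04 |ONLINE|'

# Flag any group still online on the nodes targeted for upgrade.
printf '%s\n' "$state_output" |
    awk '($3 == "node02" || $3 == "node03") && $4 == "|ONLINE|" {
        print $1, "is still online on", $3
    }'
```

On a live node, `hagrp -state | awk '...'` with the same pattern performs the check directly.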

During the next procedure, do not perform any configuration tasks. Do not start any modules.

 To perform the minimum downtime upgrade on target nodes

  1. On the target nodes, start the 5.0 installer for VCS.
  2. Select the VCS installation.
  3. Answer n when the installer asks:

      Do you want to upgrade to version 5.0 on these systems using 
      the current configuration? [y,n,q,?] (y) n

  4. Answer with the names of the nodes that you want to upgrade:

      Enter the system names separated by spaces on which to install 
      VCS: node02 node03

  5. Select either option 1 or 2 when the installer asks:

      Select the packages to be installed on all systems? 2

  6. Answer n when the installer completes and asks:

      Do you want to start Veritas Cluster Server processes now? 
      [y,n,q] (y) n

 To edit the configuration and prepare to upgrade node01 and node04

  1. When HAD is down on node02 and node03, you see this message:

      Shutdown completed successfully on all systems.

  2. After you see the above message, make the VCS configuration writable on node01 or node04. You need to make the configuration writable so that you can unfreeze the service groups, which the installer froze during the upgrade.

      # haconf -makerw

  3. Unfreeze all service groups.

      # hagrp -unfreeze sg1 -persistent

      # hagrp -unfreeze sg2 -persistent

      # hagrp -unfreeze sg3 -persistent

      # hagrp -unfreeze sg4 -persistent

  4. Dump the configuration and make it read-only.

      # haconf -dump -makero
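The four `hagrp -unfreeze` commands above follow one pattern, so a loop can generate them. This sketch only prints the commands so the loop can be checked safely off-cluster; on a live node you would run each `hagrp` command directly (the group names are the ones from this example).

```shell
# Build the unfreeze command for every service group in the example.
# Printing instead of executing keeps this runnable off-cluster.
unfreeze_cmds=$(for sg in sg1 sg2 sg3 sg4; do
    echo "hagrp -unfreeze $sg -persistent"
done)
printf '%s\n' "$unfreeze_cmds"
```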

 To edit the configuration on node02 and node03

  1. Open the main.cf file, and delete the Frozen = 1 line for each service group as appropriate.
  2. Save and close the file.
  3. Reboot node02 and node03.
  4. Wait for GAB to come up. In the console's output, look for a line that reads:

      Starting GAB is done.
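Step 1 above can also be scripted with sed. The fragment below is a hypothetical main.cf excerpt; on a real node the file typically lives at /etc/VRTSvcs/conf/config/main.cf, the exact spacing of the Frozen attribute may differ in your configuration, and the `.bak` suffix keeps a backup copy of the original file.

```shell
# Hypothetical main.cf fragment with frozen service groups.
cat > /tmp/main.cf.sample <<'EOF'
group sg1 (
    SystemList = { node01 = 0, node02 = 1 }
    Frozen = 1
    )
group sg2 (
    SystemList = { node01 = 0, node03 = 1 }
    Frozen = 1
    )
EOF

# Delete every line that sets Frozen = 1; keep a .bak backup.
sed -i.bak '/Frozen = 1/d' /tmp/main.cf.sample
```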

 To upgrade and restart your clusters

  1. On node01 and node04, take the service groups offline.

      # hagrp -offline sg1 -sys node01

      # hagrp -offline sg2 -sys node01

      # hagrp -offline sg3 -sys node04

      # hagrp -offline sg4 -sys node04

  2. On node01 and node04, perform the upgrade.

    See To perform the minimum downtime upgrade on target nodes.

  3. When HAD is down on node01 and node04, you see this message:

      Shutdown completed successfully on all systems.

  4. On node02 and node03, start fencing by running the vxfenconfig command.

      # vxfenconfig -c

  5. Start your cluster on node02 and node03.

      # hastart

  6. After the upgrade completes, reboot node01 and node04.

After you have rebooted the nodes, all four nodes now run the latest version of VCS.

In this example, you achieved minimal downtime because your service groups were down only from the time you took them offline on node01 and node04 until VCS brought them online on node02 or node03, as appropriate.