vcs-aix-GAB-LLT-5.0MP1EXT-HF3

 Basic information
Release type: Hot Fix
Release date: 2009-06-29
OS update support: None
Technote: 327355
Documentation: None
Download size: 3.43 MB
Checksum: 2696203557

 Applies to one or more of the following products:
Cluster Server 5.0MP1 On AIX 6.1
Storage Foundation Cluster File System 5.0MP1 On AIX 6.1
Storage Foundation for DB2 5.0MP1 On AIX 6.1
Storage Foundation for Oracle 5.0MP1 On AIX 6.1
Storage Foundation for Oracle RAC 5.0MP1 On AIX 6.1
Storage Foundation HA 5.0MP1 On AIX 6.1

 Obsolete patches, incompatibilities, superseded patches, or other requirements:
None.

 Fixes the following incidents:
1233409, 1274390, 1717248

 Patch ID:
VRTSgab.rte-05.00.0001.0101
VRTSllt.rte-05.00.0001.0202

Readme file
OS: AIX
OS Version: 6.1
Etrack Incidents: 1233409, 1274390, 1717248
Fixes Applied for Products:
    VRTSllt - Veritas Low Latency Transport by Symantec
    VRTSgab - Veritas Group Membership and Atomic Broadcast by Symantec

Additional Instructions:
Please read the instructions below before installing the patch.

    PATCH 5.0MP1EXT+e1712755 for VERITAS Low Latency Transport 
      & VERITAS Group Membership and Atomic Broadcast
===============================================================

                  Patch Date:  June 2009

This README provides information on:
   * BEFORE GETTING STARTED
   * CRC AND BYTE COUNT
   * FIXES AND ENHANCEMENTS INCLUDED IN THIS PATCH
   * PACKAGES AFFECTED BY THIS PATCH
   * INSTALLING THE PATCH(S) IN VCS ENVIRONMENT
   * UNINSTALLING THE PATCH(S) IN VCS ENVIRONMENT
   * INSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT
   * UNINSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT


BEFORE GETTING STARTED
----------------------
This patch applies only to:
	VRTSllt 5.0-MP1 EXT running on AIX 6.1
    and
	VRTSgab 5.0-MP1 EXT running on AIX 6.1

Ensure that you are running the supported configurations before
installing this patch.

CRC AND BYTE COUNT
------------------
Ensure that the file you have downloaded matches the following
checksum and byte count.
The following command can be used to ascertain this:

# cksum VRTSllt.rte.bff
1385450038      3840000 VRTSllt.rte.bff
# cksum VRTSgab.rte.bff
1167228987      8550400 VRTSgab.rte.bff
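
The comparison above can also be scripted. The following is a minimal
sketch, not part of the patch: verify_cksum is a hypothetical helper
name, and it assumes a POSIX cksum that prints the CRC and the byte
count as its first two fields.

```shell
#!/bin/sh
# verify_cksum FILE EXPECTED_CRC EXPECTED_BYTES
# Hypothetical helper (not shipped with the patch). Returns 0 when both
# the CRC and the byte count reported by cksum match the expected values,
# 1 on a mismatch, and 2 if the file cannot be read.
verify_cksum() {
    file=$1
    want_crc=$2
    want_bytes=$3
    # Reading from stdin makes cksum print only "<crc> <bytes>".
    out=$(cksum < "$file") || return 2
    # Split the output on whitespace into $1 (CRC) and $2 (byte count).
    set -- $out
    [ "$1" = "$want_crc" ] && [ "$2" = "$want_bytes" ]
}
```

Possible usage, with the values listed above:
	verify_cksum VRTSllt.rte.bff 1385450038 3840000 && echo "LLT bff OK"
	verify_cksum VRTSgab.rte.bff 1167228987 8550400 && echo "GAB bff OK"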

FIXES AND ENHANCEMENTS INCLUDED IN THIS PATCH:
---------------------------------------------
e1717248	Under heavy I/O load on SFHA/SFCFS-HA/SFRAC, one or more 
		nodes may halt due to an abend exception with the following 
		stack trace. The problem is more likely to occur with AIX 6.1 
		on P6 with storage keys enabled, but it cannot be ruled out 
		when storage keys are disabled.
		Stack Trace:
			[0001AD40]abend_trap+000000 ()
			[0007E5B8]tstart+000558 (??)
			[00014F50].kernel_add_gate_cstack+000030 ()
			[F1000000A02A6044].llt_aix_timeout+0000E4 ()
			[F1000000A02AFD64].llt_timer_handler+000470 ()
			[F1000000A02A62F0].llt_timer_procfunc+0000A0 ()
			[00014D70].hkey_legacy_gate+00004C ()
			[001A2DD0]procentry+000010 (??, ??, ??, ??)

		LLT and GAB internal timer implementations use the tstart() 
		kernel service to submit timer requests, with the timer 
		request block as input. The tstart() implementation 
		keeps information about the timer handler from the timer 
		request block. If LLT or GAB submits another timer request 
		through tstart() before the previous timer handler has 
		completed, tstart() may access the previous handler's stale 
		values, leading to an abend exception and system panic.

		The race condition is avoided by modifying the LLT and GAB drivers 
		to call tstop() before issuing tstart() in their timer implementations.

e1233409        In a cluster setup, the Veritas Low Latency Transport (LLT) driver
                is used for communication. LLT communicates with the AIX DLPI
                driver to send and receive network packets on the physical
                network. The upcalls from the DLPI driver to LLT used to always
                be made in process context. With recent changes in the AIX DLPI
                driver, calls to LLT now arrive in interrupt context. This causes
                a panic or hang in the LLT driver or in clients of LLT such as GAB.
                This patch makes LLT interrupt safe and makes the calls to clients
                of LLT in process context.
                The known AIX APARs that change the behavior of the DLPI driver
                and cause a panic in LLT or GAB are:

                5200-10 - AIX APAR IZ19838
                5300-06 - AIX APAR IZ05430
                5300-07 - AIX APAR IZ11726
                5300-08 - AIX APAR IZ09036
                6100-00 - AIX APAR IZ13304

                To find out whether your system has an APAR that changes the DLPI
                behavior, run the instfix command with the APAR number, or grep
                for the string "BRING DLPI DRIVER "TO SPEC"". For example:
                # instfix  -iv | grep "BRING DLPI DRIVER \"TO SPEC\""
                IZ11726 Abstract: BRING DLPI DRIVER "TO SPEC"
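
The same check can be wrapped in a small filter that prints only the
APAR numbers. This is a sketch under the assumption that instfix -iv
prints lines of the form shown above (APAR number first, then
"Abstract:"); dlpi_apars is a hypothetical name, not a shipped tool.

```shell
#!/bin/sh
# dlpi_apars: read `instfix -iv` output on stdin and print the APAR
# number (first field) of every line whose abstract is the DLPI change.
# Hypothetical helper; the line layout is assumed from the example above.
dlpi_apars() {
    awk '/BRING DLPI DRIVER "TO SPEC"/ { print $1 }'
}
```

Possible usage on a node:
	# instfix -iv | dlpi_apars
An empty result suggests that none of the listed APARs is installed.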

e1294686        LLT-DLPI changes can cause a hang on a single-CPU machine when the
                thread holding a lock is swapped off the CPU and another thread
                spins on the CPU waiting for the same lock. The locking mechanism
                in LLT has been changed to address this.

PACKAGES AFFECTED BY THIS PATCH:
-------------------------------
This patch updates the following VCS packages 
	VRTSllt.rte.bff fileset to 5.0.1.202 level
and 
	VRTSgab.rte.bff fileset to 5.0.1.101 level


INSTALLING THE PATCH(S) IN VCS ENVIRONMENT :
-----------------------------------------
The following steps should be run on all nodes in the VCS cluster:

Stopping a node :
---------------
1. Take offline all applications that are configured on CVM/CFS 
   and are outside VCS control.

    After all applications using CFS and CVM have been taken down,
    run 'slibclean' to unload the libraries from memory.

2. Stop VCS on the current node.
	# /opt/VRTSvcs/bin/hastop -local 
   Verify that ports 'f' (CFS), 'v' and 'w' (CVM), 'h' (VCS) have been closed,
	# /sbin/gabconfig -a
   The display should not have port 'f', 'v', 'w' and 'h' listed	

3. If VXFEN is not configured, please go to step 5

4. Unconfigure VxFen:
	#  /sbin/vxfenconfig -U
   Verify that port 'b' has been closed
	# /sbin/gabconfig -a
   The display should not have port 'b' listed	

5. Unconfigure GAB:
	# /sbin/gabconfig -U

6. Unconfigure LLT:
	# /sbin/lltconfig -Uo

7. Unload the GAB driver:
	# /etc/methods/gabkext -stop

   Unload the LLT driver:
	# /usr/sbin/strload -ud /usr/lib/drivers/pse/llt

8. Verify that the LLT driver has been unloaded
	# /usr/sbin/strload -qd /usr/lib/drivers/pse/llt
	/usr/lib/drivers/pse/llt: no
   If llt is still loaded, "yes" appears in the output above.

   Verify that the GAB driver has been unloaded:
	# /etc/methods/gabkext -status
	gab: unloaded

NOTE: If you are unable to successfully unload either the GAB or LLT driver,
the server must be rebooted AFTER the installation of the patches. 
This ensures that the new drivers get loaded in the AIX kernel.
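
The port checks in the steps above can be scripted. The sketch below
assumes that gabconfig -a lists each open port on a line beginning
with "Port <letter>"; open_ports and ports_closed are hypothetical
helper names, not part of VCS.

```shell
#!/bin/sh
# open_ports: read `gabconfig -a` output on stdin and print the letters
# of all listed ports on one line (each letter once).
# Assumes lines of the form "Port a gen ... membership ...".
open_ports() {
    awk '$1 == "Port" && !seen[$2]++ { printf "%s", $2 } END { print "" }'
}

# ports_closed LETTERS: read `gabconfig -a` output on stdin and succeed
# only if none of the given port letters is listed.
ports_closed() {
    open=$(open_ports)
    case $open in
        *[$1]*) return 1 ;;   # at least one of the ports is still open
    esac
    return 0
}
```

Possible usage after stopping VCS:
	# /sbin/gabconfig -a | ports_closed fvwh && echo "ports closed"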


Installing Patch(s) :
-------------------
1.  Change directory to the patch location and gunzip 
    the VRTS*.bff.gz files. Install the LLT & GAB patch from 
    the bff files from the same location:
	# installp -a -d ./VRTSllt.rte.bff VRTSllt.rte
	# installp -a -d ./VRTSgab.rte.bff VRTSgab.rte

2.  Verify that the new fileset has been installed:
	# lslpp -l VRTSllt.rte
VRTSllt.rte                5.0.1.202  APPLIED  Veritas Low Latency Transport
                                               by Symantec
	# lslpp -l VRTSgab.rte
VRTSgab.rte                5.0.1.101  APPLIED  Veritas Group Membership and
                                               Atomic Broadcast by Symantec
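
The level check can be automated by filtering the lslpp -l output.
This is a minimal sketch that assumes the column layout shown above
(fileset, level, state); fileset_state is a hypothetical helper name.

```shell
#!/bin/sh
# fileset_state FILESET LEVEL: read `lslpp -l` output on stdin and print
# the state (for example APPLIED or COMMITTED) of the first line that
# matches the given fileset name and level. Exits non-zero if no such
# line is found. Hypothetical helper; column order is assumed.
fileset_state() {
    awk -v fs="$1" -v lvl="$2" '
        $1 == fs && $2 == lvl { print $3; found = 1; exit }
        END { exit !found }'
}
```

Possible usage:
	# lslpp -l VRTSllt.rte | fileset_state VRTSllt.rte 5.0.1.202
	# lslpp -l VRTSgab.rte | fileset_state VRTSgab.rte 5.0.1.101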

Re-starting the node :
--------------------
1. Verify that the new LLT driver has been loaded:
	# strload -qd /usr/lib/drivers/pse/llt
	/usr/lib/drivers/pse/llt: yes
   Verify that the new GAB driver has been loaded:
	# /etc/methods/gabkext -status
	gab: loaded

2. If not already loaded, load the newly installed LLT driver:
	# strload -d /usr/lib/drivers/pse/llt
   If not already loaded, load the newly installed GAB driver:
	# /etc/methods/gabkext -start

3. Configure LLT:
	# /sbin/lltconfig -c

4. Verify that LLT has been configured properly
	# /sbin/lltconfig
	LLT is running

5. Configure GAB:
	# sh /etc/gabtab

6. Verify that the GAB membership shows up correctly:
	# /sbin/gabconfig -a
   The display should have Port 'a' listed

7. Configure VxFen (if VxFEN was configured previously)
	# /sbin/vxfenconfig -c
   Verify that vxfen has been configured
	# /sbin/gabconfig -a
   The output should list port 'b'

8. Start VCS:
	# /opt/VRTSvcs/bin/hastart
   Verify that VCS is up and running:
	# /sbin/gabconfig -a
   The display should show port 'f', 'v', 'w' and 'h' listed.
   The 'f', 'v' and 'w' port will be listed if CVM and CFS are configured.

9.  Start applications (stopped earlier), which are outside VCS control.

Committing the Patch(s) :
----------------------
1. To commit the patch:
(Note: The patch cannot be backed out once it is committed.)
	# installp -c VRTSllt.rte
	# installp -c VRTSgab.rte

2.  Verify that the fileset is committed:
	# lslpp -l VRTSllt.rte
VRTSllt.rte                5.0.1.202  COMMITTED  Veritas Low Latency Transport
                                                 by Symantec
	# lslpp -l VRTSgab.rte
VRTSgab.rte                5.0.1.101  COMMITTED  Veritas Group Membership and
                                                 Atomic Broadcast by Symantec

UNINSTALLING THE PATCH(S) IN VCS ENVIRONMENT :
---------------------------------------------
The VRTSllt.rte.bff & VRTSgab.rte.bff patch can ONLY be backed out 
if it has NOT been committed.

NOTE: Before uninstalling the patch, make sure that no APAR changing 
DLPI behavior is installed on the system by running the 
following command:
  # instfix  -iv | grep "BRING DLPI DRIVER \"TO SPEC\""

If the command returns an APAR, backing out this point patch 
will revert LLT to the older version, which will 
cause a panic or hang.

Steps to Backout the Patch(s) :
-----------------------------
1. Follow the steps provided under the "Stopping a node" section above
   to stop the node & unload the drivers.

2. Back out the patches with the following commands:
	# installp -r VRTSllt.rte 5.0.1.202
	# installp -r VRTSgab.rte 5.0.1.101

3. Verify that the patch has been backed out:
(Note: The previously installed fileset(s) will be in committed state again.
       It may differ from the mentioned, if a Hotfix was installed on top 
       of VCS 5.0MP1 EXT)
	# lslpp -l VRTSllt.rte
VRTSllt.rte                5.0.1.100  COMMITTED  Veritas Low Latency Transport
                                                 by Symantec
	# lslpp -l VRTSgab.rte
VRTSgab.rte                5.0.1.100  COMMITTED  Veritas Group Membership and
                                                 Atomic Broadcast by Symantec

4. Restart the node following the steps under the 
   "Re-starting the node" section above.

 Note: The llt & gab drivers that are loaded will now be the old versions.


INSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT:
--------------------------------------------
The following steps should be run on all nodes in the cluster,
with SFRAC stack installed:

1. Take offline all applications that are configured on CVM/CFS 
   and are outside VCS control.

2. If the Oracle database is not configured in VCS, stop it using the following command:
	$ srvctl stop instance -d <database name> -i <instance name>

3(a). For Oracle 9iR2, stop 'gsd' using the following command as the Oracle user:
   	$ gsdctl stop
   To check the status of gsdctl, run the following command:
	$ gsdctl stat
   The gsdctl command is typically found in $ORACLE_HOME/bin.

3(b). For Oracle 10gR1 and 10gR2, stop CRS manually, 
      if CRS is not under VCS control.
         # /etc/init.crs stop

4. After all the oracle instances and other applications using 
   CFS and CVM have been stopped, run 'slibclean' to unload 
   the libraries from memory.

5. Stop VCS on the current node.
	# /opt/VRTSvcs/bin/hastop -local 

6. Verify that ports 'h', 'f', 'v' and 'w' have been closed
	# /sbin/gabconfig -a
   The display should not have ports 'h', 'f', 'v' and 'w' listed

7. Unconfigure VCSMM:
	# /sbin/vcsmmconfig -U
   Verify that port 'o' has been closed
	# /sbin/gabconfig -a
   The display should not have port 'o' listed.
   If it does, ensure that the Oracle instances are offline.

8. Unconfigure LMX:
	# /sbin/lmxconfig -U

9. Unconfigure VxFen:
	# /sbin/vxfenconfig -U
    Verify that port 'b' has been closed
	# /sbin/gabconfig -a
    The display should not have port 'b' listed

10. Unmount ODM:
	# umount /dev/odm
    Verify that port 'd' has been closed
	# /sbin/gabconfig -a
    The display should not have port 'd' listed

11. At this point all gab ports except port 'a' should have been closed
    Verify this as follows:
	# /sbin/gabconfig -a

12. Follow steps 5 to 8 of the "Stopping a node" section from 
    "INSTALLING THE PATCH(S) IN VCS ENVIRONMENT" chapter above.

13. Follow all the instructions in the "Installing Patch(s)" section
    from "INSTALLING THE PATCH(S) IN VCS ENVIRONMENT" chapter above.

14. Follow steps 1 to 7 of the "Re-starting the node" section from 
    "INSTALLING THE PATCH(S) IN VCS ENVIRONMENT" chapter above.

15. Configure LMX:
	# /sbin/lmxconfig -c

16. Configure VCSMM:
	# /sbin/vcsmmconfig -c
    Verify that VCSMM has been configured
	# /sbin/gabconfig -a
    The output should list port 'o'

17. Mount ODM:
	# mount /dev/odm

18. Start VCS:
	# /opt/VRTSvcs/bin/hastart

19. Check if all ports are now open
	# /sbin/gabconfig -a
    The output should list ports 
    'a', 'b', 'd', 'f', 'h', 'o', 'v', and 'w'.

20(a). For Oracle 10gR1 and 10gR2, start CRS manually,
       if CRS is not under VCS control.
         # /etc/init.crs start

20(b). For Oracle 9iR2, start 'gsd' using the following command as the Oracle user:
   	$ gsdctl start
   To check the status of gsdctl, run the following command:
	$ gsdctl stat
   The gsdctl command is typically found in $ORACLE_HOME/bin.

21. If the Oracle database is not configured in VCS, 
    start it using the following command:
	$ srvctl start instance -d <database name> -i <instance name>

22. Bring online all applications that are configured on CVM/CFS 
    and are outside VCS control (stopped earlier).

23. To commit the patches, follow the "Committing the Patch(s)" section from 
    "INSTALLING THE PATCH(S) IN VCS ENVIRONMENT" chapter above.

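The final port check (all of 'a', 'b', 'd', 'f', 'h', 'o', 'v' and 'w'
open) can be scripted as well. The sketch below assumes that
gabconfig -a lists each open port on a line beginning with
"Port <letter>"; missing_ports is a hypothetical helper name.

```shell
#!/bin/sh
# missing_ports LETTERS: read `gabconfig -a` output on stdin and print
# those of the expected port letters that are NOT listed. An empty line
# means that every expected port is open.
# Assumes lines of the form "Port a gen ... membership ...".
missing_ports() {
    awk -v want="$1" '
        $1 == "Port" { open[$2] = 1 }
        END {
            out = ""
            for (i = 1; i <= length(want); i++) {
                p = substr(want, i, 1)
                if (!(p in open)) out = out p
            }
            print out
        }'
}
```

Possible usage after starting VCS:
	# /sbin/gabconfig -a | missing_ports abdfhovw
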
UNINSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT:
----------------------------------------------
The VRTSllt.rte.bff & VRTSgab.rte.bff patch can ONLY be 
backed out if it has not been committed.

NOTE: Before uninstalling the patch, make sure that no APAR 
changing DLPI behavior is installed on the system by 
running the following command:
   # instfix  -iv | grep "BRING DLPI DRIVER \"TO SPEC\""

If the command returns an APAR, backing out this point patch 
will revert LLT to the older version, which will 
cause a panic or hang.

Steps to Backout the Patch:
1.  Follow steps 1 through 12 of chapter 
    "INSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT"
    to stop the node and unload the drivers.

2. Back out the patches with the following commands:
	# installp -r VRTSllt.rte 5.0.1.202
	# installp -r VRTSgab.rte 5.0.1.101

3. Verify that the patch has been backed out:
(Note: The previously installed fileset(s) will be in committed state again.
       It may differ from the mentioned, if a Hotfix was installed on top 
       of VCS 5.0MP1 EXT)
	# lslpp -l VRTSllt.rte
VRTSllt.rte                5.0.1.100  COMMITTED  Veritas Low Latency Transport
                                                 by Symantec
	# lslpp -l VRTSgab.rte
VRTSgab.rte                5.0.1.100  COMMITTED  Veritas Group Membership and
                                                 Atomic Broadcast by Symantec

4. Next, as before, go through the process of loading and configuring
   LLT and GAB and bringing up SFRAC (steps 14 through 22 of 
   chapter "INSTALLING THE PATCH(S) IN SFRAC ENVIRONMENT" above).

 Note: The llt & gab drivers that are loaded will now be the old versions.