OS: Linux
OS Version: SLES10 (x86_64)
Fixes Applied for Products: VRTSllt - Veritas Low Latency Transport by Symantec
Additional Instructions: Please read the instructions below before installing the patch.

PATCH LLT 5.0MP3RP2HF2 for VRTSllt on 5.0MP3RP2
======================================================================

Patch Date: September, 2009

This README provides information on:

* BEFORE GETTING STARTED
* CRC AND BYTE COUNT
* FIXES AND ENHANCEMENTS INCLUDED IN THE PATCH
* PACKAGES AFFECTED BY THE PATCH
* INSTALLING THE PATCH
* UNINSTALLING THE PATCH

BEFORE GETTING STARTED:
----------------------
This patch applies only to:
    VRTSllt 5.0MP3RP2 running on Linux (SLES10 x86_64)

Ensure that you are running a supported configuration before installing this patch.

CRC AND BYTE COUNT:
------------------
Ensure that the patch file you have downloaded matches the following checksum and byte count. Use the following command to verify this:

    # cksum VRTSllt-5.0.30.22-MP3RP2HF2_SLES10.x86_64.rpm
    3980824752 4918722 VRTSllt-5.0.30.22-MP3RP2HF2_SLES10.x86_64.rpm

FIXES AND ENHANCEMENTS INCLUDED IN THE PATCH:
--------------------------------------------
Etrack Incidents: 1831925, 1702445, 1749396, 1631012

SDRs of Fixed Symantec Incidents:
--------------------------------
Symantec Incident : 1831925
Symptom:
    An LLT node (say A) misses heartbeats from a peer node (say B) and declares node B dead. However, node B is not actually dead; it has merely been unable to send heartbeats to node A.
Defect Description:
    Node B is very busy and cannot schedule OS timeouts fast enough. This delays the heartbeats that node B sends out.
Resolution:
    Various LLT threads now check whether the LLT timer handler has executed within the last few seconds (LLT runtime tunable: timetosendhb). If it has not, they send out heartbeats immediately.
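
The resolution for incident 1831925 amounts to a staleness check on the timer handler. The following is a minimal sketch of that idea only; it is not LLT source code. The function name hb_overdue and its arguments are hypothetical, and it assumes that the timestamps and timetosendhb are expressed in the same time unit.

```shell
#!/bin/sh
# Illustrative sketch only: this is NOT LLT source code.
# hb_overdue, now and last_timer_run are hypothetical names.
# Assumption: all three values use the same time unit.

# hb_overdue NOW LAST_TIMER_RUN TIMETOSENDHB
# Succeeds (exit status 0) when the timer handler has not run within
# the timetosendhb window, i.e. a heartbeat should be sent immediately.
hb_overdue() {
    now=$1
    last_timer_run=$2
    timetosendhb=$3
    [ $((now - last_timer_run)) -ge "$timetosendhb" ]
}

# Example: the handler last ran at t=100, the window is 2, and it is now t=105.
if hb_overdue 105 100 2; then
    echo "timer handler stalled: send heartbeat now"
else
    echo "timer handler on schedule"
fi
```

In this sketch any thread can call the check cheaply, so a stalled timer on a busy node no longer delays heartbeats until the next OS timeout fires.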

Symantec Incident : 1702445
Symptom:
    When one node is going down while another is coming up, it is possible that GAB induces a panic on a running node with a message similar to the following:
        "Port b halting system due to network failure"
Description:
    After LLT opens a node-level connection with a peer node, it takes some time before LLT marks its links as UP for that peer node. If the peer node goes down before the links are marked UP for it, the local LLT node cannot detect the peer node's death. This causes an inconsistency in the GAB membership calculation and can lead to GAB panicking one of the running nodes.
Resolution:
    The window in which LLT has opened a connection with a peer node, but has not yet marked the links UP for that peer node, is now closed.

Symantec Incident : 1749396
Symptom:
    When LLT and GAB are started simultaneously on both nodes of a two-node cluster, GAB shows jeopardy even if two links are configured under LLT.
Description:
    During LLT startup, LLT sends a broadcast packet that is used by the duplicate node-id detection mechanism. This packet contains link information for the peer nodes to consume. At startup time, this link information can be incorrect. When the peer node receives this packet, it learns the potentially incorrect link information from it. This causes incorrect notification of link state and can cause GAB to report jeopardy membership even though more than one link has been configured under LLT.
Resolution:
    The broadcast packet sent at startup time now has the link information set to an invalid value instead of a potentially incorrect value. Hence, when the peer node receives this packet, it does not consume the invalid link information. This has no adverse effect on the duplicate node-id detection mechanism.

Symantec Incident : 1631012
Symptom:
    LLT over UDP gets configured successfully even if the IP addresses specified for the LLT links are not plumbed on the system.
Description:
    During LLT configuration, LLT does not validate the IP addresses specified in the file /etc/llttab. If some other IP address (not defined for LLT) is plumbed on the system, LLT comes up without any error.
Resolution:
    Checks have been added to make sure that LLT gets configured only if the IP addresses specified in /etc/llttab are actually plumbed on some network interface.

PACKAGES AFFECTED BY THE PATCH:
------------------------------
This patch updates the following VCS package:
    VRTSllt from 5.0MP3RP2 (5.0.30.20) to 5.0MP3RP2HF2 (5.0.30.22)

INSTALLING THE PATCH:
--------------------
This patch must be installed after installing Veritas Cluster Server 5.0MP3RP2.

The following steps should be run on all nodes in the VCS cluster:

Stopping VCS on the cluster node:
--------------------------------
1. Log on as superuser on the system on which the point patch is to be installed.

2. Persistently freeze that node in VCS:
    # /opt/VRTSvcs/bin/haconf -makerw
    # /opt/VRTSvcs/bin/hasys -freeze [nodename] -persistent
    # /opt/VRTSvcs/bin/haconf -dump -makero

3. Stop all clients of GAB. For example:
    a) For VCS, run the following command:
        # /etc/init.d/vcs stop
    b) For VxFEN, run the following command:
        # /etc/init.d/vxfen stop
   Take care to also shut down any components dependent on these clients, and to observe the correct order.

4. Stop GAB:
    # /etc/init.d/gab stop

5. Stop LLT:
    # /etc/init.d/llt stop

Installing the Patch:
--------------------
1. Uncompress the patch downloaded from Symantec. Change directory to the unzipped patch location. Install the VRTSllt 5.0.30.22 patch using the following command:
    # rpm -Uvh VRTSllt-5.0.30.22-MP3RP2HF2_SLES10.x86_64.rpm

2. Verify that the new patch has been installed:
    # rpm -q VRTSllt
   If the patch has been installed properly, the following output is displayed:
    VRTSllt-5.0.30.22-MP3RP2HF2

Re-starting VCS on the cluster node:
-----------------------------------
1. Start LLT by using the following command:
    # /etc/init.d/llt start

2. Start GAB by using the following command:
    # /etc/init.d/gab start

3. Start all the stopped GAB clients. For example:
    a) For VxFEN, run the following command:
        # /etc/init.d/vxfen start
    b) For VCS, run the following command:
        # /etc/init.d/vcs start
   Take care to also bring up any components that are dependent on the clients that were stopped in step 3 of "Stopping VCS on the cluster node", and to observe the correct order.

4. Persistently unfreeze the system in VCS:
    # /opt/VRTSvcs/bin/haconf -makerw
    # /opt/VRTSvcs/bin/hasys -unfreeze [nodename] -persistent
    # /opt/VRTSvcs/bin/haconf -dump -makero

UNINSTALLING THE PATCH:
----------------------
Uninstall the LLT patch from each node by following the steps given below:

Steps to remove the Patch from a cluster node:
---------------------------------------------
1. Follow the steps in the "Stopping VCS on the cluster node" section above to stop VCS on the node and unload any drivers, as required.

2. Remove the patch by using the following command:
    # rpm -e --nodeps VRTSllt

3. Verify that the patch has been removed from the system:
    # rpm -q VRTSllt
   You should see output similar to:
    package VRTSllt is not installed

4. Install the VRTSllt package from the VCS 5.0MP3 Installer CD and upgrade to LLT 5.0MP3RP2 or higher.

5. Restart the node by following the steps in the "Re-starting VCS on the cluster node" section above.
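
The check described under CRC AND BYTE COUNT can also be scripted and run before the rpm -Uvh step. The following is a minimal sketch, assuming the patch file is in the current directory. The helper name verify_cksum_line is hypothetical and is not part of any Veritas tool; the expected values are the ones published in this README.

```shell
#!/bin/sh
# Sketch: compare cksum output against the values published in this README.
# verify_cksum_line is a hypothetical helper name.

EXPECTED_CRC=3980824752
EXPECTED_BYTES=4918722
PATCH_RPM=VRTSllt-5.0.30.22-MP3RP2HF2_SLES10.x86_64.rpm

# verify_cksum_line "CRC BYTES FILE" EXPECTED_CRC EXPECTED_BYTES
# Succeeds only if both the CRC and the byte count match.
verify_cksum_line() {
    crc=$(printf '%s\n' "$1" | awk '{print $1}')
    bytes=$(printf '%s\n' "$1" | awk '{print $2}')
    [ "$crc" = "$2" ] && [ "$bytes" = "$3" ]
}

if [ ! -f "$PATCH_RPM" ]; then
    echo "patch file not found: $PATCH_RPM" >&2
elif verify_cksum_line "$(cksum "$PATCH_RPM")" "$EXPECTED_CRC" "$EXPECTED_BYTES"; then
    echo "checksum and byte count match"
else
    echo "checksum or byte count MISMATCH: download the patch again" >&2
fi
```

Keeping the comparison in a function that takes the cksum output as a string makes it easy to reuse on each node without retyping the expected values.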