README VERSION               : 1.1
README CREATION DATE         : 2013-09-30
PATCH-ID                     : PVCO_03991
PATCH NAME                   : VRTSvcs 5.1.103.000
BASE PACKAGE NAME            : VRTSvcs
BASE PACKAGE VERSION         : 5.1.100.000
SUPERSEDED PATCHES           : NONE
REQUIRED PATCHES             : NONE
INCOMPATIBLE PATCHES         : NONE
SUPPORTED PADV               : hpux1131
(P-PLATFORM , A-ARCHITECTURE , D-DISTRIBUTION , V-VERSION)
PATCH CATEGORY               : CORE , CORRUPTION
PATCH CRITICALITY            : CRITICAL
HAS KERNEL COMPONENT         : NO
ID                           : NONE
REBOOT REQUIRED              : NO
REQUIRE APPLICATION DOWNTIME : Yes

PATCH INSTALLATION INSTRUCTIONS:
--------------------------------
Please refer to the Release Notes for installation instructions.

PATCH UNINSTALLATION INSTRUCTIONS:
----------------------------------
Please refer to the Release Notes for uninstallation instructions.

SPECIAL INSTRUCTIONS:
---------------------
NONE

SUMMARY OF FIXED ISSUES:
-----------------------------------------
PATCH ID:PVCO_03991

2330038 (2279845) Veritas Cluster Server (VCS) does not restart the application (configured in a parent service group) running inside the container (configured in a child service group) after the container recovers from a fault.
2358600 (2650264) The command "hares -display " fails if a resource is part of a global service group.
2415578 (2423680) The Veritas Cluster Server (VCS) commands do not work when the VCS object (i.e. Group, Resource, or Cluster) name is G, A, O, E, S, or C.
2486415 (2486413) Global Atomic Broadcast (GAB) errors are observed in the engine log while running a single-node, standalone Veritas Cluster Server (VCS) cluster where GAB is disabled.
3139423 (2210717) When a non-critical resource of a service group faults, the service group remains in the STARTING|PARTIAL state.
3139432 (2987868) When a resource faults, a service group does not fail over as the TargetCount becomes less than the CurrentCount.
3140118 (3042450) A parent service group which is frozen and configured with online local hard dependency is brought offline when its child service group faults.
3140124 (3079893) The value of the LastSuccess attribute of the service group equals the GlobalCounter value of the cluster if a resource faults while you bring the service group online. Hence, the service group fails to come online.
3140132 (3090710) The High Availability Daemon (HAD) starts and stops before the VxFEN driver configuration completes.
3140523 (3137377) High Availability Daemon (HAD) dumps core due to a memory allocation failure.
3140631 (2736627) The remote cluster remains in INIT state, and the Internet Control Message Protocol (ICMP) heartbeat status is UNKNOWN.
3140789 (2556350) Veritas Cluster Server (VCS) generates core when the command "hagrp -clear" is executed on a group in "OFFLINE|FAULTED" state.
3207674 (3207663) Incorrect user privileges are set in case of incorrect use of the '-group' option in the command "hauser -addpriv".
3216554 (3266168) During Veritas Cluster Server (VCS) patch upgrade, the file "/opt/VRTSvcs/bin/vcsenv" is overwritten.

SUMMARY OF KNOWN ISSUES:
-----------------------------------------
NONE

KNOWN ISSUES :
--------------
NONE

FIXED INCIDENTS:
----------------
PATCH ID:PVCO_03991

* INCIDENT NO:2330038   TRACKING ID:2279845
SYMPTOM: Veritas Cluster Server (VCS) does not restart the application (configured in a parent service group) running inside the container (configured in a child service group) after the container recovers from a fault.
DESCRIPTION: Consider a scenario in which an application runs inside a container (Zone on Solaris, WPAR on AIX) and the child service group's OnlineRetryLimit attribute is set to a value greater than 0. If the container faults but VCS detects the fault in the application before it detects the container fault, the parent group is marked as faulted. VCS then restarts the child group, but after the child group comes online, VCS cannot bring the parent group online because the parent group is still marked as faulted. This scenario typically occurs when the SystemList of the child and parent groups contains a single system.
RESOLUTION: You can set OnlineClearParent = 1 for the child group. Then, when the child group is detected online (for instance, when the engine restarts the child group), the engine clears the fault on the parent group, which enables the parent group to go online.

* INCIDENT NO:2358600   TRACKING ID:2650264
SYMPTOM: The command "hares -display " fails with the following message:
VCS WARNING V-16-1-40139 No information found for resource ??? in cluster ???
DESCRIPTION: If a resource is part of a global service group, the command "hares -display " fails to process the resource attributes correctly and displays a warning message.
RESOLUTION: The code has been modified to ensure that the command "hares -display " can correctly process the resource attributes.

* INCIDENT NO:2415578   TRACKING ID:2423680
SYMPTOM: The Veritas Cluster Server (VCS) commands do not work when the VCS object (i.e. Group, Resource, or Cluster) name is G, A, O, E, S, or C.
DESCRIPTION: The characters G, A, O, E, S, and C have special meaning in VCS internal messages. If the object name is set to one of these characters, VCS commands for that object do not work due to a problem in processing the internal data structures.
RESOLUTION: The library used by the VCS command line has been fixed, so VCS commands produce correct output even when the VCS object name is G, A, O, E, S, or C.

* INCIDENT NO:2486415   TRACKING ID:2486413
SYMPTOM: Global Atomic Broadcast (GAB) displays the following error in the engine log while running a single-node, standalone Veritas Cluster Server (VCS) cluster where GAB is disabled:
"Excessive delay between successive calls to GAB heartbeat"
DESCRIPTION: GAB heartbeat messages are logged as informational messages when there is a delay between heartbeats (HAD being stuck). When the High Availability Daemon (HAD) runs in '-onenode' mode, GAB need not be enabled.
When HAD runs in '-onenode' mode, it simulates the heartbeat with one of its own internal components for self-check purposes. The log messages appear because of delays in these simulated heartbeats.
RESOLUTION: The log messages are for informational purposes only. The code has been modified so that when HAD is running in '-onenode' mode, no action is taken on an excessive delay between the heartbeats.

* INCIDENT NO:3139423   TRACKING ID:2210717
SYMPTOM: When a non-critical resource of a service group faults, the service group remains in the STARTING|PARTIAL state.
DESCRIPTION: If a non-critical resource in a service group faults while the service group is going online, the service group state remains STARTING|PARTIAL. As a result, you cannot fail over the service group because the group's state is in transition.
RESOLUTION: The code is modified so that if a non-critical resource faults while the service group is going online, the group's state is calculated. This ensures that if this is the last resource waiting to go online, the STARTING flag is removed from the service group state.

* INCIDENT NO:3139432   TRACKING ID:2987868
SYMPTOM: When a resource faults, a service group does not fail over as the TargetCount becomes less than the CurrentCount.
DESCRIPTION: When a frozen failover service group that is online on one of the nodes in the cluster comes online on another node, it leads to a concurrency violation. The concurrency violation trigger cannot bring the service group offline because it is frozen, so the failover service group is online on both nodes. When the service group faults on one of the nodes, its TargetCount is decreased to 0. When the service group then faults on the other node and is no longer frozen, the MigrateQ for the group gets populated, but because the TargetCount is 0, the failover does not happen.
RESOLUTION: The code is modified so that the TargetCount is incremented, if required, whenever a service group is expected to fail over.

* INCIDENT NO:3140118   TRACKING ID:3042450
SYMPTOM: A parent service group which is frozen and configured with online local hard dependency is brought offline when its child service group faults.
DESCRIPTION: When a service group is frozen in Veritas Cluster Server (VCS), all online and offline procedures on it are stopped. However, when you configure two service groups in an online local hard dependency and the child service group faults, the parent service group is brought offline.
RESOLUTION: The code is modified to check whether the parent service group is frozen before bringing it offline when its child service group faults. If the parent service group is detected as frozen, it remains unaffected.

* INCIDENT NO:3140124   TRACKING ID:3079893
SYMPTOM: If a resource in a service group faults while the service group is being brought online, and the OnlineRetryLimit and OnlineRetryInterval attributes of the service group are set to non-zero values, Veritas Cluster Server (VCS) fails to bring the service group online.
DESCRIPTION: When you try to bring a service group online, if a resource faults, the LastSuccess attribute for the service group gets set to the current value of the GlobalCounter of the cluster. Hence, the attempt to bring the service group online fails because VCS treats the fault as having occurred within the OnlineRetryInterval.
RESOLUTION: The code has been modified to fix the High Availability Daemon (HAD) binary. Additional checks have been added so that the LastSuccess attribute does not get set during a resource fault.

* INCIDENT NO:3140132   TRACKING ID:3090710
SYMPTOM: The High Availability Daemon (HAD) starts and stops before the VxFEN driver configuration completes.
DESCRIPTION: When you configure Veritas Cluster Server (VCS) with the cluster-level attribute UseFence set to SCSI3, HAD starts after the VCS Fencing module (VxFEN) at system startup. When HAD starts, it checks whether VxFEN is configured. If HAD fails to detect the VxFEN configuration within 180 seconds, it stops. In some scenarios the VxFEN configuration is delayed beyond 180 seconds, and hence HAD stops before the VxFEN configuration is complete.
RESOLUTION: The code has been modified such that the HAD binary waits indefinitely for the Veritas Fencing (VxFEN) driver configuration to complete.

* INCIDENT NO:3140523   TRACKING ID:3137377
SYMPTOM: High Availability Daemon (HAD) dumps core due to a memory allocation failure.
DESCRIPTION: The HP-UX operating system provides executables with various address space layouts. The layout restricts the amount of memory that the process can allocate. You can change the address space layout by using appropriate compile-time options. The HAD binary is built as a normal executable, which allows only 1 GB of private address space. When the memory consumption increases beyond this limit, the memory allocation in HAD fails and HAD dumps core.
RESOLUTION: The code has been modified to change the address space layout of the HAD binary. It is now built as a Mostly Private Address Space (MPAS) binary. This increases the maximum amount of memory that the process can allocate.

* INCIDENT NO:3140631   TRACKING ID:2736627
SYMPTOM: If IPv6 is disabled on the cluster, the remote cluster remains in INIT state, and the ICMP heartbeat status remains UNKNOWN.
DESCRIPTION: When the ICMP agent opens a connection with the Wide Area Connector (WAC), it first tries IPv6. If the IPv6 connection fails, the ICMP agent checks whether the error is specific to the protocol. If it is, the agent then tries IPv4. But for the error EADDRNOTAVAIL (i.e. address not available), the ICMP agent does not try IPv4.
RESOLUTION: The code has been modified so that the ICMP agent retries the connection with IPv4 if the error is EADDRNOTAVAIL.

* INCIDENT NO:3140789   TRACKING ID:2556350
SYMPTOM: The VCS High Availability Daemon (HAD) dumps core when the command "hagrp -clear" is executed on a group in "OFFLINE|FAULTED" state.
DESCRIPTION: Assume that the resources in a service group are configured with a resource dependency. After the child resource comes online, the online operation for the parent resource is initiated. If the child resource faults during this time, the group enters the OFFLINE|FAULTED state while the parent resource is still waiting to go online. If you now run the command "hagrp -clear" on any node, VCS dumps core on all nodes due to an assert condition.
RESOLUTION: The code has been modified so that the clear operation is not allowed if the group has been marked for failover.

* INCIDENT NO:3207674   TRACKING ID:3207663
SYMPTOM: When you run the command "hauser -addpriv" to set user privileges for a group, if you use the '-group' option incorrectly and leave out the dash (-) in this option, incorrect privileges are set for the group.
DESCRIPTION: You can set group privileges with the command "hauser -addpriv -group ". If you do not provide the dash (-) in the "-group" option, Veritas Cluster Server (VCS) does not detect this error. Instead, it sets Cluster Administrator privileges for that user.
RESOLUTION: An enhancement is made to the hauser(1M) binary so that proper error checks are added for the command line options.

* INCIDENT NO:3216554   TRACKING ID:3266168
SYMPTOM: During VCS patch upgrade, the file "/opt/VRTSvcs/bin/vcsenv" is overwritten.
DESCRIPTION: VCS environment variables can be defined in the file "/opt/VRTSvcs/bin/vcsenv". If you make changes to this file and then upgrade the VCS package, the file is overwritten and the changes are lost.
RESOLUTION: An enhancement is made to back up the file as "/opt/VRTSvcs/bin/vcsenv.previous" during the preinstall phase of the patch. You can then merge your changes from the backup into the file "/opt/VRTSvcs/bin/vcsenv".

INCIDENTS FROM OLD PATCHES:
---------------------------
NONE
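
NOTE ON INCIDENT 3216554 (ILLUSTRATIVE ONLY):
---------------------------------------------
The README does not describe how to carry out the merge of "/opt/VRTSvcs/bin/vcsenv.previous" into "/opt/VRTSvcs/bin/vcsenv". One possible way to find the candidate lines is sketched below. It assumes a Python 3 interpreter is available on the host, which the patch does not require or provide. The helper name lines_only_in_backup is hypothetical and is not part of VCS. The sketch only lists lines that exist in the backup but not in the newly installed file; deciding which of them to copy back remains a manual step.

```python
import difflib
from pathlib import Path

# File locations as named in this README.
BACKUP = Path("/opt/VRTSvcs/bin/vcsenv.previous")
CURRENT = Path("/opt/VRTSvcs/bin/vcsenv")


def lines_only_in_backup(backup_text, current_text):
    """Return the lines present in the backup text but absent from the current text.

    Hypothetical helper for illustration; it is not a VCS utility.
    """
    backup_lines = backup_text.splitlines()
    current_lines = current_text.splitlines()
    # ndiff marks lines that occur only in the first sequence with a "- " prefix.
    diff = difflib.ndiff(backup_lines, current_lines)
    return [line[2:] for line in diff if line.startswith("- ")]


if __name__ == "__main__":
    if BACKUP.exists() and CURRENT.exists():
        # Print each customization that the upgrade dropped from vcsenv.
        for line in lines_only_in_backup(BACKUP.read_text(), CURRENT.read_text()):
            print(line)
    else:
        print("vcsenv or vcsenv.previous not found; nothing to compare")
```

Review each reported line before adding it to "/opt/VRTSvcs/bin/vcsenv", because the new file may intentionally differ from the old one.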