README VERSION               : 1.0
README Creation Date         : 2011-09-09
Patch-ID                     : 145456-03
Patch Name                   : VRTSvxfen 5.1SP1RP2-sol10-x86
BASE PACKAGE NAME            : Veritas I/O Fencing by Symantec
BASE PACKAGE VERSION         : VRTSvxfen 5.1SP1
Obsolete Patches             : NONE
Superseded Patches           : NONE
Required Patches             : NONE
Incompatible Patches         : NONE
Supported PADV               : sol10_x86
(P-Platform , A-Architecture , D-Distribution , V-Version)

Patch Category               : OTHER
Reboot Required              : NO

KNOWN ISSUES :

FIXED INCIDENTS:
----------------

Patch Id::145456-03

* Incident no::2276622  Tracking ID ::2252385

Symptom::The VxFEN (fencing) component of Veritas Cluster Server (VCS) fails to start using coordinator disks from certain disk arrays. Even if you configure multiple coordinator disks, the component displays the following error message:

"V-11-2-1003 At least three coordinator disks must be defined"

The vxfentsthdw utility does not report any problem with the coordinator disks. However, if you run the SCSI-extended inquiry command "vxfenadm -i " on the disks, it reports the same serial number for all the coordinator disks. In contrast, the VxVM component's utilities report different and unique serial numbers for the same coordinator disks.

Description::You may face this problem with some disk arrays that are running more recent firmware. On such disk arrays, the fencing component does not retrieve the correct serial number of a disk via the SCSI-extended inquiry command. The SCSI-extended inquiry from the fencing component is based on an old SPC-3 specification. According to the old specification, vendor-specific information on page 0x83 of a disk may contain multiple, but globally unique serial numbers associated with the addressed logical unit. The fencing component picks one serial number out of all available serial numbers for a disk.

On disk arrays with new firmware, the SCSI information on page 0x83 of a disk may contain several serial numbers associated with different entities.
These entities include:

o The addressed logical unit
o The SCSI target device that contains the addressed logical unit
o The target port that receives the request

The current issue occurs when the fencing component incorrectly picks up the serial number associated with the target port, which is the same for all the configured coordinator disks.

Resolution::Symantec has updated the fencing library code to pick the serial number associated with the addressed logical unit when running a SCSI-extended inquiry on a disk array. The fencing component now ignores the serial numbers associated with the other entities in the array.

* Incident no::2366201  Tracking ID ::2365394

Symptom::When a VCS node starts up, if even one of the coordination points for the Veritas fencing module (VxFEN) is inaccessible to the node, then VxFEN fails to start on that node. VxFEN should be able to start on the node as long as a majority of the coordination points are accessible to the node when the node starts up.

Description::When a node starts up, the following conditions must occur before VxFEN starts on that node:

o VxFEN must be able to get the Universally Unique Identifier (UUID) or serial number of each coordination point specified in the /etc/vxfenmode file.
o VxFEN must be able to register the node with a majority of the coordination points (CPs) specified in the /etc/vxfenmode file.

However, if VxFEN fails to get the UUID or serial number of one of the specified CPs because of accessibility issues, then VxFEN treats it as a fatal failure. The node cannot then join a cluster or start a cluster. As a result, every coordination point becomes a potential single point of failure, and compromises high availability (HA).

Resolution::Symantec has modified the fencing module to fix the issue. Each VCS node now stores the UUIDs or serial numbers of all the coordination points that the node registers with.
As a result, if a node is later unable to access a specified coordination point, VxFEN can use the stored UUIDs/serial numbers. By design, the fix works only when a majority of coordination points are accessible to the node when the node starts. At the time of a fencing race, the racer needs to have its keys registered on a majority of coordination points in order to be able to win the race. To ensure this, fencing is designed not to start if a majority of coordination points are not available at the time of startup. This fix applies only to clusters that use customized fencing.

As part of the fix, Symantec has introduced two optional attributes to the /etc/vxfenmode file.

db_ignore_list : Specifies the type/s of coordination points for which a node must not store the UUID/serial number. To specify multiple values, use a comma-separated list. VxFEN supports the values "none", "disk", and "server".

Note: By default, this feature is available only for coordination point servers. To turn it on for disks, you must set the value of the db_ignore_list to "none".

db_entries_limit: Specifies the maximum number of UUIDs/serial numbers that a node can store. The default value for this attribute is 1000. If the default value is used, the node requires approximately 1 MB of disk space to store the UUIDs/serial numbers.

* Incident no::2382335  Tracking ID ::2208802

Symptom::In a shared diskgroup that contains more than one disk, the 'vxfentsthdw -g ' command fails to map a shared disk correctly to the nodes that share it.

Description::When you run the 'vxfentsthdw -g ' command, the Veritas fencing module (VxFEN) uses the serial number of a shared disk to map that disk to the nodes that share it. To determine the serial number of the shared disk, the module runs queries in the /usr/bin/ksh shell on all nodes. On certain platforms, the ksh shell may not exist in the default path. As a result, the queries for the serial number may return a null result.
A null result will match with every other serial number queried in the cluster. As a result, the vxfentsthdw command incorrectly maps shared disks.

Resolution::Symantec has replaced the hardcoded path for the /usr/bin/ksh shell to make it appropriate for all operating systems.

* Incident no::2382384  Tracking ID ::2364959

Symptom::
1. After 5.1SP1 is installed without being configured, when the node is rebooted, LLT, GAB, FENCING and AMF go into the maintenance state.
2. After upgrade to 5.1SP1, local zones SMF dependencies complain about missing services and "svcs -x" is not clean.

Description::Install of 5.1SP1 or upgrade from 5.0 in a global zone only temporarily disables llt/gab/fencing/amf services. So after reboot they are re-enabled and SMF dependencies fail.

Resolution::Changed the LLT, GAB, FENCING and AMF postinstall scripts to permanently disable services. Added a check not to create SMF service for non-global zones.

* Incident no::2382460  Tracking ID ::2209661

Symptom::If you configure the Veritas fencing module (VxFEN) in one of the following two ways, then you may not be able to distinguish certain important messages in the log file.

o /etc/vxfenmode file contains 3 or more coordination points with single_cp=1
o /etc/vxfenmode file contains 1 disk as a coordination point with single_cp=1

Description::For the above configurations, certain messages are not appropriately highlighted in the following log file:

/var/VRTSvcs/log/vxfen/vxfend_A.log

For the first configuration, the log file contains the following important message:

Ignoring the single_cp attribute

For the second configuration, the log file contains the following important message:

Option single_cp is enabled. With this option, Symantec recommends the usage of a Co-ordination Point server protected via SFHA and reachable from each node of this client cluster via multiple completely redundant networks. VxFen detected a single disk.
Resolution::Symantec has updated the vxfend code to highlight the above messages with the word WARNING and proper formatting:

For the first configuration, the log file displays the following message:

*** WARNING: Ignoring the single_cp attribute
*** as 3 coordination points
*** have been specified.

For the second configuration, the log file displays the following message:

*** WARNING: Option single_cp is enabled. With this
*** option, Symantec recommends the usage of a
*** coordination point server protected via SFHA and
*** reachable from each node of this client cluster
*** via multiple completely redundant networks. VxFen
*** detected a single disk.

* Incident no::2382559  Tracking ID ::2208792

Symptom::In a cluster where the Veritas fencing module (VxFEN) is running, the vxfenswap utility fails with the following error:

I/O fencing does not appear to be configured on node

Description::When you run the vxfenswap utility, it checks for the VxFEN status on the local node. To determine the status, the utility runs a query in the /usr/bin/ksh shell. On certain platforms, the query may return a null result as the ksh shell may not exist in the default path. The utility therefore concludes that VxFEN is not configured.

Resolution::Symantec has replaced the hardcoded path for the /usr/bin/ksh shell to make it appropriate for all operating systems.

* Incident no::2386326  Tracking ID ::2375203

Symptom::The Veritas fencing module (VxFEN) fails to start and displays the following error message:

ERROR V-11-2-1003 At least three coordinator disks must be defined

If you run the 'vxfenadm -i ' command, the output indicates the same serial number for different disks. If you run the './etc/vx/diag.d/vxdmpinq -e 1 -p 131 ' command to determine the raw data size, the output indicates that the raw data size is greater than 96.

Description::The fencing module runs a SCSI3 query on a disk to determine its serial number.
The buffer size for the query is 96 KB whereas the size of the output is much larger. Therefore, the serial number of the disk is truncated, and appears to be the same for all disks.

Resolution::Symantec has updated the vxfenadm utility to use a larger buffer size for its SCSI3 queries.

* Incident no::2394176  Tracking ID ::2350983

Symptom::If you run the vxfenswap utility on a multinode VCS cluster, then after some time, the vxfenswap operation stalls and no output appears on the console. However, the console does not freeze (the system does not hang). If you run the 'ps -ef | grep vxfen' command on every node, the output indicates that the 'vxfenconfig -o modify' process is running on some nodes, but is not running on at least one node.

Description::The vxfenswap utility executes the 'vxfenconfig -o modify' command on each node using ssh or rsh. The utility performs the other tasks related to online replacement of coordination points (OCPR) via broadcast messages among cluster nodes. If the utility is unable to fork the 'vxfenconfig -o modify' command on one of the nodes, the vxfenconfig instances on the other nodes cannot proceed further. This may occur due to intermittent failures in ssh/rsh communication or in network connectivity between the cluster nodes.

The vxfenswap utility runs the ' ssh vxfenconfig -o modify &' command on each node. As a result, the ssh/rsh process runs as a background job, and the vxfenswap utility cannot capture the state of the background jobs. This is the root cause of the symptom.

As a workaround, you can run the 'vxfenswap -a cancel' command from the console or one of the other nodes. The stalled vxfenswap process resumes, and VxFEN continues to use the old coordination points. However, this workaround is not comprehensive.

Resolution::Symantec has modified the vxfenswap utility to track the processes related to the 'ssh vxfenconfig -o modify &' command for their exit status.
If any process fails, vxfenswap sends a message to the console stating:

Failed to validate the new set of coordination points

The vxfenswap utility then rolls back the entire OCPR operation to bring the cluster to its normal state. You therefore need not manually run the 'vxfenswap -a cancel' command.

* Incident no::2400485  Tracking ID ::2400473

Symptom::When the Veritas fencing module (VxFEN) starts, it may encounter issues in reading coordination point information. VxFEN may display an error such as:

V-11-2-1036 Unable to configure VxFEN since daemon failed to talk to the driver.

The user mode process that processes coordination point information may abort. If you then try to configure the driver in another fencing mode, VxFEN displays the following error:

V-11-2-1050 Mismatched modes on local node and running cluster. Unable to configure vxfen

Description::The problem occurs because the variable that tracks the fencing mode is incorrectly passed from the user land configuration files to the VxFEN kernel driver.

Resolution::Symantec has modified the VxFEN kernel driver to properly reset the fencing mode variables when it receives an error from the user land.

* Incident no::2426663  Tracking ID ::2426659

Symptom::If you run the 'vxfenswap' command to change the fencing mode from 'customized' to 'scsi3', the vxfend process continues to run.

Description::The kernel driver of the Veritas fencing module (VxFEN) starts the vxfend process only when VxFEN is configured in a customized mode. However, when you change the fencing mode to 'scsi3', the VxFEN kernel driver fails to terminate the vxfend process.

Resolution::Symantec has modified the VxFEN kernel driver to appropriately terminate the vxfend process.

* Incident no::2438261  Tracking ID ::2482167

Symptom::The vxfenswap utility fails to change the interaction policy for coordinator disks from SCSI3 raw to SCSI3 dmp by using the '/etc/vxfenmode.test' file.
However, you can use the vxfenswap utility to change the policy from SCSI3 dmp to SCSI3 raw.

Description::Even when the '/etc/vxfenmode.test' file exists, vxfenswap reads /etc/vxfenmode first, and then reads /etc/vxfenmode.test. The utility must read only the '/etc/vxfenmode.test' file when it is available.

Resolution::Symantec has modified vxfenswap to read only /etc/vxfenmode.test when it is available. When the test file does not exist, vxfenswap reads /etc/vxfenmode.

Incidents from old Patches:
---------------------------
NONE
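
Illustration for incident 2438261:
The file-selection rule stated in that resolution (read only the .test file when it exists, otherwise fall back to /etc/vxfenmode) can be sketched in shell as follows. This is a minimal sketch only and is not taken from the vxfenswap source; the function name select_vxfenmode_file and its optional directory argument are hypothetical and exist purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of vxfenswap): print the single fencing
# mode file that should be read, following the rule in incident 2438261.
# $1 is an optional directory, assumed here so the rule can be tried
# outside /etc; it defaults to /etc.
select_vxfenmode_file() {
    _dir=${1:-/etc}
    if [ -f "$_dir/vxfenmode.test" ]; then
        # The test file exists: use it exclusively.
        echo "$_dir/vxfenmode.test"
    else
        # No test file: fall back to the regular mode file.
        echo "$_dir/vxfenmode"
    fi
}
```

Calling select_vxfenmode_file with no argument prints /etc/vxfenmode.test if that file exists and /etc/vxfenmode otherwise. Choosing exactly one file up front avoids the pre-fix behavior, in which settings read from /etc/vxfenmode could be mixed with those from the test file.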