llttab - LLT configuration file
The /etc/llttab file contains information used by the lltconfig command to configure the LLT protocol. The system administrator is responsible for maintaining this file correctly and consistently across all systems in the cluster.
A pound sign (#) in the first column of a line means that the line is a comment. Blank lines are ignored. A minimal configuration must set the systemid and specify a network link.
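The comment and minimal-configuration rules above can be checked mechanically. The following is a minimal Python sketch, not part of LLT: the function names parse_llttab and check_minimal are hypothetical, fields are assumed to be separated by whitespace (as in the examples in this page), and link-lowpri is assumed to count as a network link.

```python
def parse_llttab(text):
    """Return a list of (command, args) tuples from llttab-style text.

    Assumption: fields are separated by whitespace. Lines with '#' in the
    first column are comments; blank lines are ignored.
    """
    entries = []
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        entries.append((fields[0], fields[1:]))
    return entries


def check_minimal(entries):
    """Return a list of problems found against the minimal-configuration rules."""
    commands = [command for command, _ in entries]
    problems = []
    if "set-node" not in commands:
        problems.append("missing set-node")
    # Assumption: both link and link-lowpri count as network links.
    links = [c for c in commands if c in ("link", "link-lowpri")]
    if not links:
        problems.append("no network link specified")
    if len(links) > 8:
        # LLT supports a maximum of 8 links.
        problems.append("more than 8 links")
    return problems
```

Passing the text of a candidate file through parse_llttab and then check_minimal returns an empty list when the file at least sets the systemid and names one link.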
The /etc/llttab file supports the following commands:
set-node systemid
Set the systemid. This is a required command. systemid may be an integer in the valid range of systemids, or it may be a symbolic name (which is translated via /etc/llthosts to a systemid). systemid may also be a filename beginning with a slash (/); the first word from the file is used as a symbolic name and translated via /etc/llthosts to a systemid. Systemids must be unique within a cluster. If LLT detects a configuration in which another system is using the same systemid, it disables the protocol until the system is rebooted.
set-cluster clusterid
Set the clusterid to a number between 0 and 65535. This command is needed only if more than one cluster is sharing network hardware being used by LLT. In this case, each cluster needs its own clusterid (or, alternatively, a unique SAP), so that the clusters' traffic does not interfere with each other. Machines with different clusterids cannot communicate with each other. The default clusterid is 0.
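The three forms accepted by set-node (integer, symbolic name, or filename beginning with a slash) can be illustrated with a short Python sketch. This is an illustration only, not how lltconfig is implemented: resolve_systemid is a hypothetical name, and the hosts dictionary stands in for the name-to-systemid translation performed via /etc/llthosts, whose file format is not described here.

```python
def resolve_systemid(value, hosts, read_first_word=None):
    """Resolve a set-node argument to a numeric systemid.

    value: the set-node argument as written in the file.
    hosts: dict mapping symbolic name -> systemid (a stand-in for the
           translation done via /etc/llthosts).
    read_first_word: optional callable(path) -> first word of that file;
           when omitted, the file is read from disk.
    """
    if value.startswith("/"):
        # Filename form: the first word of the file is a symbolic name.
        if read_first_word is None:
            with open(value) as handle:
                name = handle.read().split()[0]
        else:
            name = read_first_word(value)
        return hosts[name]
    if value.isdigit():
        # Integer form: used directly as the systemid.
        return int(value)
    # Symbolic-name form: translated through the hosts table.
    return hosts[value]
```

An unknown symbolic name raises KeyError in this sketch; range checking of integer systemids is left out because the valid range depends on the kernel configuration.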
LLT supports a maximum of 8 links. A link can be specified using either of the following two link commands:
link tag device_name systemid_range link_type sap mtu_size [ip_address broadcast_address | multicast_address] [hidden]
Configure a network interface link below LLT and give it a tag for use in subsequent commands and for reference by lltstat. The supported values for the link_type are ether, udp and udp6. On selected versions of Linux, link type rdma is also supported. At least one link command is required for LLT to operate. The link is bound to the sap and is used to send heartbeats and data to other nodes.
The device_name is the network device path. For link type ether, it is followed by a colon (:) and an integer which specifies the unit or PPA used by LLT to attach. On Linux, the device_name can be given in two ways: as an interface name (for example, eth0 or eth1), or, preferably, as eth-<macaddress> (for example, eth-xx:xx:xx:xx:xx:xx).
For link types udp and udp6, the device_name is the udp or udp6 device path, respectively. On Linux, the device_name can be specified as udp or udp6. For link type rdma, the device_name is udp.
systemid_range represents the range of systems for which the command is valid. If the link command is valid for all systems, specify a dash (-).
sap specifies the SAP to bind on the network links. mtu_size specifies the maximum transmission unit to use for sending packets on network links. Packets with size greater than this number are appropriately fragmented by LLT before transmission. The defaults for sap or mtu_size are specified by a dash (-).
The ip_address and broadcast_address can be specified for link types udp and rdma only. Specify a dash (-) to use the default broadcast_address, which is calculated from the traditional canonical class of the address.
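As an illustration of a class-based default, the Python sketch below derives a broadcast address from the traditional class A, B, and C boundaries. It is one reading of "traditional canonical class", not LLT's actual code, and the function name classful_broadcast is hypothetical.

```python
import ipaddress


def classful_broadcast(ip):
    """Return the broadcast address of the classful network containing ip."""
    addr = ipaddress.IPv4Address(ip)
    first_octet = int(addr) >> 24
    if first_octet < 128:
        prefix = 8    # class A
    elif first_octet < 192:
        prefix = 16   # class B
    elif first_octet < 224:
        prefix = 24   # class C
    else:
        raise ValueError("class D and E addresses have no classful broadcast")
    # strict=False lets a host address be used to name its network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.broadcast_address)
```

On a subnetted private network the class-based value may differ from the real subnet broadcast address, in which case the broadcast_address would need to be given explicitly instead of as a dash.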
The ip_address and multicast_address can be specified for link type udp6 only. The multicast_address should always be specified as a dash (-).
By specifying the hidden keyword you can configure a link as "hidden". A "hidden" link is not visible to the LLT clients and is not considered in the cluster membership computation. It is used only by LLT for exchanging some LLT-specific information with peer nodes. This type of link is needed to enable faster detection of LLT link failures.
link-lowpri tag device_name systemid_range link_type sap mtu_size [ip_address broadcast_address | multicast_address] [hidden]
Same as the link command above. The only difference is that it configures a link as low priority. A low priority link is only used to send heartbeat traffic most of the time. It is used for transmitting regular data only as a last resort, when all of the non-lowpri links are down. If more than one cluster is sharing the same network using a lowpri link (usually the public network) then it is extremely important for each cluster to have a unique clusterid. Also, if there is more than one lowpri link configured on a system and sharing the same network, then each must use a unique SAP or else LLT will generate messages referring to lost heartbeats on the console, since links are normally required to be completely isolated.
set-addr systemid tag address
Manually set the address of a peer node (identified by systemid) as seen on a particular link (identified by tag) of the local node. For link type ether, the address is the corresponding MAC address; for link types udp or udp6 it is the corresponding IP or IPv6 address. Use this command when LLT is using links that do not support broadcast (multicast if the link type is udp6), and thus do not support its automatic address discovery. In such a configuration, this command should be used on each node, for each link of each peer node. This command must follow the link command, which defines its tag.
set-verbose 0 | 1
Set the verbose mode of lltconfig, which causes informational messages to be displayed while it processes commands. The default is 0.
set-warn warning_level
Set warning mode, which causes warning messages to be sent to the /var/adm/messages file. The default is 0.
include systemid_range
Set a systemid_range of systemids valid for participation in the cluster. This command alters the limits of systemids that applications may use, to prevent attempts to communicate with non-existent systems. The systemid_range is specified as two integers separated by a dash (-). The default is to include 0-nn, where nn is the highest supported systemid as determined by the kernel configuration.
exclude systemid_range
Set a systemid_range of systemids not valid for participation in the cluster. This command alters the limits of systemids that applications may use, to prevent attempts to communicate with non-existent systems. The systemid_range is specified as two integers separated by a dash (-). The default is to include 0-nn, where nn is the highest supported systemid as determined by the kernel configuration.
set-arp 0 | 1
Set the LLT-ARP mode. If it is set, LLT automatically discovers the addresses of other nodes in the cluster, so that they do not have to be set manually with the set-addr command. This option only has meaning when set-bcasthb is 0. The default is 0.
set-bcasthb 0 | 1
Set the broadcast heartbeat mode. If it is set, LLT uses the broadcast capabilities (multicast capabilities for the link type udp6) of the underlying network links to send heartbeats and LLT-ARP discovery packets to other systems, instead of sending them to each system individually. This option is valid only when all underlying network links support broadcast (multicast if the link type is udp6). The default is 1.
set-timer timer:value
Set the values of the protocol timers. value is specified in 1/100ths of a second. The timer field can be one of the following:
heartbeat: Send heartbeat packets repeatedly to peer nodes after every heartbeat timer interval. The default value is 50.
peertrouble: Mark a link of a peer node as "troubled" if LLT does not receive any packet on that link for this timer interval. Once a link is marked as "troubled", LLT will not send any data on that link till there is at least one active link available. The default value is 200.
peerinact: Mark a link of a peer node as "inactive" if LLT does not receive any packet on that link for this timer interval. Once a link is marked as "inactive", LLT will not send any data on that link. The default value is 3200 on AIX and 1600 on other platforms. This timer value should always be greater than the peertrouble timer value.
rpeerinact: Mark the RDMA channel of an RDMA link as "inactive" if the node does not receive any packet on that link for this timer interval. Once the RDMA channel is marked as "inactive", LLT does not send any data on the RDMA channel of that link; however, it may continue to send data over the non-RDMA channel of that link until peerinact expires. You can view the status of the RDMA channel of an RDMA link using the lltstat -nvv -r command. The default value of this attribute is 700. This timer value should always be greater than the peertrouble timer value and less than the peerinact value. This attribute is supported only on selected versions of Linux.
oos: If received packets remain out-of-sequence for this timer interval, send a NAK to the sender. The default value is 10.
retrans: Retransmit a packet if LLT does not receive its acknowledgement within this timer interval. The default value is 10.
service: Call LLT service routine (which delivers messages to LLT clients) after every service timer interval. The default value is 100.
arp: Expire stored MAC addresses of peer nodes after this timer interval. The default value is 30000.
linkstable: This option specifies the amount of time to wait before processing the link-down event for any link of the local node. LLT receives link-down events from the operating system when the LLT "faster detection of link failure" feature is enabled. The default value is 200.
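Because timer values are given in 1/100ths of a second and some timers are constrained relative to others, a small check can help before editing set-timer lines. The Python sketch below uses the defaults listed above for platforms other than AIX; the names DEFAULT_TIMERS, to_seconds, and check_timers are hypothetical and are not part of LLT.

```python
# Defaults from the timer list above, in 1/100ths of a second.
DEFAULT_TIMERS = {
    "heartbeat": 50,
    "peertrouble": 200,
    "peerinact": 1600,    # 3200 on AIX
    "rpeerinact": 700,    # selected versions of Linux only
    "oos": 10,
    "retrans": 10,
    "service": 100,
    "arp": 30000,
    "linkstable": 200,
}


def to_seconds(value):
    """Convert a timer value in 1/100ths of a second to seconds."""
    return value / 100.0


def check_timers(overrides):
    """Apply overrides to the defaults and return a list of violated constraints."""
    timers = dict(DEFAULT_TIMERS)
    timers.update(overrides)
    problems = []
    if timers["peerinact"] <= timers["peertrouble"]:
        problems.append("peerinact must be greater than peertrouble")
    if not timers["peertrouble"] < timers["rpeerinact"] < timers["peerinact"]:
        problems.append("rpeerinact must be between peertrouble and peerinact")
    return problems
```

For example, the default heartbeat of 50 corresponds to 0.5 seconds, and the default peerinact of 1600 corresponds to 16 seconds.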
set-flow parameter:value
Set the values of the flow control limits. The value is specified in number of packets. The parameter field can be one of the following:
highwater: This is the maximum number of packets LLT will transfer without flow controlling the client. When the number of packets in the per-node transmit queue reaches highwater, LLT stops transmitting packets to that node by flow controlling the clients.
lowwater: When LLT has flow controlled the client, it will not start transmission again until the number of packets in the transmit queue drops to lowwater.
rporthighwater: When a port's receive queue has rporthighwater packets in it, LLT stops accepting any more packets on that port. These packets are dropped and are retransmitted by the sender.
rportlowwater: Once a port's receive queue becomes full, LLT waits until the number of packets in it drops to rportlowwater. At this point, LLT again starts accepting packets on that port.
window: This is the maximum number of un-ACKed packets LLT will put in flight.
ackval: LLT sends acknowledgement of a packet by piggybacking an ACK packet on the next outbound data packet to the sender node. If there are no data packets on which to piggyback the ACK packet, LLT waits for ackval number of packets before sending an explicit ACK to the sender.
sws: This is the number of packets to queue (during transmit) to avoid silly window syndrome.
linkburst: LLT sends packets on all the configured links in round-robin manner. linkburst is the number of back-to-back packets sent on a link before the next link is chosen.
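The highwater and lowwater limits together describe a hysteresis: transmission stops at one threshold and resumes only at a lower one. The Python sketch below models that behavior for a single per-node transmit queue. It is a simplified illustration, not LLT's implementation; the class name TransmitQueueGate and the threshold values used with it are hypothetical.

```python
class TransmitQueueGate:
    """Toy model of highwater/lowwater flow control on one transmit queue."""

    def __init__(self, highwater, lowwater):
        self.highwater = highwater
        self.lowwater = lowwater
        self.queued = 0
        self.flow_controlled = False

    def enqueue(self):
        """Try to queue one packet; return False while the client is flow controlled."""
        if self.flow_controlled:
            return False
        self.queued += 1
        if self.queued >= self.highwater:
            # Queue reached highwater: start flow controlling the client.
            self.flow_controlled = True
        return True

    def transmitted(self):
        """Record that one queued packet has left the queue."""
        if self.queued > 0:
            self.queued -= 1
        if self.flow_controlled and self.queued <= self.lowwater:
            # Queue drained to lowwater: let the client send again.
            self.flow_controlled = False
```

With a gate created as TransmitQueueGate(4, 2), the fourth queued packet triggers flow control, and the client is released only after the queue has drained to two packets. The gap between the two thresholds keeps the gate from toggling on every packet.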
set-nofastpath 0 | 1
Disable fastpath mode. If set to 1, LLT does not use the "fastpath" interface to NICs even if it exists. This must precede any link or link-lowpri commands. The default is 0; that is, LLT attempts to use the "fastpath" interface if it exists.
set-checksum 0 | 1 | 10 | 2 | 20
Set checksum mode. When set to 1, LLT checksums each packet it sends to a peer to guard against on-the-wire packet corruption. LLT will also offload checksum calculation to hardware if the underlying NIC supports it. If checksum verification fails on the receiver for a packet, LLT drops that packet, causing the sender to retransmit it.
Setting to 10 is the same as setting to 1, except that LLT will strictly do checksums in software and will NOT offload checksumming to the NIC even if it is capable of doing so.
When set to 2, LLT also checksums the whole data buffer submitted by the client, to be verified by the peer before delivering it to the peer client. If the checksum verification fails on the receiver, LLT will panic the machine. This is done purposely to help in the analysis of memory corruption from a crash dump.
Setting to 20 is the same as setting to 2, except that LLT will strictly do checksums in software and will NOT offload checksumming to the NIC even if it is capable of doing so.
Level 2 and level 20 checksums should be used only when diagnosing memory corruption under the advisement of the support center, since they can panic the machine.
The default is 10 (checksums on) because LLT is a reliable transport and must guarantee packet accuracy. Level 10 checksums may be disabled if the private network is known to be reliable. Disabling them may improve performance somewhat, because the CPU cycles needed to compute the checksum are saved.
Currently checksum offloading is only implemented on Linux and only for transmitting packets.
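The five set-checksum values can be summarized as a lookup table. The Python sketch below restates the descriptions above as data; the names CHECKSUM_MODES and can_panic are hypothetical, and treating 0 as "checksums disabled" is an assumption inferred from the statement that checksums may be disabled.

```python
# Each mode maps to: whether packets are checksummed, whether the whole
# client data buffer is checksummed, and whether NIC offload may be used.
CHECKSUM_MODES = {
    0:  {"packet": False, "client_buffer": False, "may_offload": False},  # assumed: off
    1:  {"packet": True,  "client_buffer": False, "may_offload": True},
    10: {"packet": True,  "client_buffer": False, "may_offload": False},  # default
    2:  {"packet": True,  "client_buffer": True,  "may_offload": True},
    20: {"packet": True,  "client_buffer": True,  "may_offload": False},
}


def can_panic(mode):
    """Return True for modes in which a failed buffer checksum panics the machine."""
    return CHECKSUM_MODES[mode]["client_buffer"]
```

A configuration review script could use can_panic to flag modes 2 and 20, which are intended only for diagnosing memory corruption.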
set-tracelevel 0 | 1 | 3
Set trace level. If set to 1 (the default), LLT will trace all events (upcalls, flow-control, link and connection state changes) in an internal circular buffer (called the trace buffer).
When set to 3, LLT will also trace packets that are received or transmitted. This has an overhead and may impact performance, so it should be used only for debugging.
Setting to 0 disables tracing.
set-tracesize size_in_kb
Set trace buffer size. The size is specified in KB.
set-strictsrc 0 | 1
Enable strict source address checking.
If set to 1, LLT will check the source address of incoming packets and drop packets from unknown sources. When set to 0 this check is not performed.
This option is only applicable when UDP links are configured. For ethernet this will be ignored (even if set to 1). If left unspecified this option will be set to 1 automatically if UDP links are configured.
set-linkfaildetectlevel 0 | 1 | 2
Set the LLT link failure detection level.
When set to 0 (the default), the link failure is detected and processed after the LLT "peerinact" time period.
When set to 1, the LLT "Faster detection of link failure" feature is enabled: the link failure is detected immediately and processed after the LLT "linkstable" time period. For this, you need to configure one additional link as an LLT "hidden" link. You can configure your public link as the LLT "hidden" link. The "hidden" link is used only for LLT internal purposes and is not visible to the LLT clients.
When set to 2, the functionality is the same as for option 1, but this option must be used only when you have a two-node cluster connected via a cross-over cable (no switch involved in the physical connections). In this case, you need not configure the extra LLT "hidden" link mentioned for option 1.
Use the lltstat -c command to display the currently configured value.
The above functionality for the options 1 and 2 is available only on Solaris and HP-UX platforms.
set-dbg-minlinks 2
Prevent GAB from reporting jeopardy membership.
You must set the set-dbg-minlinks value to 2 if you configured only one LLT link that is derived from aggregated (bonded) NICs. This ensures that GAB does not show jeopardy membership.
The following example shows a typical /etc/llttab file with two links:
# look up my node ID in /etc/llthosts
set-node system1
link qfe1 /dev/qfe:1 - ether - -
link qfe3 /dev/qfe:3 - ether - -
The following example shows an /etc/llttab file with two links, one of which is used only as a last resort:
# take the name in /etc/nodename and look
# up my system ID in /etc/llthosts
set-node /etc/nodename
# heartbeats only on this link unless the other one fails
link-lowpri le0 /dev/le:0 - ether 0xCAFE -
link qfe0 /dev/qfe:0 - ether 0xCAFE -
In the example above, the last dash (-) directs LLT to assign a default value to mtu_size.
The following example shows an /etc/llttab file for a cluster of two systems connected by two links that do not support broadcasting:
# set my system ID numerically
set-node 2
link scid0 /dev/scid:0 - ether - -
link scid1 /dev/scid:1 - ether - -
# MAC addresses on link 0
set-addr 1 scid0 00:00:00:01
set-addr 2 scid0 00:00:00:02
# MAC addresses on link 1
set-addr 1 scid1 00:00:01:01
set-addr 2 scid1 00:00:01:02
set-arp 0
set-bcasthb 0
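When links do not support broadcast, one set-addr line is needed per link for each node, so the lines follow a regular pattern that can be generated from a table of addresses. The Python sketch below is a hypothetical helper (set_addr_lines is not an LLT tool) that produces such lines from a dictionary keyed by link tag.

```python
def set_addr_lines(addresses):
    """Build set-addr lines from {tag: {systemid: address}}.

    Lines are grouped by link tag in the order given, and sorted by
    systemid within each link.
    """
    lines = []
    for tag, peers in addresses.items():
        for systemid in sorted(peers):
            lines.append(f"set-addr {systemid} {tag} {peers[systemid]}")
    return lines
```

Called with the addresses from the two-system example, with tags scid0 and scid1, it returns the same four set-addr lines, which can then be placed after the corresponding link commands.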
When LLT and GAB are running under a cluster manager other than VCS, configure LLT and GAB as per the cluster manager's supplementary documentation on LLT and GAB.
/etc/llttab LLT configuration file
/etc/llthosts LLT host name database
/opt/VRTSllt/sample-llttab Sample LLT configuration file