Some applications use two or more MSL servers for redundancy. For example, you can use two servers that act as a single module, configured in a redundant pair. Each server has its own IP address, and together they form a cluster that also has a unique IP address. The Clustering panel is used to create and manage these groups of servers. For information about cluster configuration in the NuPoint environment, refer to the Create the Cluster topic in the NuPoint UM Technician's Handbook.
The foundation of a cluster is the host membership algorithm that ensures the cluster maintains data integrity by using the following inter-node communications:
Network connections between the cluster systems
Cluster Configuration System daemon (ccsd) that synchronizes node configuration
Mitel proprietary “heartbeat” algorithm that controls the election of the cluster “master” node and provides presence information via UDP broadcast messages.
The difference between MSL 9.x (based on CentOS 5.3) and MSL 8.x (based on CentOS 4.7) is that the new cluster infrastructure is built on top of openais clustering technology. Openais uses multicast technology rather than broadcast packets for improved efficiency. Distributed Lock Management (DLM) and Global File System (GFS) are now part of the base kernel.
Each network switch and associated networking equipment in an MSL cluster must:
be capable of supporting multicast addresses and Internet Group Management Protocol (IGMP)
have both of these options enabled
If the network equipment does not meet these requirements, the attached nodes cannot participate in the cluster.
The cluster software (specifically cman, the Cluster Manager) creates a unique cluster ID for each cluster. cman then automatically chooses a multicast address for cluster communications: the upper 16 bits are set to 239.192 and the lower 16 bits are based on the cluster ID. To view the cluster ID and multicast address used by cman, run the cman_tool status command on a cluster node.
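The following is a minimal sketch of the address scheme described above, using a hypothetical cluster ID; the exact mapping is internal to cman, so always confirm the real values with cman_tool status. Under this mapping, a cluster ID of 14236 would yield 239.192.55.156, the address seen in the tcpdump example later in this section.

```python
def default_cluster_multicast(cluster_id: int) -> str:
    """Illustrative only: derive a 239.192.x.y multicast address from a
    16-bit cluster ID as described above. Verify the actual address with
    `cman_tool status` on a cluster node."""
    low16 = cluster_id & 0xFFFF              # keep the lower 16 bits of the ID
    return f"239.192.{low16 >> 8}.{low16 & 0xFF}"

print(default_cluster_multicast(14236))      # hypothetical ID -> 239.192.55.156
```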
Note: Make sure to check the configuration of routers that cluster packets will pass through. Some routers have longer learning times, which can have a negative impact on cluster performance.
Heartbeat is an MSL cluster process that is responsible for controlling the election of a cluster master node and for providing presence information. All nodes send regular, encrypted PING messages to indicate their presence and receive PONG messages in return as confirmation. PING and PONG messages form the basis of node discovery and health checking (“heartbeat”), with presence checked every 2 seconds. You can configure the number of unsuccessful PING messages that constitutes a “node down” condition. The time between a PING being sent and its PONG being received is used to maintain a round-trip delay table for each node.
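The sketch below illustrates the bookkeeping just described: a per-node counter of consecutive unanswered PINGs and a round-trip delay table. It is not the MSL heartbeat implementation, and the three-miss threshold is only an assumed example of the configurable "node down" value.

```python
import time

PING_INTERVAL = 2.0      # presence is checked every 2 seconds
MAX_MISSED_PINGS = 3     # configurable "node down" threshold (assumed value)

class HeartbeatTable:
    """Minimal sketch of per-node heartbeat bookkeeping (not MSL code)."""

    def __init__(self, nodes):
        self.sent_at = {n: None for n in nodes}   # when the last PING went out
        self.missed = {n: 0 for n in nodes}       # consecutive unanswered PINGs
        self.rtt = {n: None for n in nodes}       # last round-trip delay (seconds)

    def ping_sent(self, node):
        self.sent_at[node] = time.monotonic()
        self.missed[node] += 1                    # counted as missed until a PONG arrives

    def pong_received(self, node):
        self.rtt[node] = time.monotonic() - self.sent_at[node]
        self.missed[node] = 0                     # node answered; reset the counter

    def node_down(self, node):
        return self.missed[node] >= MAX_MISSED_PINGS
```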
A proprietary election algorithm determines the master node in the cluster. The heartbeat process ensures that individual nodes can reach enough network resources to confirm that they are in a position to provide service. When service is confirmed, heartbeat manages the election process and starts any required services on the master node. Heartbeat uses a combination of the cluster PING algorithm (UDP broadcast on port 694) and standard ICMP echo messages. Each cluster node must be able to send ICMP echo messages to the default gateway configured on the node and receive the response.
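The gateway reachability requirement can be spot-checked from any node. This sketch simply shells out to the system ping command; the gateway address shown is a placeholder for the node's configured default gateway.

```python
import subprocess

GATEWAY = "192.168.1.1"   # placeholder: use the node's configured default gateway

def gateway_reachable(addr: str = GATEWAY) -> bool:
    """Send two ICMP echo requests and report whether the gateway replied."""
    result = subprocess.run(
        ["ping", "-c", "2", "-W", "2", addr],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

if __name__ == "__main__":
    print("gateway reachable" if gateway_reachable() else "gateway NOT reachable")
```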
The heartbeat process logs to /var/log/heartbeat/current.
To allow MSL cluster nodes to communicate, the following IP ports must be enabled on all switches and associated networking equipment that form the communication path between cluster nodes:
IP Port | Protocol | Component
--- | --- | ---
5404, 5405 | UDP | cman/openais (cluster manager default)
6808, 6809 | UDP (multicast) | cman/openais (Set by MSL in /etc/cluster.conf)
21064 | TCP | dlm (Distributed Lock Manager)
50006, 50008, 50009 | TCP | ccsd (Cluster Configuration System daemon)
50007 | UDP | ccsd (Cluster Configuration System daemon)
694 | UDP (broadcast) | heartbeat (MSL cluster heartbeat daemon)
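As a quick spot check of the TCP ports in the table above, you can attempt a connection from one node to another; the UDP and multicast traffic is better verified with the tcpdump procedure described below. This is a minimal sketch only: the peer hostname is a placeholder, and the dlm and ccsd services must already be listening for a connection to succeed.

```python
import socket

PEER = "node2.mycompany.local"            # placeholder hostname for a peer node
TCP_PORTS = [21064, 50006, 50008, 50009]  # TCP cluster ports from the table above

for port in TCP_PORTS:
    try:
        # Attempt a plain TCP connection to confirm the port is reachable.
        with socket.create_connection((PEER, port), timeout=3):
            print(f"{PEER}:{port} reachable")
    except OSError as exc:
        print(f"{PEER}:{port} NOT reachable ({exc})")
```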
Notes:
Before cluster creation, run the ping command to verify packet transmission among all cluster nodes.
To verify that the openais multicast packets and the heartbeat broadcast packets are being passed through any network devices connected between the cluster nodes:
Create the cluster master by adding the first node (node1).
As “root” on the master node (node1), execute tcpdump ip multicast. You should see output similar to the following:
12:55:36.241905 IP node1.mycompany.local.6553 > 239.192.55.156.6809: UDP, length 118 …
12:55:38.617305 IP node1.mycompany.local.ha-cluster > 255.255.255.255.ha-cluster: UDP, length 129…
As “root” on each of the other nodes to be added to the cluster (node2, and so on), execute tcpdump ip multicast. If both the broadcast and multicast packets are being forwarded, you should see the same packets from node1, with different timestamps, as they are received. If the multicast packets are being dropped, you will see only the heartbeat broadcast packets from node1 on the other nodes.
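If tcpdump is not available, a simple multicast send/receive test can provide similar evidence. The sketch below uses an arbitrary test group and port, not the address cman selects for your cluster; run it with the argument recv on one node first, then run it without arguments on another node and confirm the test packet arrives.

```python
import socket
import struct
import sys

# Illustrative multicast forwarding test. The group/port pair is arbitrary
# and is NOT the address chosen by cman for the cluster.
GROUP, PORT = "239.192.0.99", 6000

def receiver():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the test multicast group on the default interface.
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    print("waiting for multicast test packet...")
    data, addr = sock.recvfrom(1024)
    print(f"received {data!r} from {addr}")

def sender():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow the packet to cross one router hop, in case nodes are not adjacent.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
    sock.sendto(b"multicast-test", (GROUP, PORT))
    print("test packet sent")

if __name__ == "__main__":
    receiver() if sys.argv[1:] == ["recv"] else sender()
```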
When Spanning Tree Protocol (STP) is enabled on a network, there is a significant delay from the time a network device obtains a link with the switch to the time the switch actually starts forwarding packets from that link to the network. In the case of a cluster, this delay may manifest as a cluster node reboot loop. In other words, when a node is rebooted (fenced due to a network or server problem), its network link comes up and the node attempts to join the cluster. The cluster suite attempts to communicate with the other nodes in the cluster to determine cluster health. Because the switch is not yet forwarding the node's packets (due to the STP delay), no communication with the other cluster nodes can take place. The cluster suite interprets this as the other nodes being in a "bad" state and fences (power cycles) them. The newly rebooted nodes then go through the same cycle and fence the other cluster members when they come up.
To prevent this fencing cycle from occurring, MSL inserts a 60-second delay after bringing up the network service before allowing the boot process to continue. This gives the network Spanning Tree Protocol time to settle before the cluster services start.