Heartbeat configuration requires three files located in /etc/ha.d. The ha.cf file contains the main heartbeat configuration, including the list of the nodes and the times for identifying failures. The haresources file contains the list of resources to be managed within the cluster. The authkeys file contains the security information for the cluster.
The contents of these files should be identical on each host within the Heartbeat cluster. It is important that you keep these files in sync across all the hosts. Any change made on one host should be copied to all the others.
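One way to confirm that the files are in sync is to compare checksums of each host's copy. The following is a minimal sketch, not part of Heartbeat itself: it assumes you have already fetched each host's /etc/ha.d directory into a local directory, and the function names `config_digests` and `differing_files` are hypothetical helpers introduced here for illustration.

```python
import hashlib
from pathlib import Path

# The three files named in this section. The directory is a parameter so the
# functions can be pointed at copies of /etc/ha.d fetched from each host.
CONFIG_FILES = ("ha.cf", "haresources", "authkeys")


def config_digests(directory):
    """Return {filename: SHA-256 hex digest} for the Heartbeat files in directory."""
    digests = {}
    for name in CONFIG_FILES:
        path = Path(directory) / name
        digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(dir_a, dir_b):
    """List the configuration files whose contents differ between two copies."""
    a = config_digests(dir_a)
    b = config_digests(dir_b)
    return [name for name in CONFIG_FILES if a[name] != b[name]]
```

An empty list from `differing_files` means the two copies match byte for byte. Any file name in the list should be copied again from the host where the change was made.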
An example of the ha.cf file is shown below:
    logfacility local0
    keepalive 500ms
    deadtime 10
    warntime 5
    initdead 30
    mcast bond0 225.0.0.1 694 2 0
    mcast bond1 225.0.0.2 694 1 0
    auto_failback off
    node drbd1
    node drbd2
The individual lines in the file can be identified as follows:
logfacility: Sets the logging facility; in this case, logging is sent to syslog.

keepalive: Defines how frequently the heartbeat signal is sent to the other hosts.

deadtime: The delay in seconds before other hosts in the cluster are considered 'dead' (failed).

warntime: The delay in seconds before a warning is written to the log that a node cannot be contacted.

initdead: The period in seconds to wait during system startup before the other host is considered to be down.

mcast: Defines a method for sending a heartbeat signal. In the above example, a multicast network address is being used over a bonded network device. If you have multiple clusters, the multicast address for each cluster should be unique on your network. Other choices for the heartbeat exchange exist, including a serial connection. If you are using multiple network interfaces (for example, one interface for your server connectivity and a secondary or bonded interface for your DRBD data exchange), use both interfaces for your heartbeat connection. This decreases the chance of a transient failure causing a false failure event.

auto_failback: Sets whether the original (preferred) server should be enabled again if it becomes available. Switching this to on may cause problems if the preferred server went offline and then comes back online again. If the DRBD device has not been synced properly, or if the problem with the original server happens again, you may end up with two different datasets on the two servers, or with a continually changing environment where the two servers flip-flop as the preferred server reboots and then starts again.

node: Sets the nodes within the Heartbeat cluster group. There should be one node entry for each server.
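To make the structure of the file concrete, the sketch below reads an ha.cf-style text into a dictionary of directives. It is an illustration only, not the parser Heartbeat uses: the names `parse_ha_cf` and `to_seconds` are hypothetical, and the sketch assumes that `#` starts a comment and that a time value without a suffix is in seconds while an `ms` suffix marks milliseconds, as in the example above.

```python
def parse_ha_cf(text):
    """Group ha.cf directives by name; each value is a list of argument lists."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, skip blank lines
        if not line:
            continue
        name, *args = line.split()
        # Directives such as mcast and node may appear more than once.
        directives.setdefault(name, []).append(args)
    return directives


def to_seconds(value):
    """Convert a time value to seconds; a trailing 'ms' marks milliseconds."""
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    return float(value)


EXAMPLE = """\
logfacility local0
keepalive 500ms
deadtime 10
warntime 5
initdead 30
mcast bond0 225.0.0.1 694 2 0
mcast bond1 225.0.0.2 694 1 0
auto_failback off
node drbd1
node drbd2
"""

conf = parse_ha_cf(EXAMPLE)
nodes = [name for args in conf["node"] for name in args]
keepalive = to_seconds(conf["keepalive"][0][0])
mcast_devices = [args[0] for args in conf["mcast"]]
```

With the example file, `nodes` holds the two server names, `keepalive` is the interval in seconds, and `mcast_devices` lists the interfaces carrying the heartbeat.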
An optional additional set of directives configures a ping test that checks the connectivity to another host. Use this to ensure that you have connectivity on the public interface for your servers; the ping test should therefore target a reliable host such as a router or switch. The additional lines specify the destination machine for the ping, which should be given as an IP address rather than a host name; the command to run when a failure occurs; the authority for the failure; and the timeout before a non-response triggers a failure. A sample configuration is shown below:
    ping 10.0.0.1
    respawn hacluster /usr/lib64/heartbeat/ipfail
    apiauth ipfail gid=haclient uid=hacluster
    deadping 5
In the above example, the ipfail command, which is part of the Heartbeat solution, is called on a failure and 'fakes' a fault on the currently active server. You need to configure the user and group ID under which the command is executed (using the apiauth directive). The failure is triggered after 5 seconds.
The deadping value must be less than the deadtime value.
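The constraint between the two values can be checked before a configuration is deployed. The following is a minimal sketch under stated assumptions: `read_seconds` and `check_deadping` are hypothetical helper names, each directive is assumed to appear at most once, and a value ending in `ms` is treated as milliseconds while a bare number is treated as seconds.

```python
def read_seconds(text, directive):
    """Return the first value of a timing directive in seconds, or None if absent."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and fields[0] == directive:
            value = fields[1]
            if value.endswith("ms"):
                return float(value[:-2]) / 1000.0
            return float(value)
    return None


def check_deadping(text):
    """True if deadping < deadtime, False if not, None if either is missing."""
    deadtime = read_seconds(text, "deadtime")
    deadping = read_seconds(text, "deadping")
    if deadtime is None or deadping is None:
        return None  # nothing to compare
    return deadping < deadtime
```

For the values used in this section (deadtime 10, deadping 5) the check passes. A result of False indicates that the ping timeout should be lowered or deadtime raised.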
The authkeys file holds the authorization information for the Heartbeat cluster. The authorization relies on a single unique 'key' that is used to verify the two machines in the Heartbeat cluster. The file is used only to confirm that the two machines are in the same cluster, which allows multiple clusters to co-exist within the same network.