The syncer rate configuration parameter should be set with care, because the synchronization rate can have a significant effect on the performance of the DRBD setup after a node or disk failure, when data is being synchronized from the Primary to the Secondary node.
DRBD transfers data between peer nodes in two distinct ways:
Replication refers to the transfer of modified blocks from the primary to the secondary node. This happens automatically when a block is modified on the primary node, and the replication process uses whatever bandwidth is available over the replication link. The replication process cannot be throttled, because you want the transfer of the block information to happen as quickly as possible during normal operation.
Synchronization refers to the process of bringing peers back in sync after some sort of outage, due to manual intervention, node failure, disk swap, or the initial setup. Synchronization is limited to the syncer rate configured for the DRBD device.
Both replication and synchronization can take place at the same time. For example, the block devices can be synchronized while they are actively being used by the primary node. Any I/O that updates a block on the primary node automatically triggers replication of the modified block. In the event of a failure within an HA environment, it is highly likely that synchronization and replication will take place at the same time.
Unfortunately, if the synchronization rate is set too high, then the synchronization process uses up all the available network bandwidth between the primary and secondary nodes. In turn, the bandwidth available for replication of changed blocks is zero, which stalls replication and blocks I/O, and ultimately the application fails or degrades.
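The effect described above can be illustrated with a deliberately simplified model, in which synchronization takes its configured share of the link first and replication gets whatever remains. The function name and the figures are hypothetical and are not part of DRBD; real link sharing is less clear-cut than a plain subtraction.

```python
def replication_headroom_mb(link_mb_per_s: int, sync_rate_mb_per_s: int) -> int:
    """Bandwidth (MB/s) left for replication while a resync is running.

    Simplified model: synchronization consumes up to its configured rate
    and replication gets the remainder of the link, never less than zero.
    """
    return max(0, link_mb_per_s - sync_rate_mb_per_s)


# Sync rate equal to the link speed: nothing is left for replication.
print(replication_headroom_mb(110, 110))  # 0

# Sync rate capped well below the link speed: replication keeps most of it.
print(replication_headroom_mb(110, 33))   # 77
```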
To prevent synchronization from consuming all the available network bandwidth and blocking the replication of changed blocks, set the syncer rate to less than the maximum network bandwidth.
Avoid setting the sync rate to more than 30% of the maximum bandwidth available to your disk device and network. For example, if your network is based on Gigabit Ethernet, you should achieve about 110MB/s. Assuming your disk interface is capable of handling data at 110MB/s or more, then the sync rate should be configured as 33M (33MB/s). If your disk system works at a rate lower than your network interface, use 30% of your disk interface speed.
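The 30% guideline can be expressed as a small calculation. The following is a minimal sketch, not part of DRBD: the function names are hypothetical, and the generated stanza assumes the DRBD 8.3-style syncer section syntax, so check it against the documentation for your DRBD version before using it.

```python
SYNC_SHARE_PERCENT = 30  # upper limit suggested by the guideline above


def recommended_sync_rate_mb(network_mb_per_s: int, disk_mb_per_s: int) -> int:
    """Return 30% of the slower of the network and disk rates, in MB/s."""
    bottleneck = min(network_mb_per_s, disk_mb_per_s)
    # Integer arithmetic keeps the result a whole number of MB/s.
    return bottleneck * SYNC_SHARE_PERCENT // 100


def syncer_stanza(rate_mb_per_s: int) -> str:
    """Render a syncer section (assumed DRBD 8.3-style syntax)."""
    return "syncer {\n    rate %dM;\n}" % rate_mb_per_s


# Gigabit Ethernet (about 110MB/s) with a disk that can keep up.
rate = recommended_sync_rate_mb(110, 110)
print(rate)                 # 33
print(syncer_stanza(rate))

# A slower disk (60MB/s, an illustrative figure) becomes the limit.
print(recommended_sync_rate_mb(110, 60))  # 18
```

Taking the minimum of the two rates covers both cases in the text: when the disk is at least as fast as the network, the network speed sets the limit, and otherwise the disk speed does.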
Depending on the application, you may wish to limit the synchronization rate. For example, on a busy server you may wish to configure a significantly slower synchronization rate to ensure the replication rate is not affected.
The al-extents parameter controls the number of 4MB blocks of the underlying disk that can be written to at the same time. Increasing this parameter lowers the frequency of the metadata transactions required to log the changes to the DRBD device, which in turn lowers the number of interruptions in your I/O stream when synchronizing changes. This can lower the latency of changes to the DRBD device. However, if a crash occurs on your primary, then all of the blocks in the activity log (that is, al-extents blocks of 4MB each) must be completely resynchronized before replication can continue.
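The cost side of this trade-off can be estimated from the figures given above. The sketch below is illustrative only: the function names and the al-extents value of 257 are hypothetical, and it assumes that the whole activity log is resynchronized at the full configured sync rate.

```python
AL_EXTENT_SIZE_MB = 4  # each activity log extent covers 4MB of the disk


def worst_case_resync_mb(al_extents: int) -> int:
    """Data (MB) to resynchronize if the primary crashes with a full activity log."""
    return al_extents * AL_EXTENT_SIZE_MB


def worst_case_resync_seconds(al_extents: int, sync_rate_mb_per_s: int) -> int:
    """Time to resynchronize that data at the given sync rate, rounded up."""
    total_mb = worst_case_resync_mb(al_extents)
    # Ceiling division without floating point.
    return -(-total_mb // sync_rate_mb_per_s)


# Illustrative values: al-extents of 257 with a 33MB/s sync rate.
print(worst_case_resync_mb(257))           # 1028
print(worst_case_resync_seconds(257, 33))  # 32
```

A larger al-extents value reduces metadata updates during normal operation, but the amount of data to resynchronize after a primary crash grows in proportion.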