It is safe to shut down a master server and restart it later.
When a slave loses its connection to the master, the slave tries
to reconnect immediately and retries periodically if that fails.
The default is to retry every 60 seconds. This may be changed
with the MASTER_CONNECT_RETRY option of the CHANGE MASTER TO
statement. A slave is also able to deal with network
connectivity outages. However, the slave notices the network
outage only after receiving no data from the master for
slave_net_timeout seconds. If your outages are short, you may
want to decrease slave_net_timeout. See
Section 5.1.4, “Server System Variables”.
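As a rough sketch, the retry interval and the network timeout could be adjusted on a slave as follows. The value 30 is an illustrative assumption, not a recommendation; choose values that match how long your outages typically last.

```sql
-- Run on the slave. The value 30 (seconds) is illustrative only.

-- Shorten the time without data after which the slave treats
-- the connection to the master as lost.
SET GLOBAL slave_net_timeout = 30;

-- Change how often the slave retries a failed connection.
STOP SLAVE;
CHANGE MASTER TO MASTER_CONNECT_RETRY = 30;
START SLAVE;

-- Check the timeout value now in effect.
SHOW VARIABLES LIKE 'slave_net_timeout';
```

The variable is set before the replication threads are restarted so that the restarted threads run with both new values.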
An unclean shutdown (for example, a crash) on the master side
can result in the master binary log having a final position less
than the most recent position read by the slave, due to the
master binary log file not being flushed. This can cause the
slave not to be able to replicate when the master comes back up.
Setting sync_binlog=1 in the master my.cnf file helps to
minimize this problem because it causes the master to flush
its binary log more frequently.
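A minimal option-file fragment for the master might look like the following. It assumes the setting belongs in the [mysqld] group of the master's my.cnf; the rest of the file is omitted.

```ini
# Master my.cnf (fragment)
[mysqld]
# Synchronize the binary log to disk more often, to reduce the
# chance of losing binary log events in a crash.
sync_binlog=1
```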
Shutting down a slave cleanly is safe because the slave keeps track of where it left off. However, be careful that the slave does not have temporary tables open; see Section 17.4.1.19, “Replication and Temporary Tables”. Unclean shutdowns might produce problems, especially if the disk cache was not flushed to disk before the problem occurred:
For transactions, the slave commits and then updates relay-log.info. If a crash occurs between these two operations, relay log processing will have proceeded further than the information file indicates and the slave will re-execute the events from the last transaction in the relay log after it has been restarted.
A similar problem can occur if the slave updates relay-log.info but the server host crashes before the write has been flushed to disk. To minimize the chance of this occurring, set sync_relay_log_info=1 in the slave my.cnf file. The default value of sync_relay_log_info is 0, which does not cause writes to be forced to disk; the server relies on the operating system to flush the file from time to time.
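The corresponding fragment for the slave could be sketched as follows, again assuming the setting is placed in the [mysqld] group of the slave's my.cnf:

```ini
# Slave my.cnf (fragment)
[mysqld]
# Force writes to relay-log.info to disk instead of relying on the
# operating system to flush the file (the default, 0).
sync_relay_log_info=1
```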
The fault tolerance of your system for these types of problems is greatly increased if you have a good uninterruptible power supply.