4.10. Replication in MySQL

This section describes the various replication features in MySQL. It serves as a reference to the options available with replication. You will be introduced to replication and learn how to implement it. Toward the end, there are some frequently asked questions and descriptions of problems and how to solve them.

We suggest that you visit our website at http://www.mysql.com/ often and read updates to this section. Replication is constantly being improved, and we update the manual frequently with the most current information.

4.10.1. Introduction

One-way replication can be used is to increase both robustness and speed. For robustness you can have two systems and can switch to the backup if you have problems with the master. The extra speed is achieved by sending a part of the non-updating queries to the replica server. Of course this only works if non-updating queries dominate, but that is the normal case.

Starting in Version 3.23.15, MySQL supports one-way replication internally. One server acts as the master, while the other acts as the slave. Note that one server could play the roles of master in one pair and slave in the other. The master server keeps a binary log of updates (Section 4.9.4, “The Binary Log ”) and an index file to binary logs to keep track of log rotation. The slave, upon connecting, informs the master where it left off since the last successfully propagated update, catches up on the updates, and then blocks and waits for the master to notify it of the new updates.

Note that if you are using replication all updates to the tables you replicate should be done through the master, unless you are always careful of avoiding conflicts between updates which users issue on the master and those which users issue on the slave.

Another benefit of using replication is that one can get non-disturbing backups of the system by doing a backup on a slave instead of doing it on the master. Section 4.4.1, “Database Backups ”.

4.10.2. Replication Implementation Overview

MySQL replication is based on the server keeping track of all changes to your database (updates, deletes, etc) in the binary log (Section 4.9.4, “The Binary Log ”) and the slave server(s) reading the saved queries from the master server's binary log so that the slave can execute the same queries on its copy of the data.

It is very important to realise that the binary log is simply a record starting from a fixed point in time (the moment you enable binary logging). Any slaves which you set up will need copies of the data from your master as it existed the moment that you enabled binary logging on the master. If you start your slaves with data that doesn't agree with what was on the master when the binary log was started, your slaves may fail.

Please see the following table for an indication of master-slave compatibility between different versions. With regard to version 4.0, we recommend using same version on both sides.

  MasterMasterMasterMaster
  3.23.33 and up4.0.04.0.14.0.3 and up
Slave3.23.33 and upyes no no no
Slave4.0.0no yes no no
Slave4.0.1yes no yes no
Slave4.0.3 and upyes no no yes

Note: MySQL Version 4.0.2 is not recommended for replication. As a general rule, it's always recommended to use recent MySQL versions for replication.

Note that when you upgrade a master from MySQL 3.23 to MySQL 4.0 (or 4.1) you should not restart replication using old 3.23 binary logs, because this will unfortunately confuse the 4.0 slave. The upgrade can be safely done this way:

  • Block all updates on the master (FLUSH TABLES WITH READ LOCK).

  • Wait until all the slaves have caught up all changes from the master (use SHOW MASTER STATUS on the master, and SELECT MASTER_POS_WAIT() on the slaves). Then run SLAVE STOP and RESET SLAVE on the slaves.

  • Shutdown MySQL on the master and slaves, upgrade the master and slaves to MySQL 4.0. Restart the master and slaves, and issue CHANGE MASTER TO commands on the slaves, unless the slaves already have the identification of the master in their my.cnf files.

Starting from 4.0.0, one can use LOAD DATA FROM MASTER to set up a slave. Be aware that LOAD DATA FROM MASTER currently works only if all the tables on the master are MyISAM type, and will acquire a global read lock, so no writes are possible while the tables are being transferred from the master. When we implement hot lock-free table backup (in MySQL 5.0), this global read lock will no longer be necessary.

Due to the above limitation, we recommend that at this point you use LOAD DATA FROM MASTER only if the dataset on the master is relatively small, or if a prolonged read lock on the master is acceptable. While the actual speed of LOAD DATA FROM MASTER may vary from system to system, a good rule for a rough estimate of how long it is going to take is 1 second per 1 MB of the datafile. You will get close to the estimate if both master and slave are equivalent to 700 MHz Pentium, are connected through 100 MBit/s network, and your index file is about half the size of your datafile. Of course, this is only a rough order of magnitude estimate.

Once a slave is properly configured and running, it will simply connect to the master and wait for updates to process. If the master goes away or the slave loses connectivity with your master, it will keep trying to connect every master-connect-retry seconds until it is able to reconnect and resume listening for updates.

Each slave keeps track of where it left off. The master server has no knowledge of how many slaves there are or which ones are up-to-date at any given time.

4.10.3. Replication Implementation Details

Three threads are involved in replication : one on the master and two on the slave. When START SLAVE is issued, the I/O thread is created on the slave. It connects to the master and asks it to send its binlogs. Then one thread (named Binlog_dump in SHOW PROCESSLIST on the master) is created on the master to send these binlogs. The I/O thread reads what Binlog_dump sends and simply copies it to some local files in the slave's data directory called relay logs. The last thread, the SQL thread, is created on the slave; it reads the relay logs and executes the queries it contains.

Here is how the three threads show up in SHOW PROCESSLIST. All SHOW PROCESSLIST examples are taken from MySQL version 4.0.15; the content of the State column was changed in that version to be more meaningful.

MASTER show processlist\G
*************************** 1. row ***************************
     Id: 2
   User: root
   Host: localhost:32931
     db: NULL
Command: Binlog Dump
   Time: 94
  State: Has sent all binlog to slave; waiting for binlog to be updated
   Info: NULL
SLAVE show processlist\G
*************************** 1. row ***************************
     Id: 10
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Waiting for master to send event
   Info: NULL
*************************** 2. row ***************************
     Id: 11
   User: system user
   Host:
     db: NULL
Command: Connect
   Time: 11
  State: Has read all relay log; waiting for the I/O slave thread to update it
   Info: NULL

(MASTER and SLAVE prompts were obtained with option -prompt of mysql, Section 4.8.2, “mysql, The Command-line Tool ”.) Here thread 2 is on the master. Thread 10 is the I/O thread on the slave. Thread 11 is the SQL thread on the slave; note that the value in the Time column can tell how late the slave is compared to the master (Section 4.10.8, “Replication FAQ ”).

With SHOW PROCESSLIST you can know what is happening on the master and on the slave as regards replication.

Here are the most common states you will see in the State column for the Binlog_dump thread (if you don't see this thread, then replication is not running, for sure):

  • Sending binlog event to slave. Binlogs are made of events (an event is usually a query plus some information); the thread has read an event from the binlog and is sending it to the slave.

  • Finished reading one binlog; switching to next binlog

  • Has sent all binlog to slave; waiting for binlog to be updated. The thread has read all binary logs and is idle, waiting for connections on this master to write more data into binary logs if they want.

  • Waiting to finalize termination. Very brief state - the thread is stopping.

Here are the most common states you will see in the State column for the I/O thread:

  • Connecting to master. Now attempting to connect to the master.

  • Checking master version. Very brief state - happens just after connection is established.

  • Registering slave on master. Very brief state - happens just after connection is established.

  • Requesting binlog dump. Very brief state - happens just after connection is established; the thread sends to the master a request to send the content of its binlogs, starting from the requested binlog and position.

  • Waiting to reconnect after a failed binlog dump request. If the above request failed (disconnection), this is showed while the thread is sleeping for master-connect-retry seconds before retrying.

  • Reconnecting after a failed binlog dump request. Then the thread tries to reconnect to the master.

  • Waiting for master to send event. The thread is now connected and waiting for binlog events to arrive. This can last for long if the master is idle. This wait will timeout after slave_read_timeout seconds, then the connection will be considered broken and reconnection will be attempted.

  • Queueing master event to the relay log. The thread has read an event and is copying it to the relay log.

  • Waiting to reconnect after a failed master event read. Got an error while reading (disconnection); sleeping for master-connect-retry seconds.

  • Reconnecting after a failed master event read. Then the thread tries to reconnect. When connection is established again, state will be Waiting for master to send event again.

  • Waiting for the SQL slave thread to free enough relay log space. You are using a non-zero relay_log_space_limit, and the relay log(s) has (have) grown so much that its (their) size exceeds the value of this variable. The (I/O) thread is so waiting until the SQL thread frees enough space by deleting some relay logs.

  • Waiting for slave mutex on exit. Very brief state - happens when the thread is stopping.

Here are the most common states you will see in the State column for the SQL thread:

  • Reading event from the relay log

  • the-query-being-executed. The thread has read an event from the relay log, extracted the query from it and is executing the query, the State column shows this query.

  • Has read all relay log; waiting for the I/O slave thread to update it

  • Waiting for slave mutex on exit. Very brief state - happens when the thread is stopping.

Before MySQL 4.0.2, the I/O and SQL threads were one. The advantage brought by the two separate threads is that it makes the reading job and the execution job independant, thus the reading job is not slowed down by the execution job. As soon as the slave starts, even if it has not been running for a while, the I/O thread can quickly fetch all the binlogs, while the SQL thread lags far behind and may take hours to catch. If the slave stops, though it has not executed everything yet, at least it has fetched everything, so binlogs can be purged on the master, as a safe copy is locally stored on the slave for future use.

Relay logs are by default named as the hostname followed by -relay-bin plus a numeric extension. A -relay-bin.index file contains the list of all relay logs currently in use. By default these files are in the slave's data directory. Relay logs have the same format than binary logs, so they can be read with mysqlbinlog. A relay log is automatically deleted by the SQL thread as soon as it no longer needs it (that is, as soon as it has executed all its events). The user has no command to delete relay logs as the SQL thread does the job. However, from MySQL 4.0.14, FLUSH LOGS rotates relay logs, which will influence deletion by the SQL thread. A new relay log is created when the I/O thread starts, or when the size of the current relay log exceeds max_relay_log_size (or if this variable is 0 or the slave is older than MySQL 4.0.14, when the size exceeds max_binlog_size), or when FLUSH LOGS is issued (from version 4.0.14).

Replication also creates two small files in the data directory: these files are the disk images of the output of SHOW SLAVE STATUS (Section 4.10.7, “SQL Commands Related to Replication ” for a description of this command); but as disk images they survive slave's shutdown; this way at restart time the slave still knows his master and where the slave is in the master's binlogs, and where it is in its own relay logs.

  • master.info is updated by the I/O thread. Here is a correspondance between the file's rows and the columns displayed by SHOW SLAVE STATUS:

    Line#Description
    1 Master_Log_File
    2 Read_Master_Log_Pos
    3 Master_Host
    4 Master_User
    5 Password (not in SHOW SLAVE STATUS)
    6 Master_Port
    7 Connect_Retry
  • relay-log.info is updated by the SQL thread. Here is a correspondance between the file's rows and the columns displayed by SHOW SLAVE STATUS:

    Line#Description
    1 Relay_Log_File
    2 Relay_Log_Pos
    3 Relay_Master_Log_File
    4 Exec_master_log_pos

4.10.4. How To Set Up Replication

Here is a quick description of how to set up complete replication on your current MySQL server. It assumes you want to replicate all your databases and have not configured replication before. You will need to shutdown your master server briefly to complete the steps outlined here.

While this method is the most straightforward way to set up a slave, it is not the only one. For example, if you already have a snapshot of the master, and the master already has server id set and binary logging enabled, you can set up a slave without shutting the master down or even blocking the updates. For more details, please see Section 4.10.8, “Replication FAQ ”.

If you want to be able to administrate a MySQL replication setup, we suggest that you read this entire chapter through and try all commands mentioned in Section 4.10.7, “SQL Commands Related to Replication ”. You should also familiarise yourself with replication startup options in my.cnf in Section 4.10.6, “Replication Options in my.cnf”.

  1. Make sure you have a recent version of MySQL installed on the master and slave(s), and that these versions match with the above yes/no array.

    Please, do not report bugs until you have verified that the problem is present in the latest release.

  2. Set up a replication user on the master with the FILE (in MySQL versions older than 4.0.2) or REPLICATION SLAVE privilege in newer MySQL versions. You must also have given this user permission to connect from all the slaves. If the user is only doing replication (which is recommended), you don't need to grant any additional privileges.

    For example, to create a user named repl which can access your master from any host, you might use this command:

    mysql GRANT FILE ON *.* TO repl@"%" IDENTIFIED BY 'password'; # master  4.0.2
    
    mysql GRANT REPLICATION SLAVE ON *.* TO repl@"%" IDENTIFIED BY 'password'; # master = 4.0.2
    

    If you plan to use the LOAD TABLE FROM MASTER or LOAD DATA FROM MASTER commands, you will also need to grant, on the master, to the above user, the REPLICATION CLIENT (or SUPER if the master is older than 4.0.13) privilege, the RELOAD privilege, and SELECT privileges on all tables you want to load. All master tables on which the user can't SELECT will be ignored by LOAD DATA FROM MASTER.

  3. If you are using MyISAM tables, flush all the tables and block write queries by executing FLUSH TABLES WITH READ LOCK command.

    mysql FLUSH TABLES WITH READ LOCK;
    

    and then take a snapshot of the data on your master server.

    The easiest way to do this is to simply use an archiving program (tar on Unix, PowerArchiver, WinRAR, WinZIP or any similar software on Windows) to produce an archive of the databases in your master's data directory. Include all the databases you want to replicate.

    tar -cvf /tmp/mysql-snapshot.tar /path/to/data-dir
    

    If you want to replicate only a database called this_db, you can do just this:

    tar -cvf /tmp/mysql-snapshot.tar /path/to/data-dir/this_db
    

    You may not want to replicate the mysql database, then you can exclude it from the archive too. Into the archive you needn't copy the master's binary logs, error log, master.info / relay-log.info / relay logs (if the master is itself a slave of another machine). You can exclude all this from the archive.

    After or during the process of taking a snapshot, read the value of the current binary log name and the offset on the master:

    mysql  SHOW MASTER STATUS;
    +---------------+----------+--------------+-------------------------------+
    | File          | Position | Binlog_do_db | Binlog_ignore_db              |
    +---------------+----------+--------------+-------------------------------+
    | mysql-bin.003 | 73       | test,bar     | foo,manual,sasha_likes_to_run |
    +---------------+----------+--------------+-------------------------------+
    1 row in set (0.06 sec)
    

    The File column shows the name of the log, while Position shows the offset. In the above example, the binary log value is mysql-bin.003 and the offset is 73. Record the values - you will need to use them later when you are setting up the slave.

    Once you have taken the snapshot and recorded the log name and offset, you can re-enable write activity on the master:

    mysql UNLOCK TABLES;
    

    If you are using InnoDB tables, ideally you should use the InnoDB Hot Backup tool that is available to those who purchase MySQL commercial licenses, support, or the backup tool itself. It will take a consistent snapshot without acquiring any locks on the master server, and record the log name and offset corresponding to the snapshot to be later used on the slave. More information about the tool is avalaible at http://www.innodb.com/hotbackup.html.

    Without the hot backup tool, the quickest way to take a snapshot of InnoDB tables is to shut the master server down and copy the InnoDB datafiles and logs, and the table definition files (.frm). To record the current log file name and offset, you should do the following before you shut down the server:

    mysql FLUSH TABLES WITH READ LOCK;
    mysql SHOW MASTER STATUS;
    

    And then record the log name and the offset from the output of SHOW MASTER STATUS as was shown earlier. Once you have recorded the log name and the offset, shut the server down without unlocking the tables to make sure it goes down with the snapshot corresponding to the current log file and offset:

    shell mysqladmin -uroot shutdown
    

    An alternative for both MyISAM and InnoDB tables is taking an SQL dump of the master instead of a binary copy like above; for this you can use mysqldump -master-data on your master and later run this SQL dump into your slave. This is however slower than doing a binary copy.

    If the master has been previously running without log-bin enabled, the values of log name and position displayed by SHOW MASTER STATUS or mysqldump will be empty. In that case, record empty string (") for the log name, and 4 for the offset.

  4. Make sure that my.cnf on the master has log-bin if it is not there already and server-id=unique number in the [mysqld] section. If those options are not present, add them and restart the server. It is very important that the id of the slave is different from the id of the master. Think of server-id as something similar to the IP address - it uniquely identifies the server instance in the community of replication partners.

    [mysqld]
    log-bin
    server-id=1
    
  5. Stop the slave server(s). Add the following to my.cnf on the slave(s):

    server-id=some unique number between 1 and 2^32-1 that is different from
     that of the master
    

    replacing the values in with what is relevant to your system.

    server-id must be different for each server participating in replication. If you don't specify a server-id, it will be set to 1 if you have not defined master-host, else it will be set to 2. Note that in the case of server-id omission the master will refuse connections from all slaves, and the slave will refuse to connect to a master. Thus, omitting server-id is only good for backup with a binary log.

  6. Start the slave server(s). If it has been replicating previously, start the slave server with option skip-slave-start. You may want to start the slave server with option log-warnings, this way you will get more messages about network/connection problems for example.

  7. Copy the snapshot data into your data directory on your slave(s) (or execute the output of the above mysqldump into the mysql). Make sure that the privileges on the files and directories are correct. The user which MySQL runs as needs to be able to read and write to them, just as on the master.

  8. Execute the following command on the slave(s):

    mysql CHANGE MASTER TO MASTER_HOST='master host name',
     MASTER_USER='replication user name',
     MASTER_PASSWORD='replication password',
     MASTER_LOG_FILE='recorded log file name',
     MASTER_LOG_POS=recorded log offset;
    

    replacing the values in with the actual values relevant to your system.

    The maximum string length for the above variables are:

    MASTER_HOST 60
    MASTER_USER 16
    MASTER_PASSWORD 32
    MASTER_LOG_FILE 255
  9. Start the slave threads:

    mysql START SLAVE;
    

After you have done the above, the slave(s) should connect to the master and catch up on any updates which happened since the snapshot was taken.

If you have forgotten to set server-id for the slave you will get the following error in the error log file:

Warning: one should set server_id to a non-0 value if master_host is set.
The server will not act as a slave.

If you have forgotten to do this for the master, the slaves will not be able to connect to the master.

If a slave is not able to replicate for any reason, you will find error messages in the error log on the slave.

Once a slave is replicating, you will find a file called master.info and one called relay-log.info in the data directory. These two files are used by the slave to keep track of how much of the master's binary log it has processed. Do not remove or edit these files, unless you really know what you are doing. Even in that case, it is preferred that you use CHANGE MASTER TO command. NOTE: the content of master.info overrides some options specified on the command-line or in my.cnf (Section 4.10.6, “Replication Options in my.cnf” for more details).

Now that you have a snapshot, you can use it to set up other slaves. To do so, follow the slave portion of the procedure described above. You do not need to take another snapshot of the master.

4.10.5. Replication Features and Known Problems

Here is an explanation of what is supported and what is not:

  • Replication will be done correctly with AUTO_INCREMENT, LAST_INSERT_ID(), and TIMESTAMP values.

  • USER() and LOAD_FILE() functions are replicated without changes and will thus not work reliably on the slave. This is also true for CONNECTION_ID() in slave versions strictly older than 4.1.1. The newPASSWORD() function in MySQL 4.1, is well replicated since 4.1.1 masters ; your slaves must be 4.1.0 or above to replicate it. If you have older slaves and need to replicate PASSWORD() from your 4.1.x master, you should start your master with option -old-password.

  • sql_mode, FOREIGN_KEY_CHECKS and table_type variables are not replicated.

  • You have to use the same character set (-default-character-set) on the master and the slave. If not, you may get duplicate key errors on the slave, because a key that is regarded as unique in the master character set may not be unique in the slave character set.

  • If you are using transactional tables on the master and non-transactional tables (for the same tables) on the slave, you will get problems if the slave is stopped in the middle of a BEGIN/COMMIT block, as the slave will later start at the beginning of the BEGIN block. This issue is on our TODO and will be fixed in the near future.

  • Update queries that use user variables are badly replicated in 3.23 and 4.0. This is fixed in 4.1. Note that user variables' names are case insensitive starting from version 5.0, so you should take this into account when setting up replication between 5.0 and a previous version.

  • The slave cannot connect to the master using SSL yet.

  • There is a theoretical chance (though we have never heard of it actually occurring) that the data on the master and slave become different if a query is designed in such a way that the data modification is non-deterministic, that is, left to the will of the query optimiser (which generally is not good practice!). For a detailed explanation, see Section 1.7.6.2, “Open Bugs / Design Deficiencies in MySQL ”.

  • Strictly before MySQL 4.1.1, FLUSH, ANALYZE, OPTIMIZE, REPAIR commands are not stored in the binary log and are because of this not replicated to the slaves. This is not normally a problem as these commands don't change anything. This does however mean that if you update the MySQL privilege tables directly without using the GRANT statement and you replicate the mysql privilege database, you must do a FLUSH PRIVILEGES on your slaves to put the new privileges into effect. Also if you use FLUSH TABLES when renaming a MyISAM table involved in a MERGE table, you will have to issue FLUSH TABLES manually on the slave. Since MySQL 4.1.1, these commands are written to the binary log (except FLUSH LOGS, FLUSH MASTER, FLUSH SLAVE, FLUSH TABLES WITH READ LOCK) unless you specify NO_WRITE_TO_BINLOG (or its alias LOCAL) (for a syntax example, Section 4.5.3, “FLUSH Syntax ”).

  • MySQL only supports one master and many slaves. Later we will add a voting algorithm to automatically change master if something goes wrong with the current master. We will also introduce 'agent' processes to help do load balancing by sending select queries to different slaves.

  • Temporary tables are replicated with the exception of the case when you shut down slave server (not just slave thread) when you have some replicated temporary tables and they are used in subsequent updates by the master. To deal with this problem shutting down the slave, do STOP SLAVE, check Slave_open_temp_tables variable to see if it is 0, if so issue mysqladmin shutdown. If the number is not 0, restart the slave threads with START SLAVE and see if you have better luck next time. We have plans to fix this in the near future.

  • It is safe to connect servers in a circular master-slave relationship with log-slave-updates enabled. Note, however, that many queries will not work right in this kind of setup unless your client code is written to take care of the potential problems that can happen from updates that occur in different sequence on different servers.

    This means that you can do a setup like the following:

    A - B - C - A
    

    Thanks to server ids, which are encoded in the binary log events, A will know when the event it reads had originally been created by A, so A will not execute it and there will be no infinite loop. But this circular setup will only work if you only do non conflicting updates between the tables. In other words, if you insert data in A and C, you should never insert a row in A that may have a conflicting key with a row insert in C. You should also not update the same rows on two servers if the order in which the updates are applied matters.

  • If the query on the slave gets an error, the slave SQL thread will terminate, and a message will appear in the .err file. You should then connect to the slave manually, fix the cause of the error (for example, non-existent table), and then run START SLAVE.

  • If connection to the master is lost, the slave will retry immediately, and then in case of failure every master-connect-retry (default 60) seconds. Because of this, it is safe to shut down the master, and then restart it after a while. The slave will also be able to deal with network connectivity outages. However, the slave will notice the network outage only after receiving no data from the master for slave_net_timeout seconds. So if your outages are short, you may want to decrease slave_net_timeout ; see Section 4.5.7.4, “SHOW VARIABLES”.

  • Shutting down the slave (cleanly) is also safe, as it keeps track of where it left off. Unclean shutdowns might produce problems, especially if disk cache was not synced before the system died. Your system fault tolerance will be greatly increased if you have a good UPS.

  • Due to the non-transactional nature of MyISAM tables, it is possible to have a query that will only partially update a table and return an error code. This can happen, for example, on a multi-row insert that has one row violating a key constraint, or if a long update query is killed after updating some of the rows. If that happens on the master, the slave thread will exit and wait for the DBA to decide what to do about it unless the error code is legitimate and the query execution results in the same error code. If this error code validation behaviour is not desirable, some ( or all) errors could be masked out with slave-skip-errors option starting in Version 3.23.47.

  • If you update transactional tables from non-transactional tables inside a BEGIN/COMMIT segment updates to the binary log may be out of sync if some thread changes the non-transactional table before the transaction commits. This is because the transaction is written to the binary log only when it's commited.

  • Before version 4.0.15, any update to a non-transactional table is written to the binary log at once when the update is made while transactional updates are written on COMMIT or not written at all if you use ROLLBACK; you have to take this into account when updating both transactional tables and non-transactional tables in the same transaction if you are using binary logging for backups or replication. In version 4.0.15 we changed the behaviour of logging of transactions which mix updates to transactional and non-transactional tables, which solves the problems (order of queries is good in binlog, and all needed queries are written to the binlog even in case of ROLLBACK). The problem which remains is when a second connection updates the non-transactional table while the first connection's transaction is not finished yet (wrong order can still occur, because the second connection's update will be written immediately after it is done).

The following table is about problems in 3.23 that are fixed in 4.0:

  • LOAD DATA INFILE will be handled properly as long as the file still resides on the master server at the time of update propagation.

  • LOAD LOCAL DATA INFILE will be skipped.

  • In 3.23 RAND() in updates does not replicate properly. Use RAND(some_non_rand_expr) if you are replicating updates with RAND(). You can, for example, use UNIX_TIMESTAMP() for the argument to RAND(). This is fixed in 4.0.

4.10.6. Replication Options in my.cnf

On both master and slave you need to use the server-id option. This sets a unique replication id. You should pick a unique value in the range between 1 to 2^32-1 for each master and slave. Example: server-id=3

The options you can use on the MASTER are all described there: see Section 4.9.4, “The Binary Log ”.

The following table describes the options you can use on the SLAVE. It is recommended to read the following paragraph; these options can help you customize replication to suit your needs.

NOTE: replication handles the following options :

  • master-host

  • master-user

  • master-password

  • master-port

  • master-connect-retry

in a special way. If no master.info file exists (replication is used for the very first time or you have run RESET SLAVE and shutdown/restarted the slave server), the slave uses values specified on the command-line or in my.cnf. But if master.info exists, the slave IGNORES any values specified on the command-line or in my.cnf, and uses instead the values it reads from master.info. For example, if you have

master-host=this_host

in your my.cnf, are using replication, then want to replicate from another host, modifying the above line in my.cnf will have no effect. You must use CHANGE MASTER TO instead. This holds true for master-host, master-user, master-password, master-port, master-connect-retry. Therefore, you may decide to put no master-* options in my.cnf and instead use only CHANGE MASTER TO (Section 4.10.7, “SQL Commands Related to Replication ”).

OptionDescription
log-slave-updates Tells the slave to log the updates done by the slave SQL thread to the slave's binary log. Off by default. Of course, it requires that the slave be started with binary logging enabled (log-bin option). You have to use log-slave-updates to chain several slaves ; for example for the following setup to work
A - B -C
(C is a slave of B which is a slave of A) you need tostart B with the log-slave-updatesoption.
log-warnings Makes the slave print more messages about what it is doing. For example, it will warn you that it succeeded in reconnecting after a network/connection failure, and warn you about how each slave thread started.
max-relay-log-size=# To rotate the relay log automatically. Section 4.5.7.4, “SHOW VARIABLES”.
master-host=host Master hostname or IP address for replication. If not set, the slave thread will not be started. Note that the setting of master-host will be ignored if there exists a valid master.info file. Probably a better name for this options would have been something like bootstrap-master-host, but it is too late to change now. Example: master-host=db-master.mycompany.com
master-user=username The username the slave thread will use for authentication when connecting to the master. The user must have the FILE privilege. If the master user is not set, user test is assumed. The value in master.info will take precedence if it can be read. Example: master-user=scott
master-password=password The password the slave thread will authenticate with when connecting to the master. If not set, an empty password is assumed.The value in master.info will take precedence if it can be read. Example: master-password=tiger
master-port=portnumber The port the master is listening on. If not set, the compiled setting of MYSQL_PORT is assumed. If you have not tinkered with configure options, this should be 3306. The value in master.info will take precedence if it can be read. Example: master-port=3306
master-connect-retry=seconds The number of seconds the slave thread will sleep before retrying to connect to the master in case the master goes down or the connection is lost. Default is 60. The value in master.info will take precedence if it can be read. Example: master-connect-retry=60
master-ssl Planned to enable the slave to connect to the master using SSL. Does nothing yet! Example: master-ssl
master-ssl-key=filename Master SSL keyfile name. Only applies if you have enabled master-ssl. Does nothing yet. Example: master-ssl-key=SSL/master-key.pem
master-ssl-cert=filename Master SSL certificate file name. Only applies if you have enabled master-ssl. Does nothing yet. Example: master-ssl-cert=SSL/master-cert.pem
master-ssl-capath Master SSL CA path. Only applies if you have enabled master-ssl. Does nothing yet.
master-ssl-cipher Master SSL cipher. Only applies if you have enabled master-ssl. Does nothing yet.
master-info-file=filename To give master.info another name and/or to put it in another directory than the data directory.
relay-log=filename To specify the location and name that should be used for relay logs. You can use this to have hostname-independant relay log names, or if your relay logs tend to be big (and you don't want to decrease max_relay_log_size) and you need to put them on some area different from the data directory, or if you want to increase speed by balancing load between disks.
relay-log-index=filename To specify the location and name that should be used for the relay logs index file.
relay-log-info-file=filename To give relay-log.info another name and/or to put it in another directory than the data directory.
relay-log-purge=0|1 Available since MySQL 4.1.1. Disables/enables automatic purging of relay logs as soon as they are not needed anymore. This is a global variable which can be dynamically changed with SET GLOBAL RELAY_LOG_PURGE=0|1. The default value is 1.
relay-log-space-limit=# To put an upper limit on the total size of all relay logs on the slave (a value of 0 means "unlimited"). This is useful if you have a small hard disk on your slave machine. When the limit is reached, the I/O thread pauses (does not read the master's binlog) until the SQL thread has catched up and deleted some now unused relay logs. Note that this limit is not absolute: there are cases where the SQL thread needs more events to be able to delete; in that case the I/O thread will overgo the limit until deletion becomes possible. Not doing so would cause a deadlock (which happens before MySQL 4.0.13). Users should not set relay-log-space-limit to less than twice the value of max-relay-log-size (or max-binlog-size if max-relay-log-size is 0) because in that case there are chances that when the I/O thread waits for free space because relay-log-space-limit is exceeded, the SQL thread has no relay log to purge and so cannot satisfy the I/O thread, forcing the I/O thread to temporarily ignore relay-log-space-limit.
replicate-do-table=db_name.table_name Tells the slave thread to restrict replication to the specified table. To specify more than one table, use the directive multiple times, once for each table. This will work for cross-database updates, in contrast to replicate-do-db. Please read notes which follow this table. Example: replicate-do-table=some_db.some_table
replicate-ignore-table=db_name.table_name Tells the slave thread to not replicate any command that updates the specified table (even if any other tables may be update by the same command). To specify more than one table to ignore, use the directive multiple times, once for each table. This will work for cross-database updates, in contrast to replicate-ignore-db. Please read notes which follow this table. Example: replicate-ignore-table=db_name.some_table
replicate-wild-do-table=db_name.table_name Tells the slave thread to restrict replication to queries where any of the updated tables match the specified wildcard pattern. To specify more than one table, use the directive multiple times, once for each table. This will work for cross-database updates. Please read notes which follow this table. Example: replicate-wild-do-table=foo%.bar% will replicate only updates that uses a table in any databases that start with foo and whose table names start with bar. Note that if you do replicate-wild-do-table=foo%.% then the rule will be propagated to CREATE DATABASE and DROP DATABASE, that is, these two statements will be replicated if the database name matches the database pattern ('foo%' here) (this magic is triggered by '%' being the table pattern).
replicate-wild-ignore-table=db_name.table_name Tells the slave thread to not replicate a query where any table matches the given wildcard pattern. To specify more than one table to ignore, use the directive multiple times, once for each table. This will work for cross-database updates. Please read notes which follow this table. Example: replicate-wild-ignore-table=foo%.bar% will not do updates to tables in databases that start with foo and whose table names start with bar. Note that if you do replicate-wild-ignore-table=foo%.% then the rule will be propagated to CREATE DATABASE and DROP DATABASE, that is, these two statements will not be replicated if the database name matches the database pattern ('foo%' here) (this magic is triggered by '%' being the table pattern).
replicate-do-db=database_name Tells the slave to restrict replication to commands where the current database (that is, the one selected by USE) is database_name. To specify more than one database, use the directive multiple times, once for each database. Note that this will not replicate cross-database queries such as UPDATE some_db.some_table SET foo='bar' while having selected a different or no database. If you need cross database updates to work, make sure you have 3.23.28 or later, and use replicate-wild-do-table=db_name.%. Please read notes which follow this table. Example: replicate-do-db=some_db. Example of what does not work as you could expect it: if the slave is started with replicate-do-db=sales, and you do USE prices; UPDATE sales.january SET amount=amount+1000;, this query will not be replicated. If you need cross database updates to work, use replicate-wild-do-table=db_name.% instead. The main reason for this "just-check-the-current-database" behaviour is that it's hard from the command alone to know if a query should be replicated or not ; for example if you are using multi-table-delete or multi-table-update commands that go across multiple databases. It's also very fast to just check the current database.
replicate-ignore-db=database_name Tells the slave to not replicate any command where the current database (that is, the one selected by USE) is database_name. To specify more than one database to ignore, use the directive multiple times, once for each database. You should not use this directive if you are using cross table updates and you don't want these update to be replicated. Please read notes which follow this table. Example: replicate-ignore-db=some_db. Example of what does not work as you could expect it: if the slave is started with replicate-ignore-db=sales, and you do USE prices; UPDATE sales.january SET amount=amount+1000;, this query will be replicated. If you need cross database updates to work, use replicate-wild-ignore-table=db_name.% instead.
replicate-rewrite-db=from_name-to_name Tells the slave to translate the current database (that is, the one selected by USE) to to_name if it was from_name on the master. Only statements involving tables may be affected (CREATE DATABASE, DROP DATABASE won't), and only if from_name was the current database on the master. This will not work for cross-database updates. Note that the translation is done before replicate-* rules are tested. Example: replicate-rewrite-db=master_db_name-slave_db_name
report-host=host Available after 4.0.0. Hostname or IP of the slave to be reported to the master during slave registration. Will appear in the output of SHOW SLAVE HOSTS. Leave unset if you do not want the slave to register itself with the master. Note that it is not sufficient for the master to simply read the IP of the slave off the socket once the slave connects. Due to NAT and other routing issues, that IP may not be valid for connecting to the slave from the master or other hosts. For the moment this option has no real interest ; it is meant for failover replication which is not implemented yet. Example: report-host=slave1.mycompany.com
report-port=portnumber Available after 4.0.0. Port for connecting to slave reported to the master during slave registration. Set it only if the slave is listening on a non-default port or if you have a special tunnel from the master or other clients to the slave. If not sure, leave this option unset. For the moment this option has no real interest ; it is meant for failover replication which is not implemented yet.
skip-slave-start Tells the slave server not to start the slave threads on server startup. The user can start them later with START SLAVE.
slave_compressed_protocol=# If 1, then use compression on the slave/client protocol if both slave and master support this.
slave-load-tmpdir=filename This option is by default equal to tmpdir. When the SQL slave replicates a LOAD DATA INFILE command, it extracts the to-be-loaded file from the relay log into temporary files, then loads these into the table. If the file loaded on the master was huge, the temporary files on the slave will be huge too; therefore you may wish/have to tell the slave to put the temporary files on some large disk different from tmpdir, using this option. In that case, you may also use the relay-log option, as relay logs will be huge too.
slave-net-timeout=# Number of seconds to wait for more data from the master before aborting the read, considering the connection broken and retrying to connect, first time immediately, then every master-connect-retry seconds.
slave-skip-errors= [err_code1,err_code2,... | all] Tells the slave SQL thread to continue replication when a query returns an error from the provided list. Normally, replication will discontinue when an error is encountered, giving the user a chance to resolve the inconsistency in the data manually. Do not use this option unless you fully understand why you are getting the errors. If there are no bugs in your replication setup and client programs, and no bugs in MySQL itself, you should never get an abort with error. Indiscriminate use of this option will result in slaves being hopelessly out of sync with the master and you having no idea how the problem happened. For error codes, you should use the numbers provided by the error message in your slave error log and in the output of SHOW SLAVE STATUS. A full list of error messages can be found in the source distribution in Docs/mysqld_error.txt. You can (but should not) also use a very non-recommended value of all which will ignore all error messages and keep barging along regardless. Needless to say, if you use it, we make no promises regarding your data integrity. Please do not complain if your data on the slave is not anywhere close to what it is on the master in this case - you have been warned. Example: slave-skip-errors=1062,1053 or slave-skip-errors=all

Some of these options, like all replicate-* options, can only be set at the slave server's startup, not on-the-fly. We plan to fix this.

Here is the order of evaluation of the replicate-* rules, to decide if the query is going to be executed by the slave or ignored by it:

  1. Are there some replicate-do-db or replicate-ignore-db rules?

    • Yes: test them like for binlog-do-db and binlog-ignore-db (Section 4.9.4, “The Binary Log ”). What is the result of the test?

      • ignore the query: ignore it and exit.

      • execute the query: don't execute it immediately, defer the decision, go to step below.

    • No: go to step below.

  2. Are there some replicate-*-table rules?

    • No: execute the query and exit.

    • Yes: go to step below. Only tables which are to be updated will be compared to rules (INSERT INTO sales SELECT * from prices: only sales will be compared to rules). If several tables are to be updated (multi-table statement), the first matching table (matching "do" or "ignore") wins (i.e. the first table is compared to rules, then if no decision could be taken the second table is compared to rules etc).

  3. Are there some replicate-do-table rules?

    • Yes: does the table match any of them?

      • Yes: execute the query and exit.

      • No: go to step below.

    • No: go to step below.

  4. Are there some replicate-ignore-table rules?

    • Yes: does the table match any of them?

      • Yes: ignore the query and exit.

      • No: go to step below.

    • No: go to step below.

  5. Are there some replicate-wild-do-table rules?

    • Yes: does the table match any of them?

      • Yes: execute the query and exit.

      • No: go to step below.

    • No: go to step below.

  6. Are there some replicate-wild-ignore-table rules?

    • Yes: does the table match any of them?

      • Yes: ignore the query and exit.

      • No: go to step below.

    • No: go to step below.

  7. No replicate-*-table rule was matched. Is there another table to test against these rules?

    • Yes: loop.

    • No: we have tested all tables to be updated, could not match any rule. Are there replicate-do-table or replicate-wild-do-table rules ?

      • Yes: ignore the query and exit.

      • No: execute the query and exit.

4.10.7. SQL Commands Related to Replication

Replication can be controlled through the SQL interface. Here is the summary of commands. Near each command you will find "(Slave)", meaning this command is issued on the slave, or "Master", meaning it is issued on the master.

4.10.7.1. START SLAVE (slave)

Starts the slave threads. Was called SLAVE START in MySQL 3.23. As of MySQL 4.0.2, you can add IO_THREAD or SQL_THREAD options to the statement to start only the I/O thread or the SQL thread. The I/O thread reads queries from the master server and stores them in the relay log. The SQL thread reads the relay log and executes the queries. Note that if START SLAVE succeeds in starting the slave threads it will return without any error. But even in that case it might be that slave threads start and then later stop (because they don't manage to connect to the master or read his binlogs or any other problem). START SLAVE will not warn you about this, you have to check your slave's .err file for error messages generated by the slave threads, or check that these are running fine with SHOW SLAVE STATUS.

4.10.7.2. STOP SLAVE (slave)

Stops the slave threads. Was called SLAVE STOP in MySQL 3.23. Like SLAVE START, this statement may be used with IO_THREAD and SQL_THREAD options.

4.10.7.3. SET SQL_LOG_BIN=0|1 (master)

Disables/enables binary logging for the user's connection (SQL_LOG_BIN is a session variable) if the user has the SUPER privilege. Ignored otherwise.

4.10.7.4. SET GLOBAL SQL_SLAVE_SKIP_COUNTER=n (slave)

Skip the next n events from the master. Only valid when the slave thread is not running, otherwise, gives an error. Useful for recovering from replication stops caused by a statement.

4.10.7.5. RESET MASTER (master)

Deletes all binary logs listed in the index file, resetting the binlog index file to be empty. Previously named FLUSH MASTER.

4.10.7.6. RESET SLAVE (slave)

Makes the slave forget its replication position in the master's binlogs, deletes the master.info and relay-log.info files, all relay logs, starts a new relay log. Note: relay logs which had not been totally executed by the SQL slave thread (which are likely to exist if you issued STOP SLAVE in an highly-loaded replication slave) are also deleted. Connection information (master host, master port, master user, master password) is immediately reset to the values specified in startup options (master-host etc) if there were some. Previously named FLUSH SLAVE.

4.10.7.7. LOAD TABLE tblname FROM MASTER (slave)

Downloads a copy of the table from master to the slave. Implemented mainly for debugging of LOAD DATA FROM MASTER. Requires that the replication user which is used to connect to the master has RELOAD and SUPER privileges on the master, and SELECT on the master table to load. On the slave's side, the user which issues LOAD TABLE FROM MASTER should have grants to drop and create the table. Please read the timeout notes in the description of LOAD DATA FROM MASTER below, they apply here too.

4.10.7.8. LOAD DATA FROM MASTER (slave)

Takes a snapshot of the master and copies it to the slave. Updates the values of MASTER_LOG_FILE and MASTER_LOG_POS so that the slave will start replicating from the correct position. Will honor table and database exclusion rules specified with replicate-* options. So far works only with MyISAM tables and acquires a global read lock on the master while taking the snapshot. In the future it is planned to make it work with InnoDB tables and to remove the need for global read lock using the non-blocking online backup feature.

If you are loading big tables, you may have to increase the values of net_read_timeout and net_write_timeout on both your master and slave ; see Section 4.5.7.4, “SHOW VARIABLES”.

Note that LOAD DATA FROM MASTER does NOT copy any tables from the mysql database. This is to make it easy to have different users and privileges on the master and the slave.

Requires that the replication user which is used to connect to the master has RELOAD and SUPER privileges on the master, SELECT privileges on all master's tables you want to load. All master's tables on which the user has no SELECT privilege will be ignored by LOAD DATA FROM MASTER; this is because the master will hide them to the user: LOAD DATA FROM MASTER calls SHOW DATABASES to know the master databases to load, but SHOW DATABASES returns only databases on which the user has some privilege, Section 4.5.7.1, “Retrieving information about Database, Tables, Columns, and Indexes ”. On the slave's side, the user which issues LOAD DATA FROM MASTER should have grants to drop and create the involved databases and tables.

4.10.7.9. CHANGE MASTER TO master_def_list (slave)

CHANGE MASTERis a "brutal" command, it is recommended to read this whole description before using it in production.

Changes the master parameters (connection and binlog information) to the values specified in master_def_list. master_def_list is a comma-separated list of master_def where master_def is one of the following: MASTER_HOST, MASTER_USER, MASTER_PASSWORD, MASTER_PORT, MASTER_CONNECT_RETRY, MASTER_LOG_FILE, MASTER_LOG_POS, RELAY_LOG_FILE, RELAY_LOG_POS (these last two only starting from MySQL 4.0). For example:

CHANGE MASTER TO
  MASTER_HOST='master2.mycompany.com',
  MASTER_USER='replication',
  MASTER_PASSWORD='bigs3cret',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='master2-bin.001',
  MASTER_LOG_POS=4,
  MASTER_CONNECT_RETRY=10;
CHANGE MASTER TO
  RELAY_LOG_FILE='slave-relay-bin.006',
  RELAY_LOG_POS=4025;

You only need to specify the values that need to be changed. The values that you omit will stay the same with the exception of when you specify (not necessarily change) the host or port. In that case, the slave will assume that the master is different from before. Therefore, the old values of master's binlog name and position are considered no longer applicable, thus if you didn't specify MASTER_LOG_FILE and MASTER_LOG_POS in the command, MASTER_LOG_FILE=" and MASTER_LOG_POS=4 will silently be appended to it.

MASTER_LOG_FILE and MASTER_LOG_POS are the coordinates from which the I/O slave thread will start reading from the master, next time this thread is started. As CHANGE MASTER deletes relay logs (see below), they are also the coordinates from which the SQL slave thread will start executing next time it is started.

CHANGE MASTERdeletes all relay logs (and starts a new one), unless you specified RELAY_LOG_FILE or RELAY_LOG_POS (in that case relay logs will be kept; since MySQL 4.1.1 the RELAY_LOG_PURGE global variable will silently be set to 0). CHANGE MASTER updates master.info and relay-log.info.

Note: if, just before you issue CHANGE MASTER, the SQL slave thread is late by one or more queries compared to the I/O thread (a very common case when replication is running in high-load environments), then as CHANGE MASTER deletes relay logs containing these non-executed queries, and so replication then restarts from the coordinates of the I/O thread, the SQL thread will have "lept" over the non-executed queries. Therefore, unless these queries were not important, you should, before issuing CHANGE MASTER, either:

  • ensure that the SQL slave thread has read all existing relay logs; you can achive this by stopping only the I/O slave thread (STOP SLAVE IO_THREAD), then monitoring the progress of the running SQL slave thread with SHOW SLAVE STATUS and SELECT MASTER_POS_WAIT(), until it has caught up. This way there will be no leap for the SQL slave thread.

  • run STOP SLAVE, check where the SQL slave thread is in the master's binlog (using SHOW SLAVE STATUS, columns Relay_Master_Log_File and Exec_master_log_pos), and add a specification of these coordinates to the CHANGE MASTER command (MASTER_LOG_FILE=..., MASTER_LOG_POS=...). This way, you will instruct the I/O slave thread to start replication from the former coordinates of the SQL slave thread, so there will be no leap for the SQL slave thread.

If you don't take care of this issue, even a simple STOP SLAVE; CHANGE MASTER TO MASTER_USER='repl'; START SLAVE; run in the middle of an highly-loaded replication could break this replication and spoil the slave's data.

CHANGE MASTER is useful for setting up a slave when you have the snapshot of the master and have recorded the log and the offset on the master that the snapshot corresponds to. You can run CHANGE MASTER TO MASTER_LOG_FILE='log_name_on_master', MASTER_LOG_POS=log_offset_on_master on the slave after restoring the snapshot.

The first example above (CHANGE MASTER TO MASTER_HOST='master2.mycompany.com' etc) changes the master and master's binlog coordinates. This is when you want the slave to replicate the master. The second example, less frequently used, is when the slave has relay logs which, for some reason, you want the slave to execute again; to do this the master needn't be reachable, you just have to do CHANGE MASTER TO and start the SQL thread (START SLAVE SQL_THREAD). You can even use this out of a replication setup, on a standalone, slave-of-nobody server, to recover after a crash. Suppose your server has crashed and you have restored a backup. You want to replay the server's own binlogs (not relay logs, but regular binary logs), supposedly named myhost-bin.*. First make a backup copy of these binlogs in some safe place, in case you don't exactly follow the procedure below and accidentally have the server purge the binlogs. If using MySQL 4.1.1 or newer, do SET GLOBAL RELAY_LOG_PURGE=0 for additional safety. Then start the server without log-bin, with a new (different from before) server id, with relay-log=myhost-bin (to make the server believe that these regular binlogs are relay logs) and skip-slave-start, then issue

CHANGE MASTER TO RELAY_LOG_FILE='myhost-bin.153',RELAY_LOG_POS=410, MASTER_HOST='some_dummy_string';
START SLAVE SQL_THREAD;

Then the server will read and execute its own binlogs, thus achieving crash recovery. Once the recovery is finished, run STOP SLAVE, shutdown the server, delete master.info and relay-log.info, and restart the server with its original options. For the moment, specifying MASTER_HOST (even with a dummy value) is compulsory to make the server think he is a slave, and giving the server a new, different from before, server id is also compulsory otherwise the server will see events with its id and think it is in a circular replication setup and skip the events, which is unwanted. In the future we plan to add options to get rid of these small constraints.

4.10.7.10. MASTER_POS_WAIT() (slave)

This is not a command but a function, used to ensure that the slave has reached (read and executed up to) a given position in the master's binlog; see Section 6.3.6.2, “Miscellaneous Functions ” for a full description.

4.10.7.11. SHOW MASTER STATUS (master)

Provides status information on the binlog of the master.

4.10.7.12. SHOW SLAVE HOSTS (master)

Gives a listing of slaves currently registered with the master.

4.10.7.13. SHOW SLAVE STATUS (slave)

Provides status information on essential parameters of the slave threads (Slave). If you type it in the mysql client, you can put a \G instead of a semicolon at the end, to get a vertical, more readable layout:

SLAVE show slave status\G
*************************** 1. row ***************************
          Master_Host: localhost
          Master_User: root
          Master_Port: 3306
        Connect_retry: 3
      Master_Log_File: gbichot-bin.005
  Read_Master_Log_Pos: 79
       Relay_Log_File: gbichot-relay-bin.005
        Relay_Log_Pos: 548
Relay_Master_Log_File: gbichot-bin.005
     Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
      Replicate_do_db:
  Replicate_ignore_db:
           Last_errno: 0
           Last_error:
         Skip_counter: 0
  Exec_master_log_pos: 79
      Relay_log_space: 552
1 row in set (0.00 sec)
  • Master_Host the current master host.

  • Master_User the current user used to connect to the master.

  • Master_Port the current master port.

  • Connect_Retry the current value of master-connect-retry.

  • Master_Log_File the master's binlog in which the I/O thread is currently reading.

  • Read_Master_Log_Pos the position which the I/O thread has read up to in this master's binlog.

  • Relay_Log_File the relay log which the SQL thread is currently reading and executing.

  • Relay_Log_Pos the position which the SQL thread has read and executed up to in this relay log.

  • Relay_Master_Log_File the master's binlog which contains the last event executed by the SQL thread.

  • Slave_IO_Running tells whether the I/O thread is started or not.

  • Slave_SQL_Running tells whether the SQL thread is started or not.

  • Replicate_do_db / Replicate_ignore_db the lists of the databases which have been specified with option replicate-do-db / replicate-ignore-db; starting from version 4.1, options replicate_*_table are also displayed in four more columns.

  • Last_errno the error number returned by the lastly executed query (should be 0).

  • Last_error the error message returned by the lastly executed query (should be empty); if not empty, you will find this message in the slave's error log too. For example:

    Last_errno: 1051
    Last_error: error 'Unknown table 'z'' on query 'drop table z'
    

    Here the table 'z' existed on the master and was dropped there, but it did not exist on the slave (the user had forgotten to copy it to the slave when setting the slave up), so DROP TABLE failed on the slave.

  • Skip_counter the last used value for SQL_SLAVE_SKIP_COUNTER.

  • Exec_master_log_pos the position in the master's binlog (Relay_Master_Log_File) of the last event executed by the SQL thread. ((Relay_Master_Log_File,Exec_master_log_pos) in the master's binlog corresponds to (Relay_Log_File,Relay_Log_Pos) in the relay log).

  • Relay_log_space the total size of all existing relay logs.

4.10.7.14. SHOW MASTER LOGS (master)

Lists the binary logs on the master. You should use this command prior to PURGE MASTER LOGS to find out how far you should go.

4.10.7.15. SHOW BINLOG EVENTS (master)

SHOW BINLOG EVENTS [ IN 'logname' ] [ FROM pos ] [ LIMIT [offset,] rows ]

Shows the events in the binary log. If you do not specify 'logname', the first binary log will be displayed.

4.10.7.16. PURGE MASTER LOGS (master)

PURGE MASTER|BINARY LOGS TO 'logname' ; PURGE MASTER|BINARY LOGS BEFORE 'date'

The BEFORE variant is available in MySQL 4.1; its date argument can be in format 'YYYY-MM-DD HH:MI:SS'. MASTER and BINARY are here synonyms. Deletes all the binary logs that are listed in the log index as being strictly prior to the specified log or date, and removes them from the log index, so that the given log now becomes the first. Example:

PURGE MASTER LOGS TO 'mysql-bin.010' ;
PURGE MASTER LOGS BEFORE '2003-04-02 22:46:26' ;

This command will do nothing and fail with an error if you have an active slave that is currently reading one of the logs you are trying to delete. However, if you have a dormant slave, and happen to purge one of the logs it wants to read, the slave will be unable to replicate once it comes up. The command is safe to run while slaves are replicating - you do not need to stop them.

You must first check all the slaves with SHOW SLAVE STATUS to see which log they are on, then do a listing of the logs on the master with SHOW MASTER LOGS, find the earliest log among all the slaves (if all the slaves are up to date, this will be the last log on the list), backup all the logs you are about to delete (optional) and purge up to the target log.

4.10.8. Replication FAQ

Q: How do I configure a slave if the master is already running and I do not want to stop it?

A: There are several options. If you have taken a backup of the master at some point and recorded the binlog name and offset ( from the output of SHOW MASTER STATUS ) corresponding to the snapshot, do the following:

  • Make sure unique server id is assigned to the slave.

  • Execute CHANGE MASTER TO MASTER_HOST='master-host-name', MASTER_USER='master-user-name', MASTER_PASSWORD='master-pass', MASTER_LOG_FILE='recorded-log-name', MASTER_LOG_POS=recorded_log_pos

  • Execute SLAVE START

If you do not have a backup of the master already, here is a quick way to do it consistently:

  • FLUSH TABLES WITH READ LOCK

  • gtar zcf /tmp/backup.tar.gz /var/lib/mysql ( or a variation of this)

  • SHOW MASTER STATUS - make sure to record the output - you will need it later

  • UNLOCK TABLES

An alternative is taking an SQL dump of the master instead of a binary copy like above; for this you can use mysqldump -master-data on your master and later run this SQL dump into your slave. This is however slower than doing a binary copy.

No matter which of the two ways you used, afterwards follow the instructions for the case when you have a snapshot and have recorded the log name and offset. You can use the same snapshot to set up several slaves. As long as the binary logs of the master are left intact, you can wait as long as several days or in some cases maybe a month to set up a slave once you have the snapshot of the master. In theory the waiting gap can be infinite. The two practical limitations is the diskspace of the master getting filled with old logs, and the amount of time it will take the slave to catch up.

You can also use LOAD DATA FROM MASTER. This is a convenient command that will take a snapshot, restore it to the slave, and adjust the log name and offset on the slave all at once. In the future, LOAD DATA FROM MASTER will be the recommended way to set up a slave. Be warned, howerver, that the read lock may be held for a long time if you use this command. It is not yet implemented as efficiently as we would like to have it. If you have large tables, the preferred method at this time is still with a local tar snapshot after executing FLUSH TABLES WITH READ LOCK.

Q: Does the slave need to be connected to the master all the time?

A: No, it does not. You can have the slave go down or stay disconnected for hours or even days, then reconnect, catch up on the updates, and then disconnect or go down for a while again. So you can, for example, use master-slave setup over a dial-up link that is up only for short periods of time. The implications of that are that at any given time the slave is not guaranteed to be in sync with the master unless you take some special measures. In the future, we will have the option to block the master until at least one slave is in sync.

Q: How do I know how late the slave is compared to the master? In other words, how do I know the date of the last query replicated by the slave?

A: This is possible only if the SQL slave thread exists (that is, if it shows up in SHOW PROCESSLIST, Section 4.10.3, “Replication Implementation Details ”) (in MySQL 3.23: if the slave thread exists, that is, shows up in SHOW PROCESSLIST), and if it has executed at least one event from the master. Indeed, when the SQL slave thread executes an event read from the master, this thread modifies its own time to the event's timestamp (this is why TIMESTAMP is well replicated). So in the Time column in the output of SHOW PROCESSLIST, the number of seconds displayed for the SQL slave thread is the number of seconds between the timestamp of the last replicated event and the real time of the slave machine. You can use this to determine the date of the last replicated event. Note that if your slave has been disconnected from the master for one hour, then reconnects, you may immediately see Time values like 3600 for the SQL slave thread in SHOW PROCESSLIST... This would be because the slave is executing queries that are one hour old.

Q: How do I force the master to block updates until the slave catches up?

A: Execute the following commands:

  • Master: FLUSH TABLES WITH READ LOCK

  • Master: SHOW MASTER STATUS - record the log name and the offset

  • Slave: SELECT MASTER_POS_WAIT('recorded_log_name', recorded_log_offset) When the select returns, the slave is currently in sync with the master

  • Master: UNLOCK TABLES - now the master will continue updates.

Q: What issues should I be aware of when setting up two-way replication?

A: MySQL replication currently does not support any locking protocol between master and slave to guarantee the atomicity of a distributed (cross-server) update. In other words, it is possible for client A to make an update to co-master 1, and in the meantime, before it propagates to co-master 2, client B could make an update to co-master 2 that will make the update of client A work differently than it did on co-master 1. Thus when the update of client A will make it to co-master 2, it will produce tables that will be different from what you have on co-master 1, even after all the updates from co-master 2 have also propagated. So you should not co-chain two servers in a two-way replication relationship, unless you are sure that you updates can safely happen in any order, or unless you take care of mis-ordered updates somehow in the client code.

You must also realise that two-way replication actually does not improve performance very much, if at all, as far as updates are concerned. Both servers need to do the same amount of updates each, as you would have one server do. The only difference is that there will be a little less lock contention, because the updates originating on another server will be serialised in one slave thread. This benefit, though, might be offset by network delays.

Q: How can I use replication to improve performance of my system?

A: You should set up one server as the master, and direct all writes to it, and configure as many slaves as you have the money and rackspace for, distributing the reads among the master and the slaves. You can also start the slaves with -skip-bdb, -low-priority-updates and -delay-key-write=ALL to get speed improvements for the slave. In this case the slave will use non-transactional MyISAM tables instead of BDB tables to get more speed.

Q: What should I do to prepare my client code to use performance-enhancing replication?

A: If the part of your code that is responsible for database access has been properly abstracted/modularised, converting it to run with the replicated setup should be very smooth and easy - just change the implementation of your database access to read from some slave or the master, and to always write to the master. If your code does not have this level of abstraction, setting up a replicated system will give you an opportunity/motivation to it clean up. You should start by creating a wrapper library /module with the following functions:

  • safe_writer_connect()

  • safe_reader_connect()

  • safe_reader_query()

  • safe_writer_query()

safe_ means that the function will take care of handling all the error conditions.

You should then convert your client code to use the wrapper library. It may be a painful and scary process at first, but it will pay off in the long run. All applications that follow the above pattern will be able to take advantage of one-master/many slaves solution. The code will be a lot easier to maintain, and adding troubleshooting options will be trivial. You will just need to modify one or two functions, for example, to log how long each query took, or which query, among your many thousands, gave you an error. If you have written a lot of code already, you may want to automate the conversion task by using Monty's replace utility, which comes with the standard distribution of MySQL, or just write your own Perl script. Hopefully, your code follows some recognisable pattern. If not, then you are probably better off rewriting it anyway, or at least going through and manually beating it into a pattern.

Note that, of course, you can use different names for the functions. What is important is having unified interface for connecting for reads, connecting for writes, doing a read, and doing a write.

Q: When and how much can MySQL replication improve the performance of my system?

A: MySQL replication is most beneficial for a system with frequent reads and not so frequent writes. In theory, by using a one master/many slaves setup you can scale by adding more slaves until you either run out of network bandwidth, or your update load grows to the point that the master cannot handle it.

In order to determine how many slaves you can get before the added benefits begin to level out, and how much you can improve performance of your site, you need to know your query patterns, and empirically (by benchmarking) determine the relationship between the throughput on reads (reads per second, or max_reads) and on writes max_writes) on a typical master and a typical slave. The example here will show you a rather simplified calculation of what you can get with replication for our imagined system.

Let's say our system load consists of 10% writes and 90% reads, and we have determined that max_reads = 1200 - 2 * max_writes, or in other words, our system can do 1200 reads per second with no writes, our average write is twice as slow as average read, and the relationship is linear. Let us suppose that our master and slave are of the same capacity, and we have N slaves and 1 master. Then we have for each server (master or slave):

reads = 1200 - 2 * writes (from bencmarks)

reads = 9* writes / (N + 1) (reads split, but writes go to all servers)

9*writes/(N+1) + 2 * writes = 1200

writes = 1200/(2 + 9/(N+1)

So if N = 0, which means we have no replication, our system can handle 1200/11, about 109 writes per second (which means we will have 9 times as many reads due to the nature of our application).

If N = 1, we can get up to 184 writes per second.

If N = 8, we get up to 400.

If N = 17, 480 writes.

Eventually as N approaches infinity (and our budget negative infinity), we can get very close to 600 writes per second, increasing system throughput about 5.5 times. However, with only 8 servers, we increased it almost 4 times already.

Note that our computations assumed infinite network bandwidth, and neglected several other factors that could turn out to be significant on your system. In many cases, you may not be able to make a computation similar to the one above that will accurately predict what will happen on your system if you add N replication slaves. However, answering the following questions should help you decided whether and how much, if at all, the replication will improve the performance of your system:

  • What is the read/write ratio on your system?

  • How much more write load can one server handle if you reduce the reads?

  • How many slaves do you have bandwidth for on your network?

Q: How can I use replication to provide redundancy/high availability?

A: With the currently available features, you would have to set up a master and a slave (or several slaves), and write a script that will monitor the master to see if it is up, and instruct your applications and the slaves of the master change in case of failure. Some suggestions:

  • To tell a slave to change the master use the CHANGE MASTER TO command.

  • A good way to keep your applications informed as to the location of the master is by having a dynamic DNS entry for the master. With bind you can use nsupdate to dynamically update your DNS.

  • You should run your slaves with the -log-bin option and without -log-slave-updates. This way the slave will be ready to become a master as soon as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO on the other slaves. For example, consider you have the following setup ("M" means the master, "S" the slaves, "WC" the clients which issue database writes and reads; clients which issue only database reads are not represented as they don't need to switch):

           WC
            \
             v
     WC---- M
           / | \
          /  |  \
         v   v   v
        S1   S2  S3
    

    S1 (like S2 and S3) is a slave running with -log-bin and without -log-slave-updates. As the only writes executed on S1 are those replicated from M, the binary log on S1 is empty (remember S1 runs without -log-slave-updates). Then, for some reason, M becomes unavailable, and you want S1 to become the new master (i.e. direct all WC to S1, and make S2 and S3 replicate S1). No WC accesses M. Instruct all WC to direct their queries to S1. From now on, all queries sent by WC to S1 are written to the binary log of S1. The binary log of S1 contains exactly every writing query sent to S1 since M died. On S2 (and S3) do STOP SLAVE, CHANGE MASTER TO MASTER_HOST='S1' (where 'S1' is replaced by the real hostname of S1). To CHANGE MASTER, add all information about how to connect to S1 from S2 or S3 (user, password, port). In CHANGE MASTER, no need to specify the name of S1's binary log or binary log position to read from: we know it is the first binary log, from position 4, and these are the defaults of CHANGE MASTER. Finally do START SLAVE on S2 and S3, and now you have this:

           WC
          /
          |
     WC   |  M(unavailable)
      \   |
       \  |
        v v
         S1--S2  S3
          ^       |
          +-------+
    

    When M is up again, you just have to issue on it the same CHANGE MASTER as the one issued on S2 and S3, so that M becomes a slave of S1 and picks all the WC writes it has missed while it was down. Now to make M a master again (because it is the most powerful machine for example), follow the procedure like if S1 was unavailable and M was to be the new master; then during the procedure don't forget to run RESET MASTER on M before making S1, S2, S3 slaves of M, or they may pick old WC writes from before M's unavailibility.

We are currently working on integrating an automatic master election system into MySQL, but until it is ready, you will have to create your own monitoring tools.

4.10.9. Troubleshooting Replication

If you have followed the instructions, and your replication setup is not working, first check the following:

  • Is the master logging to the binary log? Check with SHOW MASTER STATUS. If it is, Position will be non-zero. If not, verify that you have given the master log-bin option and have set server-id.

  • Is the slave running? Do SHOW SLAVE STATUS and check that the Slave_IO_Running and Slave_SQL_Running are both "Yes". If not, verify slave options

  • Check the error log for messages. Many users have lost time by not doing this early enough.

  • If the slave is running, did it establish connection with the master? Do SHOW PROCESSLIST, find the I/O and SQL threads (Section 4.10.3, “Replication Implementation Details ” to see how they display), and check their State column. If it says Connecting to master, verify the privileges for the replication user on the master, master host name, your DNS setup, whether the master is actually running, whether it is reachable from the slave.

  • If the slave was running, but then stopped: it usually happens when some query that succeeded on the master fails on the slave. This should never happen if you have taken a proper snapshot of the master, and never modify the data on the slave outside of the slave thread. If it does, it is a bug, read below on how to report it.

  • If a query on that succeeded on the master refuses to run on the slave, and a full database resync (that is, delete the slave's database and copy a new snapshot from the master) does not seem feasible, try the following:

    • First see if the slave's table was different from the master's. Understand how it happened (it may be a bug: read the Changelogs in the online MySQL manual http://www.mysql.com/documentation to check if this is a known bug and if it is fixed yet). Then make the slave's table identical to the master's and run SLAVE START.

    • If the above does not work or does not apply, try to understand if it would be safe to make the update manually (if needed) and then ignore the next query from the master.

    • If you have decided you can skip the next query, do SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; SLAVE START; to skip a query that does not use AUTO_INCREMENT or LAST_INSERT_ID(), or SET GLOBAL SQL_SLAVE_SKIP_COUNTER=2; SLAVE START; otherwise. The reason queries that use AUTO_INCREMENT or LAST_INSERT_ID() are different is that they take two events in the binary log of the master.

    • Make sure you are not running into an old bug by upgrading to the most recent version.

    • If you are sure the slave started out perfectly in sync with the master, and no one has updated the tables involved outside of slave thread, report the bug.

When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, it is time to send us a bug report. We need to get as much information as possible from you to be able to track down the bug. Please do spend some time and effort preparing a good bug report.

If you have a repeatable way to demonstrate the bug, use mysqlbug to prepare a bug report and enter it into our bugs database at http://bugs.mysql.com/. If you have a phantom - a problem that does occur but you cannot duplicate "at will" - fortunately this rarely happens:

  • Verify that there is no user error involved. For example, if you update the slave outside of the slave thread, the data will be out of sync, and you can have unique key violations on updates, in which case the slave thread will stop and wait for you to clean up the tables manually to bring them in sync.

  • Run slave with log-slave-updates and log-bin - this will keep a log of all updates on the slave.

  • Save all evidence before resetting the replication. If we have no or only sketchy information, it would take us a while to track down the problem. The evidence you should collect is:

    • All binary logs on the master

    • All binary log on the slave

    • The output of SHOW MASTER STATUS on the master at the time you have discovered the problem

    • The output of SHOW SLAVE STATUS on the master at the time you have discovered the problem

    • Error logs on the master and on the slave

  • Use mysqlbinlog to examine the binary logs. The following should be helpful to find the trouble query, for example:

    mysqlbinlog -j pos_from_slave_status /path/to/log_from_slave_status | head
    

Once you have collected the evidence on the phantom problem, try hard to isolate it into a separate test case first. Then enter the problem into our bugs database at http://bugs.mysql.com/ with as much information as possible.

freelance web developer India web development india website designer | Software developer India