Bug #6256 Serial binary log execution blocks slave activity unnecessarily
Submitted: 25 Oct 2004 23:21 Modified: 10 Oct 2008 10:34
Reporter: Chris Tucker Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:All OS:Any (All)
Assigned to: CPU Architecture:Any

[25 Oct 2004 23:21] Chris Tucker
Description:
MySQL replication works by serializing all data-modifying queries executed on the master and running slaves that read and execute the resultant binary logs in a linear fashion.  As a result, when a slow data-modifying query is executed against the master, the slave falls behind in replication by the length of time it takes to complete executing that individual query.  Meanwhile, there may be many entirely unrelated data-modifying queries that are blocked from executing because they occur later in the binary log.  The end result is that the slave data becomes stale for the time it takes to execute the query.  This problem can be exarcebated by lock contention on a busy slave further slowing execution of a query that may execute relatively quickly on the master.

How to repeat:
Beyond theoretical understanding of the problem, it can be demonstrated easily by simply running a slow update query against a master and observing the expected resultant delay in any other queries executed.  Note, this is not being filed as a bug but as a feature request: this behaviour is expected and correct insofar as MySQL replication is presently defined.

Suggested fix:
Creative use of two existing features of MySQL (and relational databases in general) should make it possible to considerably improve on the current performance of replication under the conditions outlined above.  The obvious way to prevent one query from blocking the execution of other, later, queries is to take advantage of the concurrency MySQL already provides for multiple connections coming in to the server.  Doing this requires that we understand dependencies between queries, so we don't introduce data consistency problems by executing queries on the slave in an order that makes the results of those queries different than those on the master.  I believe the standard system of locks available in MySQL, and careful integration with InnoDB locking structures and transactional systems, provide a solution to this problem.

When a data-modifying query executes at present, it takes out an exclusive write lock on the data it is modifying.  In the case of InnoDB this is done on a row-by-row basis, while with MyISAM it operates on the table as a whole.  I do not currently think the semantics of locking necessarily prevent this solution from working, though they will have an impact on performance and complexity.  When a table/row is locked, no other process can read or modify the data in that table/row.  

We can use this to our advantage for replication by stating that whenever a replication statement executes it takes out a "replication WRITE lock" on the data it is changing; this lock must be enforced against all data that statement may touch or be impacted by (namely, all tables/rows it changes and reads).  My understanding of how locks are computed is limited, but presumably it is a relatively trivial task to determine at least what tables should be locked a priori.  As soon as a replication SQL execution thread determines and takes these locks, we take another thread to execute the next statement in the replication log and repeat the process.  Assuming it does not touch any of the data locked by an earlier replication statment, it can proceed to completion without any lock contention.  Any number of additional replication execution threads may be launched, up to a user-specified limit, each processing a separate statement.  Any thread that must contend for a lock with an earlier replication statement blocks waiting for that statement to complete, as we might expect in a normal locking situation.

The key concept here is that rather than executing queries in a temporally ordered fashion we execute in a data-dependency ordered fashion.  The temporal ordering of queries is essentially irrelevant to the correct operation of replication under the assumption that no later query can execute if it is in any way dependent on an earlier query.

There are many parts of this concept that need fleshing out.  The replication locking system could be refined considerably to appropriately deal with read vs. write locks (I stick with just write locks at this stage for simplicity's sake).  Transactional support will require more thought about how dependencies are computed (it would likely require all requirements of a transaction be computed a priori, which may be computationally expensive).  Proving correctness will be considerably more difficult than in the current system (indeed, it may be trivial to prove the system will perform INcorrect, in which case I shall have to rethink the idea).

Primarily, I should like to foster a discussion of these points and see if it is possible to solve this problem.  It is a considerable issue for high-load, high-availability sites, and appears to be an eminently solvable (if difficult) issue.
[10 Oct 2008 10:34] Susanne Ebrecht
Many thanks for writing a feature request.

This should be fixed by Row Based Replication. RBR is implemented in MySQL 5.1 just try our actual version MySQL 5.1-28-rc