Bug #19066 DELETE FROM replication inconsistency for NDB
Submitted: 12 Apr 2006 23:13 Modified: 7 Jul 2006 20:42
Reporter: Lars Thalmann
Status: Closed
Category: MySQL Server: Row Based Replication (RBR) Severity: S3 (Non-critical)
Version: 5.1 OS: Any (ALL)
Assigned to: Mats Kindahl CPU Architecture: Any

[12 Apr 2006 23:13] Lars Thalmann
Description:
Running DELETE FROM t while another thread runs INSERTs can leave the
table on the master different from the table on the slave.

The current implementation of "DELETE FROM t" uses statement-based
replication (SBR) to replicate the statement if the handler supports the
delete_all_rows() call (i.e. it does not give an error message).

The problem arises if the handler uses row-level locking.  Then delete_all_rows()
can, while deleting, be interleaved with a different thread that inserts into the
same table.  These inserted rows will still be present on the master after the
DELETE FROM finishes.

Replication would transport the INSERTs before the DELETE, so the result on the
slave will be that there are no rows in the table.

This does not match what we have on the master and that is not allowed.
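
The interleaving described above can be sketched with a toy model. This is not server code: plain Python lists stand in for the NDB table and the binary log, and the names run_master, apply_on_slave and the DELETE_ALL event are made up for illustration. It assumes the DELETE is logged as a single statement once it finishes, after the concurrent INSERT has already been logged.

```python
# Toy model only: lists stand in for the table and the binary log.
# run_master, apply_on_slave and "DELETE_ALL" are hypothetical names.

def run_master(initial_rows, insert_at, new_row):
    """Delete the initial rows one at a time; a concurrent 'thread' inserts
    new_row after insert_at rows have been deleted.
    Returns (master_table, binlog)."""
    table = list(initial_rows)
    binlog = []
    snapshot = list(table)  # rows the delete will visit
    for i, row in enumerate(snapshot):
        if i == insert_at:
            table.append(new_row)  # concurrent INSERT lands mid-delete
            binlog.append(("INSERT", new_row))
        table.remove(row)
    # Assumption: the DELETE is logged as one statement when it completes,
    # i.e. after the INSERT.
    binlog.append(("DELETE_ALL", None))
    return table, binlog

def apply_on_slave(initial_rows, binlog):
    """Replay the binlog events in order on a copy of the initial table."""
    table = list(initial_rows)
    for op, row in binlog:
        if op == "INSERT":
            table.append(row)
        elif op == "DELETE_ALL":
            table.clear()
    return table

rows = [1, 2, 3, 4]
master, binlog = run_master(rows, insert_at=2, new_row=99)
slave = apply_on_slave(rows, binlog)
print("master:", master, "slave:", slave)
```

In this model the master keeps the inserted row while the slave, replaying INSERT and then the statement-level delete, ends up with an empty table.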

How to repeat:
- Create a table t with many rows (in NDB)
- Start executing "DELETE FROM t" in one thread
- Execute "INSERT INTO t VALUES (...)" in another thread
- Wait for DELETE to complete.
- Now master still has the inserted row, while the slave does not

Suggested fix:
Add a flag by which the storage engine indicates that it does not allow smart delete replication (i.e. SBR)
[15 May 2006 13:20] Mats Kindahl
This bug will be solved in the following manner, depending on the value of the server variable BINLOG_FORMAT:

BINLOG_FORMAT=STATEMENT: Replication is always done statement-based.

BINLOG_FORMAT=MIXED: Replication will usually be done statement-based. There is an exception when a sub-expression (e.g., a call to a UDF) requires row-based replication.

BINLOG_FORMAT=ROW: Replication will always be done row-based. This implies that for engines that support delete_all_rows(), the full contents of the table will be written to the binary log prior to calling delete_all_rows() on the master.
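
The three cases above can be summarized as a small decision function. This is only a sketch of the rule as stated in this comment, not the actual server implementation: logging_mode, log_delete_all, the needs_row_based parameter and the event tuples are hypothetical names. It also assumes that "the full contents of the table" is modeled as one delete event per row.

```python
# Sketch of the described rule; all names here are hypothetical.

def logging_mode(binlog_format, needs_row_based=False):
    """Return 'statement' or 'row' for a given BINLOG_FORMAT value.
    needs_row_based models a sub-expression (e.g. a UDF call) that
    requires row-based replication."""
    fmt = binlog_format.upper()
    if fmt == "STATEMENT":
        return "statement"  # always statement-based
    if fmt == "MIXED":
        return "row" if needs_row_based else "statement"
    if fmt == "ROW":
        return "row"  # always row-based
    raise ValueError("unknown BINLOG_FORMAT: " + binlog_format)

def log_delete_all(binlog_format, table_rows):
    """Return the toy binlog events for 'DELETE FROM t'."""
    if logging_mode(binlog_format) == "row":
        # Assumption: one event per row present before delete_all_rows().
        return [("DELETE_ROW", row) for row in table_rows]
    return [("QUERY", "DELETE FROM t")]

print(log_delete_all("ROW", [1, 2, 3]))
print(log_delete_all("STATEMENT", [1, 2, 3]))
```

With row events, a row inserted concurrently on the master is not among the logged deletes, so the slave keeps it as well.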
[13 Jun 2006 19:11] Mats Kindahl
Patch pushed into Replication/Backup Team Tree
[20 Jun 2006 7:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7899
[20 Jun 2006 7:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7900
[21 Jun 2006 14:19] Lars Thalmann
This was pushed into 5.1.12
[7 Jul 2006 20:42] Mike Hillyer
Documented in 5.1.12 changelog.