Bug #36763 TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
Submitted: 16 May 2008 18:45 Modified: 16 Feb 2009 9:31
Reporter: Tim Clark Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.24, 6.0 BK OS:Any
Assigned to: Mats Kindahl
Tags: innodb, read uncommitted, replication, row based, truncate
Triage: Triaged: D2 (Serious) / R2 (Low) / E2 (Low)

[16 May 2008 18:45] Tim Clark
Description:
There may be a couple of bugs (or at least documentation issues) here. I'm not completely sure what the correct behavior is supposed to be.

In the process of developing a storage engine that only supports row-based replication, I discovered that TRUNCATE TABLE is not replicated for tables belonging to that storage engine when the binlog_format is "mixed." However, this is not reported to the user (as it would be if the binlog_format were "statement"). The TRUNCATE succeeds without error or warning on the master but is never replicated to the slave.

How to repeat:
This can be reproduced with InnoDB by setting the transaction isolation level to READ UNCOMMITTED, since that forces InnoDB to not support statement-based logging. However, the same behavior can be seen on any storage engine that only supports row-based logging, regardless of the isolation level.

Assuming that master/slave servers have been set up for replication, the following sequence will reproduce the problem.

On the master:
mysql> set binlog_format=mixed;
Query OK, 0 rows affected (0.00 sec)

mysql> create table t3 (i int) engine=innodb;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t3 values(1),(2),(3);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> set transaction isolation level read uncommitted;
Query OK, 0 rows affected (0.00 sec)

mysql> truncate table t3;
Query OK, 3 rows affected (0.00 sec)

mysql> select * from t3;
Empty set (0.01 sec)

Now on the slave, which should also return the empty set:
mysql> select * from t3;
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)

Suggested fix:
I suspect that row-logging-only storage engines should not be permitted to work in a binlog_format=MIXED environment. Therefore, I would expect to see an error as I would when binlog_format=STATEMENT: "ERROR 1598 (HY000): Binary logging not possible. Message: Statement-based format required for this statement, but not allowed by this combination of engines"
[16 May 2008 21:22] Sveta Smirnova
Thank you for the report.

Verified as described.
[20 May 2008 6:42] Mats Kindahl
Since TRUNCATE is equivalent to DROP + CREATE, I would say that this should be treated as a DDL and, in other words, replicated as a statement.

Note that there is a difference in behavior between DELETE FROM with no where condition and TRUNCATE in this case: DELETE FROM will remove all rows that were present on the master, while TRUNCATE will drop and recreate the table to empty it.
[26 May 2008 19:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47063

ChangeSet@1.2647, 2008-05-26 21:43:15+02:00, mats@mats-laptop.(none) +3 -0
  Bug #36763:
  TRUNCATE TABLE fails to replicate when stmt-based binlogging is
  not supported.
  
  When executing a TRUNCATE TABLE under MIXED mode, the TRUNCATE statement was
  not written to the binary log.  The reason for this was that the statement
  was assumed to be a "row query type", meaning that the statement is logged
  only if the current statement is not row-based.  Since InnoDB forces row-based
  logging for statements in READ UNCOMMITTED isolation level, this caused the
  statement to not be logged.
  
  The problem is fixed by deliberately marking TRUNCATE TABLE statements to be
  "statement query types", meaning that they are always logged as statements.
[23 Jun 2008 9:41] Mats Kindahl
The bugs system did not pick up the patch, which is available as:

   http://lists.mysql.com/commits/48151

Bug #36763: TRUNCATE TABLE fails to replicate when stmt-based binlogging is
            not supported.
        
When executing a TRUNCATE TABLE under MIXED mode, the TRUNCATE statement was
not written to the binary log.  The reason for this was that the statement
was assumed to be a "row query type", meaning that the statement is logged
only if the current statement is not row-based.  Since InnoDB forces row-based
logging for statements in READ UNCOMMITTED isolation level, this caused the
statement to not be logged.
        
The problem is fixed by deliberately marking TRUNCATE TABLE statements to be
"statement query types", meaning that they are always logged as statements.
[7 Jul 2008 23:23] Dewey Gaedcke
When running RBR only on 5.1.25-rc-log, we have db named db1 excluded from bin logging on the master.

All dml (insert, update and delete) statements are properly missing from the log and therefor not replicating.

But the "truncate" statement IS being logged (even though the table exists in an excluded DB) and is trying to replicate.   We believe that either:
1) your patch forgot to consider the "exclude" from bin-log list
 or
2) because the "truncate" is running in the context of a stored procedure (owned by a different schema) and no "use db1;" statement exists, the "exclude" filter is not being respected

One other detail to complicate matters:
in our case, the table is a temporary table and does not exist on the slave ;-)
[2 Feb 2009 10:20] Mats Kindahl
Dewey,

This is a different problem. Could you please report it as a separate bug and provide the exact statements to re-create the problem?

I can see several reasons to why you have this failure, so having a proper test case would allow us to solve the issue.
[3 Feb 2009 12:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/64994

2748 Mats Kindahl	2009-02-03
      Bug #36763
      TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
      
      There were two separate problems with the code, both of which are fixed with
      this patch:
      
      1. An error was printed by InnoDB for TRUNCATE TABLE in statement mode when
         the in isolation levels READ COMMITTED and READ UNCOMMITTED since InnoDB
         does permit statement-based replication for DML statements. However,
         the TRUNCATE TABLE is not transactional, but is a DDL, and should therefore
         be allowed to be replicated as a statement.
      
      2. The statement was not logged in mixed mode because of the error above, but
         the error was not reported to the client.
      
      This patch fixes the problem by treating TRUNCATE TABLE a DDL, that is, it is
      always logged as a statement and not reporting an error from InnoDB for TRUNCATE
      TABLE.
[4 Feb 2009 10:17] Marko Mäkelä
Mats,

I'm not sure if it is safe to replicate TRUNCATE TABLE on the statement level. This is because in InnoDB, TRUNCATE TABLE will (on purpose) break any locks that other transactions may be holding on the table. What does Heikki say?

When it comes to the patch http://lists.mysql.com/commits/64994, I wouldn't call thd_sql_command(thd) on every code path in ha_innobase::external_lock(), but add the call right before the error printout, like this:

	if (lock_type == F_WRLCK)
	{
		ulong const binlog_format= thd_binlog_format(thd);
		ulong const tx_isolation = thd_tx_isolation(thd);
		if (tx_isolation <= ISO_READ_COMMITTED
		    && binlog_format == BINLOG_FORMAT_STMT
		    && thd_sql_command(thd) != SQLCOM_TRUNCATE)
		{
			char buf[256];
			my_snprintf(buf, sizeof(buf),

This follows the principle to perform the quickest and most discriminating checks first. Note that the thd accessor functions may be non-inlined function calls.
[4 Feb 2009 13:48] Heikki Tuuri
Mats,

replicating TRUNCATE is not safe at all, regardless if we use statement-based or row-based :(.

Suppose we have user A who does inserts to table t in a single transaction T1. And user B issues TRUNCATE meanwhile. Then in the master database, only the last INSERTs get into t.

But if MySQL writes transaction T1 to the binlog AFTER the TRUNCATE statement, then ALL the inserts end up in table t in the slave!

Possible fixes: 
1) Easy: block TRUNCATE when some other users has row locks on the table.
2) Hard: strip from the binlog those INSERTs that happened before the TRUNCATE.

The root cause for this bug is that TRUNCATE is not transactional.

Regards,

Heikki
[4 Feb 2009 13:54] Heikki Tuuri
Mats,

note that the same bug occurs if user B does DROP TABLE t; CREATE TABLE t ...

Few users will hit this replication bug because it is not good programming style to call DROP or TRUNCATE when other users are using the table!

Regards,

Heikki
[4 Feb 2009 16:54] Heikki Tuuri
I think there has been discussion about blocking DROP TABLE of a table that carries row locks. I cannot remember what was the conclusion. Of course, it is the spirit of a transactional database that you cannot use DROP TABLE to wipe out the changes of an active transaction.

Technically, MySQL could ask InnoDB to set an X-lock on the table to be TRUNCATEd or DROPped. I the thread is not holding any critical MySQL mutex at that time. When the thread gets the X-lock on the table, it can safely proceed to DROP or TRUNCATE.
[4 Feb 2009 16:58] Heikki Tuuri
If the thread is holding LOCK_open or some otehr critical MySQL mutex when it calls InnoDB's truncate or drop operation, then the fix is harder. We cannot hold a critical mutex during a potentially very long lock wait.

Also, if we implement blocking it is a major change in behavior. It should be introduced in a major new MySQL version.
[4 Feb 2009 18:51] Mats Kindahl
As far as I can see in the code, LOCK_open is held for tables that have the re-create flag set, which InnoDB has not. When doing a TRUNCATE by calling delete_all_rows(), which InnoDB does, no LOCK_open mutex is acquired.

DROP TABLE, on the other hand, acquires the LOCK_open before removing the table.
[4 Feb 2009 19:31] Mats Kindahl
For databases that are supposed to be on-line, it is probably hard to find a good time where the TRUNCATE can be safely executed. :)

I suppose, however, that it is comparably rare to execute TRUNCATE, so suppose that the best solution is to block TRUNCATE while the active threads executing INSERTs finish without seriously affecting performance. This will at least be correct.

Backpatching the binary log is close to impossible, and even more so under high load.
[6 Feb 2009 10:30] Mats Kindahl
Added new bug BUG#42643 for administering the parts that need to change inside InnoDB.
[6 Feb 2009 10:33] Marko Mäkelä
Mats, will you be posting another patch that does not change ha_innodb.cc? I wouldn't allow the replication of TRUNCATE TABLE until we have fixed Bug #42643 inside InnoDB.
[6 Feb 2009 10:55] Mats Kindahl
Marko,

Yes, I was planning to file a new patch that does not include the changes to ha_innodb.cc, but I assume that you are not proposing that we disable TRUNCATE TABLE for all storage engines, but rather hard-code an exception for InnoDB. However, the best place to add such an exception is still ha_innodb.cc.

Assuming that I file this new patch, it will generate an error when used as in BUG#42643, but it will allow the TRUNCATE TABLE to be written to the binary log in other situations. This basically leaves the situation the same as before, for InnoDB, but allow other engines that disallow statement-based replication to work correctly.

So, what do you suggest that I add to the patch?
[9 Feb 2009 10:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65585

2748 Mats Kindahl	2009-02-06
      Bug #36763
      TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
      
      There were two separate problems with the code, both of which are fixed with
      this patch:
      
      1. An error was printed by InnoDB for TRUNCATE TABLE in statement mode when
         the in isolation levels READ COMMITTED and READ UNCOMMITTED since InnoDB
         does permit statement-based replication for DML statements. However,
         the TRUNCATE TABLE is not transactional, but is a DDL, and should therefore
         be allowed to be replicated as a statement.
      
      2. The statement was not logged in mixed mode because of the error above, but
         the error was not reported to the client.
      
      This patch fixes the problem by treating TRUNCATE TABLE a DDL, that is, it is
      always logged as a statement and not reporting an error from InnoDB for TRUNCATE
      TABLE.
[10 Feb 2009 14:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65755

2750 Mats Kindahl	2009-02-10
      BUG#36763: TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
      
      Post-merge fixes. Changes to some result sets.
[10 Feb 2009 21:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/65808

2752 Mats Kindahl	2009-02-10
      Bug #36763
      TRUNCATE TABLE fails to replicate when stmt-based binlogging is not supported.
      
      Correcting some tests that was failing in pushbuild as well as fixing result
      file for some tests that are not executed in the default MTR run.
[11 Feb 2009 11:10] Bugs System
Pushed into 5.1.32 (revid:joro@sun.com-20090211100722-zyl44aq5z237efkd) (version source revid:mats@sun.com-20090210212637-dvfags7appmswv9w) (merge vers: 5.1.32) (pib:6)
[12 Feb 2009 10:15] Jon Stephens
Documented in the 5.1.32 changelog as follows:

        TRUNCATE statements failed to replicate when statement-based
        binary logging mode was not available. The issue was observed
        when using InnoDB with the transaction isolation level set to
        READ UNCOMMITTED (thus forcing InnoDB not to allow
        statement-based logging). However, the same behavior could be
        reproduced using any transactional storage engine supporting
        only row-based logging, regardless of the isolation level. This
        was due to two separate problems: 
        
        1. An error was printed by InnoDB for TRUNCATE when using 
        statement-based logging mode when the transaction isolation 
        level was set to READ COMMITTED or READ UNCOMMITTED, because 
        InnoDB permits statement-based replication for DML statements. 
        However, TRUNCATE is not transactional; since it is the 
        equivalent of DROP TABLE followed by CREATE TABLE, it is 
        actually DDL, and should therefore be allowed to be replicated 
        as a statement. 
        
        2. TRUNCATE was not logged in mixed mode because of the error 
        just described; however, this error was not reported to the 
        client. 
        
        As a result of this fix, TRUNCATE is now treated as DDL; that 
        is, it is always logged as a statement and so no longer causes 
        an error when replicated using a transactional storage engine 
        such as InnoDB.

Also updated http://dev.mysql.com/doc/refman/5.1/en/truncate.html and added a new subsection to "Replication Features and Issues" which will shortly appear online as http://dev.mysql.com/doc/refman/5.1/en/replication-features-truncate.html.

Set status as NDI waiting for push to 6.0 tree.
[12 Feb 2009 10:40] Mats Kindahl
After this patch, InnoDB still generates an error due to BUG#42643. However, this patch fixes the issue with TRUNCATE TABLE not being treated correctly in all cases.
[12 Feb 2009 11:11] Jon Stephens
Docs commit is here: http://lists.mysql.com/commits/66020
[12 Feb 2009 11:14] Jon Stephens
Amended changelog entry; added reference to Bug #42643.
[14 Feb 2009 13:02] Bugs System
Pushed into 6.0.10-alpha (revid:matthias.leich@sun.com-20090212211028-y72faag15q3z3szy) (version source revid:matthias.leich@sun.com-20090212211028-y72faag15q3z3szy) (merge vers: 6.0.10-alpha) (pib:6)
[16 Feb 2009 9:31] Jon Stephens
Also documented in the 6.0.10 changelog and 6.0 Manual. Closed.
[17 Feb 2009 15:01] Bugs System
Pushed into 5.1.32-ndb-6.3.23 (revid:tomas.ulin@sun.com-20090217131017-6u8qz1edkjfiobef) (version source revid:tomas.ulin@sun.com-20090216083519-az6m8y7x0olcur9c) (merge vers: 5.1.32-ndb-6.3.23) (pib:6)
[17 Feb 2009 16:49] Bugs System
Pushed into 5.1.32-ndb-6.4.3 (revid:tomas.ulin@sun.com-20090217134419-5ha6xg4dpedrbmau) (version source revid:tomas.ulin@sun.com-20090216083646-m8st11oj1hhfuuh5) (merge vers: 5.1.32-ndb-6.4.3) (pib:6)
[17 Feb 2009 18:25] Bugs System
Pushed into 5.1.32-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090217134216-5699eq74ws4oxa0j) (version source revid:tomas.ulin@sun.com-20090216083036-kmqvx97kxvq5k2f4) (merge vers: 5.1.32-ndb-6.2.17) (pib:6)