Bug #43929 binlog corruption when max_binlog_cache_size is exceeded
Submitted: 28 Mar 2009 8:57 Modified: 13 Jul 2009 11:52
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.0.78,5.1.32 OS:Any
Assigned to: Alfranio Junior CPU Architecture:Any
Tags: max_binlog_cache_size

[28 Mar 2009 8:57] Shane Bester
Description:
If you set too low max_binlog_cache_size, or have too large transactions, then the binary log gets corrupted.   The following error message is usually seen also:

[ERROR]  10:38:50 [2022]  1784 - query failed (1197) - Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again: insert into t2 (data) values (' [...]

When trying to read the binary log, mysqlbinlog gives errors:

ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 157771618, event_type: 52
ERROR: Could not read entry at offset 15933805: Error in log format or read error.

How to repeat:
Start the server like this:

 ./bin/mysqld_safe --skip-name-resolve --log-bin --sync-binlog=0 --binlog_cache_size=4096 --max_binlog_cache_size=4096 --server-id=1 --innodb_flush_log_at_trx_commit=0

import the sql on the master:

mysql -uroot test --force <binlog.sql

check the resulting binlog, or watch if slave broke.
[28 Mar 2009 8:59] MySQL Verification Team
extract the zip file, then import the sql....

Attachment: bug43929_binlog.zip (application/zip, text), 110.08 KiB.

[28 Mar 2009 15:43] MySQL Verification Team
This is a very, very serious bug !!!

I suggest that it is fixed simply by ignoring this variable or by setting a minimum very high, for example at 1 Gb.
[28 Mar 2009 16:10] MySQL Verification Team
sinisa, then a workload of >1G of transactions will still break it.  It must be fixed properly, for any random size of the variable.  I will provide a randomized testcase if needed, to test the fix.
[30 Mar 2009 11:36] Mats Kindahl
Sinisa,

I agree with Shane: this bug has to be fixed properly, or we will get hit by it even harder later.
[23 May 2009 8:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/74824

2897 Alfranio Correia	2009-05-23
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In order to fix the problem, we save the position of the next character in the
      cache before writing to it. Thus if there is a problem, we simply restore this
      position.
      
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if such changes do not fit in the remaining space in
      cache. There is nothing we can do to circumvent this limitation.
[27 May 2009 9:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75031

2913 Alfranio Correia	2009-05-27
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Post-fix. The incident events described in BUG#43929 must only be generated
      due to errors while writing to the binary log.
[27 May 2009 9:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75032

2912 Alfranio Correia	2009-05-27
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In order to fix the problem, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that cannot fit into the cache, we
      do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
        c) an incident event is logged
      
      For transactional changes that cannot fit into the cache, we do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
     @ mysql-test/include/commit.inc
        Fixed the test case as the binlog is being registered for every statement.
     @ mysql-test/r/commit_1innodb.result
        Fixed the test case as the binlog is being registered for every statement.
     @ mysql-test/suite/binlog/r/binlog_row_mix_innodb_myisam.result
        Fixed the test case.
     @ mysql-test/suite/rpl/include/bug43929_binlog.sql
        Created a new test case.
     @ mysql-test/suite/rpl/r/rpl_binlog_max_cache_size.result
        Created a new test case.
     @ mysql-test/suite/rpl/t/rpl_binlog_max_cache_size-master.opt
        Created a new test case.
     @ mysql-test/suite/rpl/t/rpl_binlog_max_cache_size.test
        Created a new test case.
     @ sql/log.cc
        This is the kernel of the patch.
     @ sql/log.h
        Introduced two new methods into the MYSQL_BIN_LOG class in order
        to appropriately set and check an error code.
     @ sql/sql_delete.cc
        An error is propagated to an upper layer regardless of the type of the table.
     @ sql/sql_insert.cc
        An error is propagated to an upper layer regardless of the type of the table.
     @ sql/sql_update.cc
        An error is propagated to an upper layer regardless of the type of the table.
[27 May 2009 9:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75033

2912 Alfranio Correia	2009-05-27
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In nutshell, to fix the bug, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that cannot fit into the cache, we
      do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
        c) an incident event is logged
      
      For transactional changes that cannot fit into the cache, we do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
[29 May 2009 10:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75226

2912 Alfranio Correia	2009-05-29
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In a nutshell, to fix the bug, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that cannot fit into the cache, we
      do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
        c) an incident event is logged
      
      For transactional changes that cannot fit into the cache, we do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
      
      To work properly, this patch requires that callers to MYSQL_BIN_LOG::write and
      THD::binlog_query handle any error returned and take the appropriate actions
      such as undoing the effects of a statement. We already changed some calls from
      the sql_insert.cc, sql_update.cc and sql_insert.cc modules but the remaining
      calls spread all over the code should be handled in BUG#37148.
[3 Jun 2009 20:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75557

2912 Alfranio Correia	2009-06-03
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In a nutshell, to fix the bug, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that cannot fit into the cache, we
      do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
        c) an incident event is logged after committing/rolling back the transaction,
        if any.
      
      For transactional changes that cannot fit into the cache, we do the following:
        a) its respective statement gives an error
        b) the statement is *not* logged
      
      To work properly, this patch requires that callers to MYSQL_BIN_LOG::write and
      THD::binlog_query handle any error returned and take the appropriate actions
      such as undoing the effects of a statement. We already changed some calls from
      the sql_insert.cc, sql_update.cc and sql_insert.cc modules but the remaining
      calls spread all over the code should be handled in BUG#37148.
[6 Jun 2009 13:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75783

2912 Alfranio Correia	2009-06-06
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In a nutshell, to fix the bug, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that do not fit into the cache,
      we do the following:
        a) the statement is *not* logged
        b) an incident event is logged after committing/rolling back the transaction,
        if any. Note that if a failure happens before writing the incident event to
        the binary log, the slave will not stop and the master will not have reported
        any error.
        c) its respective statement gives an error
      
      For transactional changes that do not fit into the cache, we do the following:
        a) the statement is *not* logged
        b) its respective statement gives an error
      
      To work properly, this patch requires two additional things. Firstly, callers to
      MYSQL_BIN_LOG::write and THD::binlog_query must handle any error returned and
      take the appropriate actions such as undoing the effects of a statement. We
      already changed some calls in the sql_insert.cc, sql_update.cc and sql_insert.cc
      modules but the remaining calls spread all over the code should be handled in
      BUG#37148. Secondly, statements must be either classified as DDL or DML because
      DDLs that do not get into the cache must generate an incident event since they
      cannot be rolled back.
[18 Jun 2009 14:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76574

2950 Alfranio Correia	2009-06-18
      BUG#43929 binlog corruption when max_binlog_cache_size is exceeded
      
      Large transactions and statements may corrupt the binary log if the size of the
      cache, which is set by the max_binlog_cache_size, is not enough to store the
      the changes.
      
      In a nutshell, to fix the bug, we save the position of the next character in the
      cache before starting processing a statement. If there is a problem, we simply
      restore the position thus removing any effect of the statement from the cache.
      Unfortunately, to avoid corrupting the binary log, we may end up loosing changes
      on non-transactional tables if they do not fit in the cache. In such cases, we
      store an Incident_log_event in order to stop the slave and alert users that some
      changes were not logged.
      
      Precisely, for every non-transactional changes that do not fit into the cache,
      we do the following:
        a) the statement is *not* logged
        b) an incident event is logged after committing/rolling back the transaction,
        if any. Note that if a failure happens before writing the incident event to
        the binary log, the slave will not stop and the master will not have reported
        any error.
        c) its respective statement gives an error
      
      For transactional changes that do not fit into the cache, we do the following:
        a) the statement is *not* logged
        b) its respective statement gives an error
      
      To work properly, this patch requires two additional things. Firstly, callers to
      MYSQL_BIN_LOG::write and THD::binlog_query must handle any error returned and
      take the appropriate actions such as undoing the effects of a statement. We
      already changed some calls in the sql_insert.cc, sql_update.cc and sql_insert.cc
      modules but the remaining calls spread all over the code should be handled in
      BUG#37148. Secondly, statements must be either classified as DDL or DML because
      DDLs that do not get into the cache must generate an incident event since they
      cannot be rolled back.
[18 Jun 2009 17:14] Alfranio Junior
Pushed to 5.1-bugteam and 6.0-bugteam.
[19 Jun 2009 11:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76673

2963 Alfranio Correia	2009-06-19
      Post-fix for BUG#43929.
[8 Jul 2009 13:30] Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090708131116-kyz8iotbum8w9yic) (version source revid:alfranio.correia@sun.com-20090619112724-v6pkmyvoh5zv5sib) (merge vers: 5.1.36) (pib:11)
[9 Jul 2009 7:37] Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090708131116-kyz8iotbum8w9yic) (version source revid:alfranio.correia@sun.com-20090619112724-v6pkmyvoh5zv5sib) (merge vers: 5.1.36) (pib:11)
[9 Jul 2009 11:57] Jon Stephens
Documented bugfix in the 5.1.37 changelog as follows:

        Large transactions and statements could corrupt the binary log
        if the size of the cache (as set by max_binlog_cache_size) was
        not great enough to store the changes. Now, for transactions that 
        do not fit into the cache, the statement is not logged, and it
        generates an error instead.

        For non-transactional changes that do not fit into the cache,
        the statement is not logged -- an incident event is logged
        after committing or rolling back any pending transaction, and
        the statement raises an error.

        Note that if a failure happens before writing the incident event
        to the binary log, the slave does not stop, and the master does
        not report any errors.

        See also Bug #37148.

Set status as NDI pending push to 5.4 tree.
[10 Jul 2009 11:21] Bugs System
Pushed into 5.4.4-alpha (revid:anozdrin@bk-internal.mysql.com-20090710111017-bnh2cau84ug1hvei) (version source revid:alfranio.correia@sun.com-20090619113004-c6ag5c417cod9sc4) (merge vers: 5.4.4-alpha) (pib:11)
[13 Jul 2009 11:52] Jon Stephens
Fix also documented in the 5.4.4 changelog. Closed.
[12 Aug 2009 22:02] Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[14 Aug 2009 22:54] Paul DuBois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:33] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[7 Oct 2009 1:49] Paul DuBois
The 5.4 fix has been pushed into 5.4.2.
[24 Nov 2009 8:20] MySQL Verification Team
bug #40672 was marked as a duplicate of this.
[8 Mar 2010 0:36] Paul DuBois
Noted in 5.5.0, 6.0.14 changelogs.
[2 Sep 2010 23:08] James Day
This fix introduced bug #53560 that was fixed in 5.1.49 and 5.5.6.