Bug #32709 Assertion failed: trx_data->empty(), file log.cc
Submitted: 26 Nov 2007 9:34 Modified: 18 Oct 2008 12:37
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.1.24-debug,6.0.5-debug OS:Windows
Assigned to: Mats Kindahl CPU Architecture:Any

[26 Nov 2007 9:34] Shane Bester
Description:
during some long running tests i found this crash:

071126 11:23:37 [ERROR] mysqld-debug: Out of memory at line 3134, '.\sql_class.cc'
071126 11:23:37 [ERROR] mysqld-debug: needed 2779103866 byte (2713969k), memory in use: 26766872 bytes (26140k)
071126 11:23:38 [ERROR] mysqld-debug: Out of memory at line 3134, '.\sql_class.cc'
071126 11:23:38 [ERROR] mysqld-debug: needed 2779103866 byte (2713969k), memory in use: 26766872 bytes (26140k)
Assertion failed: trx_data->empty(), file .\log.cc, line 1293

mysqld-debug.exe!_NMSG_WRITE
mysqld-debug.exe!abort
mysqld-debug.exe!_assert
mysqld-debug.exe!binlog_close_connection
mysqld-debug.exe!closecon_handlerton
mysqld-debug.exe!plugin_foreach_with_mask
mysqld-debug.exe!ha_close_connection
mysqld-debug.exe!THD::~THD
mysqld-debug.exe!THD::`scalar deleting destructor'
mysqld-debug.exe!unlink_thd
mysqld-debug.exe!one_thread_per_connection_end
mysqld-debug.exe!handle_one_connection
mysqld-debug.exe!pthread_start
mysqld-debug.exe!_threadstart

How to repeat:
Repeatable testcase is unknown at this point. Falcon tables were involved.
Server was started with clean datadir like:

mysqld-debug --console --skip-grant-tables --skip-name-resolve --log-bin --log --log-warn=2

Suggested fix:
I don't think the bug is the assertion.  Instead, it's the huge 2.7G memory allocation just before it that should be a cause for concern.
[26 Nov 2007 9:36] MySQL Verification Team
full stack trace and server error log fyi

Attachment: bug32709_outputs.txt (text/plain), 6.05 KiB.

[26 Nov 2007 9:50] MySQL Verification Team
the failing memory allocation is this:

bool Discrete_intervals_list::append(ulonglong start, ulonglong val,
ulonglong incr)
{
...
Discrete_interval *new_interval= new Discrete_interval(start, val, incr);
if (unlikely(new_interval == NULL)) // out of memory
   DBUG_RETURN(1);
...
}
[2 May 2008 12:38] Philip Stoev
Test case for bug 32709

Attachment: bug32709.test (application/octet-stream, text), 1.98 KiB.

[2 May 2008 12:41] Philip Stoev
I was able to reproduce the assertion with the attached test case. To run:

# Start master and slave
$ perl mysql-test-run.pl --start-and-exit --skip-ndb --mysqld=--log-bin rpl_alter

# Run test case
$  perl mysql-test-run.pl --extern --socket=var/tmp/master.sock --user=root bug32709

The test case will complete , however the server will crash. Some observations:

* --log-bin appears to be required
* Innodb is not affected
* I did not see the memory allocation error Shane observed
[16 May 2008 13:58] MySQL Verification Team
i encountered this assertion on 5.1.24-debug today.

Assertion failed: trx_data->empty(), file .\log.cc, line 1329
080516 15:52:18 - mysqld got exception 0x80000003 ;

mysqld-debug.exe!_NMSG_WRITE
mysqld-debug.exe!abort
mysqld-debug.exe!_wassert
mysqld-debug.exe!binlog_close_connection
mysqld-debug.exe!closecon_handlerton
[15 Aug 2008 8:00] Mats Kindahl
The assertion indicates that there were data left in the transaction cache on server shutdown, which means that a transaction did not commit or rollback properly. In this case, the failure is probably caused by the the out-of-memory condition causing the transaction to rollback, with a rollback of a transactional engine not being handled correctly by the binary log.

An easier way to reproduce the error is probably (untested):

  CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
  INSERT INTO t1 VALUES (1),(2),(3);
  INSERT INTO t1 VALUES (1),(2),(3);  // Duplicate entry error
  INSERT INTO t1 VALUES (4);

which would cause the 4 to not be properly replicated to the slave.

See BUG#38262 for a test case that triggers this error, and which is the one that pointed me to the problem.
[15 Aug 2008 8:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/51701

2658 Mats Kindahl	2008-08-15
      Bug #32709: Assertion failed: trx_data->empty(), file log.cc
      
      The assertion indicates that some data was left in the transaction
      cache when the server was shut down, which means that a previous
      statement did not commit or rollback correctly.
      
      What happened was that a bug in the rollback of a transactional
      table caused the transaction cache to be emptied, but not reset.
      The error can be triggered by having a failing UPDATE or INSERT,
      on a transactional table, causing an implicit rollback.
      
      Fixed by always flushing the pending event to reset the state
      properly.
[20 Aug 2008 20:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52095

2658 Mats Kindahl	2008-08-20
      Bug #32709: Assertion failed: trx_data->empty(), file log.cc
      
      The assertion indicates that some data was left in the transaction
      cache when the server was shut down, which means that a previous
      statement did not commit or rollback correctly.
      
      What happened was that a bug in the rollback of a transactional
      table caused the transaction cache to be emptied, but not reset.
      The error can be triggered by having a failing UPDATE or INSERT,
      on a transactional table, causing an implicit rollback.
      
      Fixed by always flushing the pending event to reset the state
      properly.
[3 Sep 2008 20:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/53198

2679 Mats Kindahl	2008-09-03
      Bug #32709: Assertion failed: trx_data->empty(), file log.cc
      
      The assertion indicates that some data was left in the transaction
      cache when the server was shut down, which means that a previous
      statement did not commit or rollback correctly.
      
      What happened was that a bug in the rollback of a transactional
      table caused the transaction cache to be emptied, but not reset.
      The error can be triggered by having a failing UPDATE or INSERT,
      on a transactional table, causing an implicit rollback.
      
      Fixed by always flushing the pending event to reset the state
      properly.
[3 Sep 2008 20:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/53204

2680 Mats Kindahl	2008-09-03
      BUG#32709: Assertion failed: trx_data->empty(), file log.cc
      
      Incremental fixes: updating a comment and fixing a result file.
[4 Sep 2008 7:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/53228

2680 Mats Kindahl	2008-09-03
      BUG#32709: Assertion failed: trx_data->empty(), file log.cc
      
      Incremental fixes: updating a comment and fixing a result file.
[4 Sep 2008 8:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/53230

2680 Mats Kindahl	2008-09-03
      BUG#32709: Assertion failed: trx_data->empty(), file log.cc
      
      Incremental fixes: updating a comment and fixing a result file.
[15 Sep 2008 8:32] Bugs System
Pushed into 5.1.29  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:kgeorge@mysql.com-20080910094421-1i1kxv3n1bxskiqa) (pib:3)
[15 Sep 2008 16:57] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html
[26 Sep 2008 12:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54543

2698 He Zhenxing	2008-09-26 [merge]
      Merge 5.1 main -> 5.1-rpl
[1 Oct 2008 16:00] Bugs System
Pushed into 5.1.29  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:kgeorge@mysql.com-20080910094421-1i1kxv3n1bxskiqa) (pib:4)
[7 Oct 2008 22:32] Jon Stephens
Set status to NDI pending merge to 6.0.
[17 Oct 2008 16:43] Bugs System
Pushed into 6.0.8-alpha  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:kpettersson@mysql.com-20080911114255-81pt7q1uvl1fkojq) (pib:5)
[18 Oct 2008 12:37] Jon Stephens
Documented bugfix in the 5.1.29 and 6.0.8 changelogs as follows:

        A statement did not always commit or roll back correctly when
        the server was shut down; the error could be triggered by having
        a failing UPDATE or INSERT on a transactional table, causing an implicit 
        rollback.

Closed.
[28 Oct 2008 21:03] Bugs System
Pushed into 5.1.29-ndb-6.2.17  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:tomas.ulin@sun.com-20081028140209-u4emkk1xphi5tkfb) (pib:5)
[28 Oct 2008 22:22] Bugs System
Pushed into 5.1.29-ndb-6.3.19  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:tomas.ulin@sun.com-20081028194045-0353yg8cvd2c7dd1) (pib:5)
[1 Nov 2008 9:47] Bugs System
Pushed into 5.1.29-ndb-6.4.0  (revid:mats@mysql.com-20080903205254-p8ij48ryqg9gzsjz) (version source revid:jonas@mysql.com-20081101082305-qx5a1bj0z7i8ueys) (pib:5)
[30 Mar 2009 2:43] wu song
I encountered the same error in MySQL 5.1.31-enterprise-gpl-advanced-log, bug#32709 only tells that the patch was push into 5.1.29_ndb_...,however my question is : when new patch will push into 5.1.3X GPL non-ndb version?

the error info is as follows on a slave of replication group:

Last_SQL_Error: Could not execute Delete_rows event on table 

myjob_wa.wa_sec_job_index_batch_group; Can't find record in 'wa_sec_job_index_batch_group', 

Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin-218.000036, 

end_log_pos 336238253
[3 Dec 2009 7:37] Sujoy Biswas
I am facing the same issue in Version: '5.1.34-log' as below:

 Slave SQL: Could not execute Delete_rows event on table emlplatform.Alarm; Can't find record in 'Alarm', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log emlplatform-bin.000001, end_log_pos 2198836, Error_code: 1032

It seems the fix is pushed in 5.1.29 and in 6.0 branch. Is the patch also present in 5.1.34-log?
[3 Dec 2009 8:53] MySQL Verification Team
@wu @sujoy your replication problems have nothing to do with this bug.
this bug is a crash in a debug server.  you should upgrade to 5.1.41 or newer, and if there's still a problem, then consider opening a new bug. many RBR bugs have been fixed already.
[3 Dec 2009 10:22] Sujoy Biswas
@Shane, my replication problem is the one reported as http://bugs.mysql.com/bug.php?id=37317 which got closed as duplicate of current ticket(id=32709) and got fixed in 5.1.29.

So, do you mean the patch isn't there for 5.1.34 and i need to upgrade to 5.1.41 to get this working?
[14 Dec 2009 14:56] David Slabaugh
I am encountering the exact same issue with 5.1.37 using row based replication on a linux system.  

Could not execute Delete_rows event on table cls.sessions; Can't find record in 'sessions', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000008, end_log_pos 827481.

If this fix was committed in 5.1.29, why am I still getting the error in 5.1.37?
[24 Mar 2010 10:15] shishu naik
We are also getting the same kind of replication error frequently.
Last_SQL_Error: Could not execute Delete_rows event on table 
myjob_wa.wa_sec_job_index_batch_group; Can't find record in
Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log

We thought the bug was solved in the recent MySQL community version 5.1.44
but it's not??
[28 Apr 2010 17:02] Sheeri Cabral
Please reopen this ticket, it's not fixed yet.

Bug 37317 relates to this one, and there are several folks who have noted their issues there.

I myself am also getting this on MySQL 5.1.41-community-log:

Could not execute Delete_rows event on table foo.bar; Can't find record in 'bar', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.004009, end_log_pos 760660166

the master of this machine has:

no triggers
READ-COMMITTED isolation level
MIXED binlog format
AUTOCOMMIT on

and the table itself is indeed InnoDB.
[3 May 2010 10:44] Sven Sandberg
To people who get the following error message:

Could not execute Delete_rows event on table <DATABASE>.<TABLE>; Can't find record in '<TABLE>', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log <FILENAME>, end_log_pos <NUMBER>

When you see this error in the error log or in the output from SHOW SLAVE STATUS (the Last_SQL_Error column), there are many possible reasons. The present bug used to be one possible reason. The bug was fixed in 5.1.41, so it is not a possible reason any more.

==== What does the message mean? ====

Let's look at the message in detail. The error message means this:
 - The master is using row-based replication, i.e., binlog_format=row or
   binlog_format=mixed.
 - The master has executed a statement that modifies or removes an existing
   row from the table referred to by the error message. I.e., normally an
   UPDATE or DELETE statement (or one of the following more unusual statements:
   REPLACE, INSERT...ON DUPLICATE KEY UPDATE, or LOAD DATA INFILE...REPLACE).
 - When the replication slave tried to re-execute the statement, it did not
   find the row that it was going to modify or remove.

In other words, your slave is missing one or more rows that exists on master. So you need to figure out why the slave has gone out of sync. The possible reasons fall in these categories:
 - a known bug (some possible candidates: BUG#50440, BUG#50439, BUG#49482,
   BUG#49481, BUG#40045, BUG#39753, BUG#47312, BUG#51501, BUG#52868, or any
   other bug where the slave goes out of sync);
 - a user error (mistake in application logic or during manual intervention
   with the database);
 - a not yet reported bug.

==== How to act to find out the reason for the message ====

1. Figure out when and why your slave went out of sync. Try this:
   1.1 Use the binlog positions in the message together with mysqlbinlog to
       find out what row is missing on the slave.
   1.2 Use your application-specific knowledge to figure out when this row
       was last updated on master.
   1.3 Use your application-specific knowledge to figure out if anything
       unusual was going on when the record was last updated on the master, or
       when it was replayed on the slave.

2. Now see what action to take next. There are four possibilities:
   2.1 If you found out that there was a user error, remove the reason for the
       error and try to repair your database.
   2.2 Check if the preconditions for one or more of the bugs listed above
       applies to you. For each candidate bug:
        2.2.1 If the bug is closed: see if you can fix the problem by upgrading
              to the version where it was fixed.
        2.2.2 If the bug is open: see if there is a known workaround and wait
              until we fix the bug.
   2.3 If it is not an error in your application logic and you have checked
       that it can not be any of the bugs above, file a new bug. As usual, be
       specific about the reason and provide all details. In particular, your
       table schemas and data may be useful, as well as the output from
       "mysqlbinlog --hexdump" for relevant parts of the binary logs and relay
       logs. If your report is not at least so specific that we can be sure
       the reason is different from the bugs above, then it will be impossible
       for us to fix the bug no matter how hard we try!