Bug #17233 LOAD DATA INFILE: failure causes mysqld dbug_assert, binlog not flushed
Submitted: 8 Feb 2006 12:10 Modified: 7 Jun 2007 16:10
Reporter: Tomas Ulin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.1,5.0.36-BK OS:Any
Assigned to: Chuck Bell
Tags: bfsm_2007_02_15

[8 Feb 2006 12:10] Tomas Ulin
Description:
CREATE TABLE t1 (word CHAR(20) NOT NULL PRIMARY KEY) ENGINE=INNODB;
--error 1062
LOAD DATA INFILE '../../std_data/words.dat' INTO TABLE t1 ;
DROP TABLE t1;

in debug cores on assertion

mysqld: log.cc:87: int binlog_close_connection(THD*): Assertion `mysql_bin_log.is_open() && !((trans_log)->pos_in_file + (uint) (*(trans_log)->current_pos - (trans_
log)->request_pos))' failed.

How to repeat:
see above
[8 Feb 2006 12:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/2316
[13 Feb 2006 12:38] Tomas Ulin
note that the above was not a patch for this, just disabled a testcase failing because of this bug
[14 Feb 2006 16:21] Magnus Blåudd
Could not reproduce on FC4, enabling test again.
[16 Feb 2006 8:30] Jonas Oreland
The statement as such work, but it leaves mysqld in a poor state.
[16 Feb 2006 9:23] Tomas Ulin
so to clarify the issue is that the command executes... fails...
and then if you close down the server (debug compiled) you will see that it asserts on binlog being open, indicating that the failed command was not rolled back completely

on way to reproduce is to enable the ndb_load test and run mysql-test-run from that testcase
[16 Feb 2006 14:44] Magnus Blåudd
When the thread that serves the connection with mysqltest ends THD::cleanup will be called. One of the steps in that involves rolling back any outstanding transactions, and if "transaction.xid_state.xa_state == XA_PREPARED"then xid_cache_delete(&transaction.xid_state) is called and that causes the crash.
[17 Feb 2006 10:43] Magnus Blåudd
The mysqld server crashes on a DBUG_ASSERT when the connection is closed after a "LOAD DATA INFILE" has failed.

The DBUG_ASSERT is:
DBUG_ASSERT(mysql_bin_log.is_open() && !my_b_tell(trans_log));
The binlog is open, so it is the !my_b_tell(trans_log)  that causes the failure:

gdb>
#7  0x0000000000665c98 in binlog_close_connection (thd=0x17ded88) at log.cc:88
88        DBUG_ASSERT(mysql_bin_log.is_open() && !my_b_tell(trans_log));
(gdb) p trans_log
$1 = (IO_CACHE *) 0x181f408
(gdb) p *trans_log
$2 = {pos_in_file = 0, end_of_file = 18446744073709551615,
  read_pos = 0x181f638 "\u06e4\uffffC\v\001", read_end = 0x0,
  buffer = 0x181f638 "\u06e4\uffffC\v\001", request_pos = 0x181f638 "\u06e4\uffffC\v\001",
  write_buffer = 0x181f638 "\u06e4\uffffC\v\001", append_read_pos = 0x0,
  write_pos = 0x181fa3a '\uffff' <repeats 200 times>...,
  write_end = 0x1827638 "h4z\025", current_pos = 0x181f448,
  current_end = 0x181f450, append_buffer_lock = {global = {__m_reserved = 0,
      __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0,
        __spinlock = 0}}, mutex = {__m_reserved = 0, __m_count = 0,
      __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0,
        __spinlock = 0}}, file = 0x0, line = 0, count = 0, thread = 0},
  share = 0x0, read_function = 0x9a9aa9 <_my_b_read>,
  write_function = 0x9aa9bc <_my_b_write>, type = WRITE_CACHE, pre_read = 0,
  post_read = 0, pre_close = 0, disk_writes = 0, arg = 0x0, file_name = 0x0,
  dir = 0x181f588 "/space/magnus/bug17233/my50-bug17233/mysql-test/var/tmp/",
  prefix = 0x181f5f8 "ML", file = -1, seek_not_done = 1, error = 0,
  buffer_length = 32768, read_length = 32768, myflags = 16,
  alloced_buffer = 1 '\001'}

The following statement produces the crashes:
# should give duplicate key
CREATE TABLE t1 (word CHAR(20) NOT NULL PRIMARY KEY) ENGINE=NDB;
--error 1022
LOAD DATA INFILE '../../std_data/words.dat' INTO TABLE t1 ;
DROP TABLE t1;

I think it's a little confused how this should work with binlog, failures and rollback/commits.

The error is first detected in handled::end_bulk_insert(when the rows to be inserted are sent to NDB) and just after this the function enable_transactions are called. It will perform a commit(which will fail), and thus end_trans(COMMIT) is never called. I tried to do a end_trans(ROLLBACK) and it helps a little....

int ha_enable_transaction(THD *thd, bool on)
{
  int error=0;

  DBUG_ENTER("ha_enable_transaction");
  thd->transaction.on= on;
  if (on)
  {
    /*
      Now all storage engines should have transaction handling enabled.
      But some may have it enabled all the time - "disabling" transactions
      is an optimization hint that storage engine is free to ignore.
      So, let's commit an open transaction (if any) now.
    */
    if (!(error= ha_commit_stmt(thd)))
      error= end_trans(thd, COMMIT);

    else
      error= end_trans(thd, ROLLBACK); << New code
  }
  DBUG_RETURN(error);
}

Later on in' mysql_load' there is several more places that calls rollback/commit
For example at ine 401.
>>
  if (error)
  {
    if (transactional_table)
      ha_autocommit_or_rollback(thd,error);
<<

and again at line 472:
>>
  if (transactional_table)
    error=ha_autocommit_or_rollback(thd,error);
<<
[13 Dec 2006 18:34] Chuck Bell
I was not able to reproduce the exact error. I ran the test on SUSE 10.0 and did not encounter the problem. However, I ran the test on Windows XP and the code generates a different error with the test included at the top of the bug report. 

I ran the test with and without replication and got the same results. 

When you run the test, the code aborts with a heap corruption error (from the OS) originating from a call to my_free() in ha_innodb::close(). 

I ran another test:

CREATE TABLE t1 (word CHAR(20) NOT NULL PRIMARY KEY) ENGINE=INNODB;
LOAD DATA INFILE 'c:\\source\\c++\\names.txt' into table t1;
DROP TABLE t1;

The names.txt file had only 4 unique names:
Joe
Sam
Dave
John

The results were the same error from ha_innodb::close().

Interestingly, using MyISAM, HEAP, or BDB (and presumably others) does not produce this error.
[6 Feb 2007 9:43] Shane Bester
Chuck, I repeated on todays 5.0BK build.

Change in mysql-test-run.pl to enable InnoDB in the slave databases.
---
> mtr_add_arg($args, "%s--skip-innodb", $prefix);
---
< mtr_add_arg($args, "%s--default-table-type=innodb", $prefix);
< mtr_add_arg($args, "%s--transaction-isolation=READ-COMMITTED", $prefix);

(Note, change the 2nd occurance, the one that starts the slave db).
Then, run:

/mysql-test-run.pl rpl_loaddata.test

I attach my error log to this bug report.
[6 Feb 2007 9:43] Shane Bester
debug assertion 5.0.36-bk, suse 9.3 x86

Attachment: stack_slave.txt (text/plain), 2.40 KiB.

[20 Mar 2007 12:38] Chuck Bell
Will try SUSE and Windows XP with latest clone. In progress.
[4 May 2007 16:35] Chuck Bell
This problem arises when the server exits. While no errors occur until that time, there is an inconsistency in the transaction log. When the method binlog_close_connection() is called on exit, the DBUG_ASSERT fails on the check to !my_b_tell(trans_log). In this case, the log's current_pos and requested_pos are not equal. This is caused by calling the rollback method on the handler in the wrong order. The patch for this bug corrects the ordering of the calls to the handler thereby ensuring the transaction log is closed properly.
[7 May 2007 8:45] Rafal Somla
Good to push.
[8 May 2007 12:44] Chuck Bell
Patch pushed to mysql-5.0-rpl on 7 May.
[8 May 2007 14:20] Chuck Bell
Patch committed to 5.1 with clarification of problem:

The problem was an assert was thrown after the operation completed. The 
assert found a non-sunk event in the transaction cache. The events in the 
transaction cache were added after commit_or_rollack and thereafter nothing 
removed them.
[11 May 2007 17:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26519

ChangeSet@1.2578, 2007-05-11 13:04:03-04:00, cbell@suse.vabb.com +1 -0
  BUG#17233 : LOAD DATA INFILE: failure causes mysqld dbug_assert, binlog not flushed
  
  This patch enables the NDB test ndb_load that was previously disabled
  as a result of the LOAD DATA INFILE bug reported in this bug report.
[11 May 2007 18:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/26527

ChangeSet@1.2438, 2007-05-11 14:13:20-04:00, cbell@suse.vabb.com +1 -0
  BUG#17233 : LOAD DATA INFILE: failure causes mysqld dbug_assert, binlog not flushed
    
  This patch enables the NDB test ndb_load that was previously disabled
  as a result of the LOAD DATA INFILE bug reported in BUG#17233.
[1 Jun 2007 19:20] Bugs System
Pushed into 5.0.44
[1 Jun 2007 19:23] Bugs System
Pushed into 5.1.20-beta
[7 Jun 2007 16:10] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.0.44 and 5.1.20 changelogs.
[3 Nov 2010 10:18] Shane Bester
for the interested parties.  this isnt fixed in 5.0.92.

Version: '5.0.92-valgrind-max-debug-log'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution

mysqld: log.cc:88: int binlog_close_connection(THD*): Assertion `!((trans_log)->pos_in_file + (uint) (*(trans_log)->current_pos - (trans_log)->request_pos))' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff448f710 (LWP 8013)]
0x00000030e14329a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00000030e14329a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00000030e1434185 in abort () at abort.c:92
#2  0x00000030e142b935 in __assert_fail (assertion=0x95e958 "!((trans_log)->pos_in_file + (uint) (*(trans_log)->current_pos - (trans_log)->request_pos))", file=<value optimized out>, line=88, function=<value optimized out>) at assert.c:81
#3  0x000000000063d50d in ?? ()
#4  0x0000000000679a65 in ha_close_connection(THD*) ()
#5  0x00000000005aa967 in THD::~THD() ()
#6  0x00000000005aab4b in THD::~THD() ()
#7  0x00000000005b7a3e in end_thread(THD*, bool) ()
#8  0x00000000005d6fe8 in handle_one_connection ()
#9  0x00000030e1807761 in start_thread (arg=0x7ffff448f710) at pthread_create.c:301
#10 0x00000030e14e14ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb)
[3 Nov 2010 11:05] Andrei Elkin
At least I can confirm the fixes are there.
Shane, and can you reproduce it in 5.1 as well?
What would be your setup for that...