Bug #53770 Server crash at handler.cc:2076 on LOAD DATA after timed out COALESCE PARTITION
Submitted: 19 May 2010 0:26
Modified: 13 Aug 2010 10:34
Reporter: Elena Stepanova
Status: Duplicate
Category: MySQL Server: Partitions
Severity: S2 (Serious)
Version: 5.5.3-m3, 5.5.5-m3, 5.6.99
OS: Any
Assigned to: Assigned Account
CPU Architecture: Any
Triage: Triaged: D1 (Critical)

[19 May 2010 0:26] Elena Stepanova
Description:
#2  <signal handler called>
#3  handler::ha_thd (this=0x0)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/handler.cc:2076
#4  0x000000000068c6c6 in handler::ha_write_row (this=0x0, buf=0x14b5d10 "�\001")
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/handler.cc:3132
#5  0x000000000068341b in ha_partition::write_row (this=0x14b5a00, buf=0x14b5d10 "�\001")
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/ha_partition.cc:3157
#6  0x000000000068c70f in handler::ha_write_row (this=0x14b5a00, buf=0x14b5d10 "�\001")
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/handler.cc:4682
#7  0x000000000056427c in write_record (thd=0x148e820, table=0x149fe70, info=0x44908f80)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_insert.cc:1670
#8  0x000000000077d34f in mysql_load (thd=0x148e820, ex=0x14cf990, table_list=0x14cfa18, 
    fields_vars=@0x1490b60, set_fields=@0x1490b90, set_values=@0x1490b78, handle_duplicates=DUP_ERROR, 
    ignore=true, read_file_from_client=<value optimized out>)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_load.cc:1068
#9  0x000000000057e3de in mysql_execute_command (thd=0x148e820)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_parse.cc:3436
#10 0x00000000005805a8 in mysql_parse (thd=0x148e820, 
    inBuf=0x14cf8a0 "LOAD DATA LOCAL INFILE 'var/load.in' INTO TABLE t1 (f)", length=54, 
    found_semicolon=0x4490ab50)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_parse.cc:5811
#11 0x00000000005812f2 in dispatch_command (command=COM_QUERY, thd=0x148e820, packet=0x14a1281 "\n2\n3\n", 
    packet_length=54) at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_class.h:723
#12 0x0000000000581b4d in do_command (thd=0x148e820)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_parse.cc:774
#13 0x000000000061807b in do_handle_one_connection (thd_arg=0x148e820)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_connect.cc:1188
#14 0x0000000000618b64 in handle_one_connection (arg=<value optimized out>)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/sql/sql_connect.cc:1127
#15 0x00000000008e2efb in pfs_spawn_thread (arg=<value optimized out>)
    at /export/home/pb2/build/sb_0-1790929-1273478340.45/mysql-5.5.5-m3/storage/perfschema/pfs.cc:1011
#16 0x00002ae35ee25143 in start_thread () from /lib64/libpthread.so.0
#17 0x00002ae35f4948cd in clone () from /lib64/libc.so.6
#18 0x0000000000000000 in ?? ()

The test flow has similarities with the test provided in bug#53676, so these two issues might have the same root cause.

How to repeat:
--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

CREATE TABLE t1 ( i INT NOT NULL AUTO_INCREMENT PRIMARY KEY, f INT )
        ENGINE = MyISAM PARTITION BY HASH(i) PARTITIONS 3;

--echo # Connection 1 starts transaction and gets lock
START TRANSACTION;
SELECT * FROM t1;

--connect (con2,localhost,root,,)
SET lock_wait_timeout = 2;
--echo # Connection 2 tries to coalesce partitions (timeout):
--error ER_LOCK_WAIT_TIMEOUT
ALTER TABLE t1 COALESCE PARTITION 2;

--connect (con3,localhost,root,,)
perl;
open( LD, ">var/load.in" ) || die "Could not open file for writing";
print LD "1\n2\n3\n";
close( LD );
EOF
--echo # Connection 3 tries to load into the table:
send LOAD DATA LOCAL INFILE 'var/load.in' INTO TABLE t1 (f);

--connection default
--real_sleep 1
--echo # Connection 1 commits the transaction
COMMIT;

--connection con3
--echo # Connection 3...
--reap
[26 May 2010 12:20] Mattias Jonsson
The bug is in the ddl_log handling/fast partition alter.

Even if locking fails, the ddl_log still tries to complete the operation. As a result, the .frm and .par files are deleted and rewritten with the new table definition, although the partitions themselves were never changed. This leaves the table in a very bad state (i.e. updated .frm and .par files, but no changes to the partition handlers).

The crash occurs because the server reuses a table definition from the table cache that still holds the old definition, while the .frm/.par files hold the new one... :(
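
A toy model may help to picture the mismatch described above. It is not MySQL code: `pick_handler`, the handler names, and the use of `row_id % partitions` as a stand-in for HASH(i) are all assumptions made for illustration. It only shows how routing a row with a 3-partition definition against a handler list built for 1 partition (3 minus the 2 coalesced) can end up with no handler at all, which is roughly what `this=0x0` in the backtrace looks like.

```python
from typing import List, Optional


def pick_handler(row_id: int, cached_partitions: int,
                 handlers: List[str]) -> Optional[str]:
    """Route a row using the partition count of the cached definition."""
    part = row_id % cached_partitions  # stand-in for HASH(i)
    # None stands in for the NULL handler pointer (this=0x0) in the backtrace.
    return handlers[part] if part < len(handlers) else None


# Consistent state: the definition and the handler list both describe 3 partitions.
consistent = [pick_handler(i, 3, ["p0", "p1", "p2"]) for i in (1, 2, 3)]

# Inconsistent state: one side still assumes 3 partitions, the other only 1.
stale = [pick_handler(i, 3, ["p0"]) for i in (1, 2, 3)]
```

In this sketch `stale` contains `None` for two of the three rows, so the first row loaded would already hit a missing handler.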

Investigating how to fix it...
[8 Jun 2010 9:35] Mattias Jonsson
Related to bug#53676.
[11 Jun 2010 0:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110773

3092 Mattias Jonsson	2010-06-11
      Bug#53676: Unexpected errors and possible table
                 corruption on ADD PARTITION and LOCK TABLE
      Bug#53770: Server crash at handler.cc:2076 on
                 LOAD DATA after timed out COALESCE PARTITION
      
      5.5 fix for:
      Bug#51042: REORGANIZE PARTITION can leave table in an
                 inconsistent state in case of crash
      Needs to be back-ported to 5.1
      
      5.5 fix for:
      Bug#50418: DROP PARTITION does not interact with
                 transactions
      
      Main problem was non-persistent operations done
      before meta-data lock was taken (53770+53676).
      And 53676 needed to keep the table/partitions opened and locked
      while copying the data to the new partitions.
      
      Also added thorough tests to spot some additional bugs
      in the ddl_log code, which could result in bad state
      between the .frm and partitions.
     @ mysql-test/suite/parts/inc/partition_crash.inc
        recovery test including a crash
     @ mysql-test/suite/parts/inc/partition_crash_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail.inc
        recovery test including an injected error
     @ mysql-test/suite/parts/inc/partition_fail_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/t/partition_debug-master.opt
        opt file for testing crash recovery
     @ mysql-test/suite/parts/t/partition_debug.test
        Test for ALTER PARTITION commands involving
        crash and error injections.
     @ mysql-test/t/partition_innodb.test
        Also fixes bug#50418
     @ sql/sql_base.cc
        Removed abort_and_upgrade_lock_and_close_table
        and exporting alter_close_tables instead
     @ sql/sql_base.h
        removed some non existing functions,
        added alter_close_tables.
     @ sql/sql_partition.cc
        fast_alter_partition_table:
        Split abort_and_upgrade_lock_and_close_table
        into its parts (wait_while_table_is_used and
        alter_close_tables) and always call
        wait_while_table_is_used before any persistent
        operations (including logs, which will be executed
        on failure) and alter_close_tables after create/read/write
        operations and before drop operations.
        
        Added error injections for better test coverage.
        
        write_log_final_change_partition:
        fixed a log_entry linking bug (delete_frm was not
        linked to change/drop partition)
        and drop partition must be executed before
        change partition (change partition can rename a
        partition to an old name, like REORG p1 INTO (p1,p2)).
        
        write_log_add_change_partition:
        need to use drop_frm first, and relinking that entry
        and reusing its execute entry.
     @ sql/table.h
        removed a duplicate declaration.
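
The ordering rule in the sql/sql_partition.cc notes above (take the lock before any persistent operation, including log entries that are executed on failure) can be sketched with a small model. This is an illustration under stated assumptions, not the server's implementation: `fast_alter`, the `replace_frm` entry name, and the two-field state are hypothetical.

```python
from typing import Dict, List


def fast_alter(lock_acquired: bool, lock_before_log: bool) -> Dict[str, str]:
    """Return the on-disk state after a (possibly failed) partition ALTER."""
    state = {"frm": "old", "partitions": "old"}
    ddl_log: List[str] = []

    if lock_before_log:
        if not lock_acquired:
            return state  # lock timed out before anything persistent happened
        ddl_log.append("replace_frm")
    else:
        ddl_log.append("replace_frm")  # persistent entry written too early
        if not lock_acquired:
            # Error path: the log is executed to "complete" the operation.
            for entry in ddl_log:
                if entry == "replace_frm":
                    state["frm"] = "new"
            return state

    # Lock held: partitions and definition change together.
    state["partitions"] = "new"
    state["frm"] = "new"
    return state
```

With the old ordering, `fast_alter(False, False)` leaves a new definition on top of unchanged partitions, which is the bad state described earlier in this report; with the lock taken first, a timeout leaves everything untouched.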
[11 Jun 2010 11:04] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/110809

3092 Mattias Jonsson	2010-06-11
      Bug#53676: Unexpected errors and possible table
                 corruption on ADD PARTITION and LOCK TABLE
      Bug#53770: Server crash at handler.cc:2076 on
                 LOAD DATA after timed out COALESCE PARTITION
      
      5.5 fix for:
      Bug#51042: REORGANIZE PARTITION can leave table in an
                 inconsistent state in case of crash
      Needs to be back-ported to 5.1
      
      5.5 fix for:
      Bug#50418: DROP PARTITION does not interact with
                 transactions
      
      Main problem was non-persistent operations done
      before meta-data lock was taken (53770+53676).
      And 53676 needed to keep the table/partitions opened and locked
      while copying the data to the new partitions.
      
      Also added thorough tests to spot some additional bugs
      in the ddl_log code, which could result in bad state
      between the .frm and partitions.
      
      Updated patch with InnoDB crash and fail tests and
      fixed a glitch in ADD PARTITION and DROP PARTITION.
     @ mysql-test/suite/parts/inc/partition_crash.inc
        recovery test including a crash
     @ mysql-test/suite/parts/inc/partition_crash_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail.inc
        recovery test including an injected error
     @ mysql-test/suite/parts/inc/partition_fail_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/t/partition_debug_myisam-master.opt
        opt file for testing crash recovery
     @ mysql-test/suite/parts/t/partition_debug_myisam.test
        Test for ALTER PARTITION commands involving
        crash and error injections.
     @ mysql-test/t/partition_innodb.test
        Also fixes bug#50418
     @ sql/sql_base.cc
        Removed abort_and_upgrade_lock_and_close_table
        and exporting alter_close_tables instead
     @ sql/sql_base.h
        removed some non existing functions,
        added alter_close_tables.
     @ sql/sql_partition.cc
        fast_alter_partition_table:
        Split abort_and_upgrade_lock_and_close_table
        into its parts (wait_while_table_is_used and
        alter_close_tables) and always call
        wait_while_table_is_used before any persistent
        operations (including logs, which will be executed
        on failure) and alter_close_tables after create/read/write
        operations and before drop operations.
        
        Added error injections for better test coverage.
        
        write_log_final_change_partition:
        fixed a log_entry linking bug (delete_frm was not
        linked to change/drop partition)
        and drop partition must be executed before
        change partition (change partition can rename a
        partition to an old name, like REORG p1 INTO (p1,p2)).
        
        write_log_add_change_partition:
        need to use drop_frm first, and relinking that entry
        and reusing its execute entry.
     @ sql/table.h
        removed a duplicate declaration.
[13 Aug 2010 7:53] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115640

3174 Mattias Jonsson	2010-08-13
      Bug#53676: Unexpected errors and possible table
                 corruption on ADD PARTITION and LOCK TABLE
      Bug#53770: Server crash at handler.cc:2076 on
                 LOAD DATA after timed out COALESCE PARTITION
      
      5.5 fix for:
      Bug#51042: REORGANIZE PARTITION can leave table in an
                 inconsistent state in case of crash
      Needs to be back-ported to 5.1
      
      5.5 fix for:
      Bug#50418: DROP PARTITION does not interact with
                 transactions
      
      Main problem was non-persistent operations done
      before meta-data lock was taken (53770+53676).
      And 53676 needed to keep the table/partitions opened and locked
      while copying the data to the new partitions.
      
      Also added thorough tests to spot some additional bugs
      in the ddl_log code, which could result in bad state
      between the .frm and partitions.
      
      Collapsed patch, includes all fixes required from the reviewers.
     @ mysql-test/r/partition_innodb.result
        updated result with new test
     @ mysql-test/suite/parts/inc/partition_crash.inc
        crash test include file
     @ mysql-test/suite/parts/inc/partition_crash_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_crash_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail.inc
        recovery test including injecting errors
     @ mysql-test/suite/parts/inc/partition_fail_add.inc
        test all states in fast_alter_partition_table
        ADD PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_change.inc
        test all states in fast_alter_partition_table
        CHANGE PARTITION branch
     @ mysql-test/suite/parts/inc/partition_fail_drop.inc
        test all states in fast_alter_partition_table
        DROP PARTITION branch
     @ mysql-test/suite/parts/inc/partition_mgm_crash.inc
        include file that runs all crash and failure injection tests.
     @ mysql-test/suite/parts/r/partition_debug_innodb.result
        new test result file
     @ mysql-test/suite/parts/r/partition_debug_myisam.result
        new test result file
     @ mysql-test/suite/parts/r/partition_special_innodb.result
        updated result
     @ mysql-test/suite/parts/r/partition_special_myisam.result
        updated result
     @ mysql-test/suite/parts/t/partition_debug_innodb-master.opt
        opt file for using with crashing tests of partitioned innodb
     @ mysql-test/suite/parts/t/partition_debug_innodb.test
        partitioned innodb test that requires debug builds
     @ mysql-test/suite/parts/t/partition_debug_myisam-master.opt
        opt file for using with crashing tests of partitioned myisam
     @ mysql-test/suite/parts/t/partition_debug_myisam.test
        partitioned myisam test that requires debug builds
     @ mysql-test/suite/parts/t/partition_special_innodb-master.opt
        added innodb-file-per-table to make it easier to verify partition status on disk
     @ mysql-test/suite/parts/t/partition_special_innodb.test
        added test case
     @ mysql-test/suite/parts/t/partition_special_myisam.test
        added test case
     @ mysql-test/t/partition_innodb.test
        added test case
     @ sql/sql_base.cc
        Moved alter_close_tables to sql_partition.cc
     @ sql/sql_base.h
        removed some non existing and duplicated functions.
     @ sql/sql_partition.cc
        fast_alter_partition_table:
        Split abort_and_upgrade_lock_and_close_table
        into its parts (wait_while_table_is_used and
        alter_close_tables) and always have
        wait_while_table_is_used before any persistent
        operations (including logs, which will be executed
        on failure) and alter_close_tables after
        create/read/write operations and before
        drop operations.
        
        moved alter_close_tables here from sql_base.cc
        
        Added error injections for better test coverage.
        
        write_log_final_change_partition:
        fixed a log_entry linking bug (delete_frm was not
        linked to change/drop partition)
        and drop partition must be executed before
        change partition (change partition can rename a
        partition to an old name, like REORG p1 INTO (p1,p2)).
        
        write_log_add_change_partition:
        need to use drop_frm first, and relinking that entry
        and reusing its execute entry.
     @ sql/sql_table.cc
        added initialization of next_active_log_entry.
     @ sql/table.h
        removed a duplicate declaration.
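
The note on write_log_final_change_partition above says drop partition must run before change partition, because a change can rename a partition to an old name (REORG p1 INTO (p1,p2)). A minimal sketch of why the order matters follows. The `run_log` helper, the tuple-based log entries, and the `#TMP#` file names are assumptions for illustration and do not reflect the real ddl_log format.

```python
from typing import Dict, List, Tuple


def run_log(files: Dict[str, str], entries: List[Tuple[str, ...]]) -> Dict[str, str]:
    """Apply drop/rename entries in order to a copy of the file map."""
    files = dict(files)
    for action, *args in entries:
        if action == "drop":
            files.pop(args[0], None)
        elif action == "rename":
            src, dst = args
            files[dst] = files.pop(src)
    return files


# REORG p1 INTO (p1, p2): old p1 plus two freshly written temporary partitions.
before = {"p1": "old rows", "p1#TMP#": "new p1 rows", "p2#TMP#": "new p2 rows"}
drop = ("drop", "p1")
renames = [("rename", "p1#TMP#", "p1"), ("rename", "p2#TMP#", "p2")]

drop_first = run_log(before, [drop] + renames)    # old p1 removed, then renames
rename_first = run_log(before, renames + [drop])  # drop hits the *new* p1
```

In this model, running the renames first and the drop last removes the newly renamed p1, so only p2 survives; running the drop first leaves both new partitions in place.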
[13 Aug 2010 10:33] Mattias Jonsson
Closing as a duplicate of bug#53676, since that patch also includes the fix for this bug (just pushed to mysql-5.5-bugteam and up).
[25 Aug 2010 9:22] Bugs System
Pushed into mysql-5.5 5.5.6-m3 (revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (version source revid:alik@ibmvm-20100825092002-2yvkb3iwu43ycpnm) (merge vers: 5.5.6-m3) (pib:20)
[30 Aug 2010 8:31] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (version source revid:alik@sun.com-20100830082732-n2eyijnv86exc5ci) (merge vers: 5.6.1-m4) (pib:21)
[30 Aug 2010 8:34] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (version source revid:alik@sun.com-20100830082745-n6sh01wlwh3itasv) (pib:21)