Bug #46610 New MDL: MyISAM MRG engine crash on auto-repair of child
Submitted: 7 Aug 2009 17:55 Modified: 7 Mar 2010 2:02
Reporter: Konstantin Osipov (OCA)
Status: Closed
Category: MySQL Server: Merge storage engine Severity: S3 (Non-critical)
Version: 5.4 OS: Any
Assigned to: Konstantin Osipov CPU Architecture: Any
Tags: regression

[7 Aug 2009 17:55] Konstantin Osipov
Description:
I'm using MySQL 5.4.4.
I have a child MyISAM table with one column, and the table is crashed.
I have a parent MyISAM MRG table pointing at the "crashed" child, i.e. a child that needs REPAIR.
When I select from the MERGE parent, the server crashes.
The debug version asserts in tdc_remove_table() on an attempt to read
free()d memory.

To reproduce the bug I need to "crash" the child table *and* ensure
that the parent table is expelled from the table cache before the child is repaired.
In real life a MyISAM table can get "crashed" for many reasons (index corruption, server crash). In the test below I use --myisam-recover=force
and manually corrupt the index.
In real life a MyISAM MERGE parent can get expelled from the table cache
for many reasons (table cache full, pending FLUSH TABLES in a concurrent connection, pending DDL against the MERGE parent in a concurrent connection).
In the test case below I manually set the table definition cache
to the minimal size and fill it completely in a parallel connection.

The backtrace:

#3  0xb7cc45ce in __assert_fail () from /lib/tls/i686/cmov/libc.so.6
#4  0x0835c5d7 in tdc_remove_table (thd=0x9a5b540, 
    remove_type=TDC_RT_REMOVE_ALL, db=0x9bd4aa0 "test", 
    table_name=0x99fe9e0 "\030") at sql_base.cc:7804
#5  0x0835eca0 in recover_from_failed_open_table_attempt (thd=0x9a5b540, 
    table=0x9bf5590, action=OT_REPAIR) at sql_base.cc:3635
#6  0x08360c6d in open_tables (thd=0x9a5b540, start=0xb7388734, 
    counter=0xb7388720, flags=0) at sql_base.cc:3894
#7  0x083612c6 in open_and_lock_tables_derived (thd=0x9a5b540, 
    tables=0x9a1b9d0, derived=true, flags=0) at sql_base.cc:4316
#8  0x08311751 in open_and_lock_tables (thd=0x9a5b540, tables=0x9a1b9d0)
    at ../../sql/mysql_priv.h:1520
#9  0x0830310d in execute_sqlcom_select (thd=0x9a5b540, all_tables=0x9a1b9d0)
    at sql_parse.cc:4892
#10 0x0830542d in mysql_execute_command (thd=0x9a5b540) at sql_parse.cc:2112
#11 0x0830e91a in mysql_parse (thd=0x9a5b540, 
    inBuf=0x9b411c0 "select * from t1_mrg", length=20, 
    found_semicolon=0xb7389bd4) at sql_parse.cc:5942
#12 0x0830f474 in dispatch_command (command=COM_QUERY, thd=0x9a5b540, 
    packet=0x9aed6b1 "select * from t1_mrg", packet_length=20)
    at sql_parse.cc:1062
#13 0x08310999 in do_command (thd=0x9a5b540) at sql_parse.cc:744
#14 0x082fcf31 in handle_one_connection (arg=0x9a5b540) at sql_connect.cc:1163
#15 0xb7f7e4ff in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#16 0xb7d8449e in clone () from /lib/tls/i686/cmov/libc.so.6

As you can see in the trace, table_name in frame #4 is \030, i.e. corrupt (should be t1).

How to repeat:
kostja@shakti:~/work/5.4-azalea-bugfixing/mysql-test$ cat t/merge_recover-master.opt                                                        
--myisam-recover=force
kostja@shakti:~/work/5.4-azalea-bugfixing/mysql-test$ cat t/merge_recover.test 
--echo #
--echo # Test of MyISAM MRG tables with corrupted children.
--echo # Run with --myisam-recover=force option.
--echo #
--echo # Preparation: we need to make sure that the merge parent
--echo # is never left in the table cache when closed, since this may
--echo # have effect on merge children.
--echo # For that, we set the table cache to minimal size and populate it
--echo # in a concurrent connection.
connect(con1,localhost,root,,test,,);
--echo #
--echo # Switching to connection con1
--echo #
connection con1;
--echo #
--echo # Minimal values.
--echo #
set global table_open_cache=256;
set global table_definition_cache=256;
--disable_warnings
drop procedure if exists p_create_and_lock;
--enable_warnings
delimiter |;
create procedure p_create()
begin
  declare i int default 1;
  set @lock_table_stmt="lock table ";
  set @drop_table_stmt="drop table ";
  while i < @@global.table_definition_cache + 1 do
    set @table_name=concat("t_", i);
    set @opt_comma=if(i=1, "", ", ");
    set @lock_table_stmt=concat(@lock_table_stmt, @opt_comma,
                                @table_name, " read");
    set @drop_table_stmt=concat(@drop_table_stmt, @opt_comma, @table_name);
    set @create_table_stmt=concat("create table if not exists ",
                                  @table_name, " (a int)");
    prepare stmt from @create_table_stmt;
    execute stmt;
    deallocate prepare stmt;
    set i= i+1;
  end while;
end|
delimiter ;|
call p_create();
drop procedure p_create;
--disable_query_log
let $lock=`select @lock_table_stmt`;
eval $lock;
--enable_query_log
--echo #
--echo # Switching to connection 'default'
--echo #
connection default;

--disable_warnings
drop table if exists t1, t1_mrg, t1_copy;
--enable_warnings
let $MYSQLD_DATADIR=`select @@datadir`;
--echo #
--echo # Prepare a MERGE engine table, that refers to a corrupted
--echo # child.
--echo # 
create table t1 (a int, key(a)) engine=myisam;
create table t1_mrg (a int) union (t1) engine=merge;
--echo #
--echo # Create a table with a corrupted index file:
--echo # save an old index file, insert more rows, 
--echo # overwrite the new index file with the old one.
--echo #
insert into  t1 (a) values (1), (2), (3);
flush table t1;
--copy_file $MYSQLD_DATADIR/test/t1.MYI $MYSQLD_DATADIR/test/t1_copy.MYI
insert into  t1 (a) values (4), (5), (6);
flush table t1;
--remove_file $MYSQLD_DATADIR/test/t1.MYI
--copy_file $MYSQLD_DATADIR/test/t1_copy.MYI $MYSQLD_DATADIR/test/t1.MYI
--remove_file $MYSQLD_DATADIR/test/t1_copy.MYI
--echo # check table is needed to mark the table as crashed.
check table t1;
--echo #
--echo # At this point we have a merge table t1_mrg pointing to t1,
--echo # and t1 is corrupted, and will be auto-repaired at open.
--echo # Check that this doesn't lead to memory corruption.
--echo #
select * from t1_mrg;
--echo #
--echo # Cleanup
--echo #
drop table t1, t1_mrg;
--echo #
--echo # Switching to connection con1
--echo #
connection con1;
unlock tables;
prepare stmt from @drop_table_stmt;
execute stmt;
deallocate prepare stmt;
set @@global.table_definition_cache=default;
set @@global.table_open_cache=default;
disconnect con1;
connection default;
kostja@shakti:~/work/5.4-azalea-bugfixing/mysql-test$ cat r/merge_recover.result 
#
# Test of MyISAM MRG tables with corrupted children.
# Run with --myisam-recover=force option.
#
# Preparation: we need to make sure that the merge parent
# is never left in the table cache when closed, since this may
# have effect on merge children.
# For that, we set the table cache to minimal size and populate it
# in a concurrent connection.
#
# Switching to connection con1
#
#
# Minimal values.
#
set global table_open_cache=256;
set global table_definition_cache=256;
drop procedure if exists p_create_and_lock;
create procedure p_create()
begin
declare i int default 1;
set @lock_table_stmt="lock table ";
set @drop_table_stmt="drop table ";
while i < @@global.table_definition_cache + 1 do
set @table_name=concat("t_", i);
set @opt_comma=if(i=1, "", ", ");
set @lock_table_stmt=concat(@lock_table_stmt, @opt_comma,
@table_name, " read");
set @drop_table_stmt=concat(@drop_table_stmt, @opt_comma, @table_name);
set @create_table_stmt=concat("create table if not exists ",
@table_name, " (a int)");
prepare stmt from @create_table_stmt;
execute stmt;
deallocate prepare stmt;
set i= i+1;
end while;
end|
call p_create();
drop procedure p_create;
#
# Switching to connection 'default'
#
drop table if exists t1, t1_mrg, t1_copy;
#
# Prepare a MERGE engine table, that refers to a corrupted
# child.
# 
create table t1 (a int, key(a)) engine=myisam;
create table t1_mrg (a int) union (t1) engine=merge;
#
# Create a table with a corrupted index file:
# save an old index file, insert more rows, 
# overwrite the new index file with the old one.
#
insert into  t1 (a) values (1), (2), (3);
flush table t1;
insert into  t1 (a) values (4), (5), (6);
flush table t1;
# check table is needed to mark the table as crashed.
check table t1;
Table	Op	Msg_type	Msg_text
test.t1	check	warning	Size of datafile is: 42       Should be: 21
test.t1	check	error	Record-count is not ok; is 6   Should be: 3
test.t1	check	warning	Found 6 key parts. Should be: 3
test.t1	check	error	Corrupt
#
# At this point we have a merge table t1_mrg pointing to t1,
# and t1 is corrupted, and will be auto-repaired at open.
# Check that this doesn't lead to memory corruption.
#
select * from t1_mrg;
a
1
2
3
4
5
6
Warnings:
Error	145	Table './test/t1' is marked as crashed and should be repaired
Error	1194	Table 't1' is marked as crashed and should be repaired
Error	1034	Number of rows changed from 3 to 6
#
# Cleanup
#
drop table t1, t1_mrg;
#
# Switching to connection con1
#
unlock tables;
prepare stmt from @drop_table_stmt;
execute stmt;
deallocate prepare stmt;
set @@global.table_definition_cache=default;
set @@global.table_open_cache=default;

I have also attached the test files for the verifier's convenience.

Suggested fix:
The MERGE parent allocates TABLE_LIST elements in the ha_myisammrg memory root.
That memory may already be freed inside open_tables() in case of a back-off, while the elements are still needed for auto-repair.
TABLE_LIST elements should be allocated in the execution memory root instead.
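
The lifetime problem described above can be illustrated with a minimal, self-contained sketch. It is not the server code: Arena, TableRef, MergeHandler and name_seen_by_repair() are hypothetical stand-ins for a memory root, TABLE_LIST, ha_myisammrg and the back-off path, and the database name "test" is hard-coded for illustration. The point it tries to show is only that entries allocated in statement-owned memory stay valid after the handler object is gone.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TABLE_LIST: one entry of the statement's table list.
struct TableRef {
    std::string db;
    std::string table_name;
};

// Hypothetical stand-in for a memory root: objects live until the arena dies.
class Arena {
public:
    TableRef* make_ref(std::string db, std::string name) {
        refs_.push_back(TableRef{std::move(db), std::move(name)});
        // std::deque::push_back keeps references to earlier elements valid.
        return &refs_.back();
    }
private:
    std::deque<TableRef> refs_;
};

// Hypothetical stand-in for the MERGE handler. It may be destroyed when the
// parent is expelled from the table cache, so it keeps only child names.
class MergeHandler {
public:
    explicit MergeHandler(std::vector<std::string> children)
        : children_(std::move(children)) {}

    // Append one entry per child to the statement's table list, allocating
    // each entry in the statement arena rather than in handler-owned memory.
    void add_children(Arena& stmt_arena,
                      std::vector<TableRef*>& query_tables) const {
        for (const std::string& name : children_)
            query_tables.push_back(stmt_arena.make_ref("test", name));
    }
private:
    std::vector<std::string> children_;
};

// Simulates the back-off path: the handler is destroyed before the repair
// step reads the child's entry. Returns the table name the repair step sees.
std::string name_seen_by_repair() {
    Arena stmt_arena;                     // lives until the end of the statement
    std::vector<TableRef*> query_tables;  // the statement's global table list
    auto handler =
        std::make_unique<MergeHandler>(std::vector<std::string>{"t1"});
    handler->add_children(stmt_arena, query_tables);
    handler.reset();                      // parent expelled from the cache
    assert(!query_tables.empty());
    return query_tables.front()->table_name;  // still owned by stmt_arena
}
```

Had the entries been owned by the handler instead, the read after handler.reset() would touch freed memory, which corresponds to the corrupt table_name seen in frame #4 of the backtrace.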
[7 Aug 2009 17:58] Konstantin Osipov
-master.opt file

Attachment: merge_recover-master.opt (application/octet-stream, text), 23 bytes.

[7 Aug 2009 17:58] Konstantin Osipov
Test file

Attachment: merge_recover.test (application/octet-stream, text), 3.09 KiB.

[7 Aug 2009 17:58] Konstantin Osipov
Expected result file

Attachment: merge_recover.result (application/octet-stream, text), 2.57 KiB.

[7 Aug 2009 18:11] Sveta Smirnova
Thank you for the report.

Verified as described.

Version 5.1 is not affected.

Backtrace in my environment:

#0  0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000b4a0c4 in my_write_core (sig=11) at stacktrace.c:309
#2  0x00000000006bf69a in handle_segfault (sig=11) at mysqld.cc:2718
#3  <signal handler called>
#4  0x00000000006b9221 in MDL_request::set_type (this=0x8f8f8f8f8f8f8f8f, type_arg=MDL_EXCLUSIVE) at mdl.h:169
#5  0x0000000000728c5a in recover_from_failed_open_table_attempt (thd=0xc7d9148, table=0xc7fceb0, action=OT_REPAIR) at sql_base.cc:3538
#6  0x0000000000729336 in open_tables (thd=0xc7d9148, start=0x40a80350, counter=0x40a80384, flags=0) at sql_base.cc:3805
#7  0x0000000000729eb7 in open_and_lock_tables_derived (thd=0xc7d9148, tables=0xc8b99d0, derived=true, flags=0) at sql_base.cc:4227
#8  0x00000000006dfe98 in open_and_lock_tables (thd=0xc7d9148, tables=0xc8b99d0) at ../../sql/mysql_priv.h:1519
#9  0x00000000006d9364 in execute_sqlcom_select (thd=0xc7d9148, all_tables=0xc8b99d0) at sql_parse.cc:4892
#10 0x00000000006d0c1d in mysql_execute_command (thd=0xc7d9148) at sql_parse.cc:2112
#11 0x00000000006db7e1 in mysql_parse (thd=0xc7d9148, inBuf=0xc8b95d0 "select * from t1_mrg", length=20, found_semicolon=0x40a81f00) at sql_parse.cc:5942
#12 0x00000000006ce51f in dispatch_command (command=COM_QUERY, thd=0xc7d9148, packet=0xc7ec339 "select * from t1_mrg", packet_length=20) at sql_parse.cc:1061
#13 0x00000000006cd81a in do_command (thd=0xc7d9148) at sql_parse.cc:743
#14 0x00000000006cbfe6 in handle_one_connection (arg=0xc7d9148) at sql_connect.cc:1158
#15 0x0000003429e061b5 in start_thread () from /lib64/libpthread.so.0
#16 0x00000034292cd39d in clone () from /lib64/libc.so.6
#17 0x0000000000000000 in ?? ()
[7 Aug 2009 18:55] Konstantin Osipov
The same cause as in Bug#42862
[13 Aug 2009 15:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80766

2857 Konstantin Osipov	2009-08-13
      A fix and a test case for Bug#46610 "MySQL 5.4.4: MyISAM MRG engine crash
      on auto-repair of child".
      Also fixes Bug#42862 "Crash on failed attempt to open a children of a 
      merge table".
      
      MERGE engine needs to extend the global table list
      with TABLE_LIST elements for child tables,
      so that they are opened and locked.
      Previously these table list elements were allocated
      in memory of ha_myisammrg object (MERGE engine handler).
      That would lead to access to freed memory in
      recover_from_failed_open_table_attempt(), which would
      try to recover a MERGE table child (MyISAM table)
      using the TABLE_LIST of that child.
      But by the time recover_from_failed_open_table_attempt()
      is invoked, the ha_myisammrg object that owns this
      TABLE_LIST may be destroyed, and thus the TABLE_LIST
      memory freed.
      
      The fix is to ensure that TABLE_LIST elements
      that are added to the global table list (lex->query_tables)
      are always allocated in thd->mem_root, which is not
      destroyed until end of execution.
      
      Previously TABLE_LIST elements were allocated
      in ha_myisammrg::open() (i.e. when the TABLE
      object was created and added to the table cache);
      now they are allocated in ha_myisammrg::add_children_list()
      (i.e. right after "open" of the merge parent in
      open_tables()).
      We still create a list of children names
      in ha_myisammrg::open() to use as a basis
      for creating TABLE_LISTs, which avoids
      reading the merge handler data
      file on every execution.
     @ mysql-test/r/merge_recover.result
        Test results for Bug#46610.
     @ mysql-test/t/merge_recover-master.opt
        Option file for Bug#46610 test (need a new test because
        of that option, which is not tested anywhere else).
     @ mysql-test/t/merge_recover.test
        Add a test case for Bug#46610.
     @ sql/table.h
        MERGE child child_def_version is now moved from TABLE_LIST
        to MERGE engine specific data structure.
     @ storage/myisammrg/ha_myisammrg.h
        Introduce an auxiliary structure to keep MERGE child name
        and definition version. A list of Mrg_child_def is created
        in ha_myisammrg::open() and reused in ha_myisammrg::add_children_list().
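
The split between "parse once at open" and "build entries per statement" described in the commit message can be sketched as follows. This is an assumption-laden illustration, not the patch itself: ChildDef plays the role of Mrg_child_def, TableRef the role of TABLE_LIST, and the file_reads() counter and reads_after() helper are hypothetical additions used only to make the effect observable. Child names are passed in directly instead of being read from the merge data file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical counterpart of Mrg_child_def: child name plus the definition
// version recorded for that child.
struct ChildDef {
    std::string name;
    unsigned long def_version;
};

// Hypothetical per-statement table list entry (the role TABLE_LIST plays).
struct TableRef {
    std::string table_name;
};

class MergeHandler {
public:
    // Called once, when the parent is opened and cached: remember the child
    // names so that later statements do not need to parse them again.
    void open(const std::vector<std::string>& names_from_file) {
        ++file_reads_;
        children_.clear();
        for (const std::string& n : names_from_file)
            children_.push_back(ChildDef{n, 0});
    }

    // Called for every statement: build fresh entries from the cached names.
    std::vector<TableRef> add_children_list() const {
        std::vector<TableRef> refs;
        for (const ChildDef& c : children_)
            refs.push_back(TableRef{c.name});
        return refs;
    }

    int file_reads() const { return file_reads_; }

private:
    std::vector<ChildDef> children_;
    int file_reads_ = 0;
};

// Runs the given number of statements against one cached handler and
// returns how many times the child list had to be parsed.
int reads_after(int statements) {
    MergeHandler h;
    h.open({"t1", "t2"});
    for (int i = 0; i < statements; ++i) {
        std::vector<TableRef> refs = h.add_children_list();
        assert(refs.size() == 2u);
    }
    return h.file_reads();
}
```

In this sketch the per-statement entries are plain values owned by the caller, so nothing the statement holds points back into the handler.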
[13 Aug 2009 16:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80767

2857 Konstantin Osipov	2009-08-13
[13 Aug 2009 16:17] Konstantin Osipov
Pushed into 5.4.4
[13 Aug 2009 22:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80797

2858 Konstantin Osipov	2009-08-14
      A follow up patch for Bug#46610 "MySQL 5.4.4: MyISAM MRG engine crash on
      auto-repair of child".
      Fix the test suite failure in --ps-protocol, and make
      test results platform-independent.
[14 Aug 2009 11:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80815

2859 Konstantin Osipov	2009-08-14
      A follow up patch for the fix for Bug#46610 "MySQL 5.4.4: MyISAM 
      MRG engine crash on auto-repair of child".
      Due to bug 46714, there is a duplicate warning in the result
      file, and it contains a platform-specific path.
      Suppress the platform-specific output.
     @ mysql-test/r/merge_recover.result
        Update results.
     @ mysql-test/t/merge_recover.test
        Suppress platform-specific output.
[20 Aug 2009 0:13] Paul DuBois
Noted in 5.4.4 changelog.

Selecting from a MERGE table with a corrupted child MyISAM table
could cause a server crash when the server attempted to automatically
repair the child table.
[24 Aug 2009 13:53] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090824135126-2rngffvth14a8bpj) (version source revid:kostja@sun.com-20090814110435-tq6ufg3gh86m4gvh) (merge vers: 5.4.4-alpha) (pib:11)
[8 Dec 2009 13:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93196

3001 Konstantin Osipov	2009-12-08
      Backport of revid 2617.69.21, 2617.69.22, 2617.29.23:
      ----------------------------------------------------------
      revno: 2617.69.21
      committer: Konstantin Osipov <kostja@sun.com>
      branch nick: 5.4-4284-1-assert
      timestamp: Thu 2009-08-13 20:13:55 +0400
      message:
        A fix and a test case for Bug#46610 "MySQL 5.4.4: MyISAM MRG engine crash
        on auto-repair of child".
        Also fixes Bug#42862 "Crash on failed attempt to open a children of a
        merge table".
      
        MERGE engine needs to extend the global table list
        with TABLE_LIST elements for child tables,
        so that they are opened and locked.
        Previously these table list elements were allocated
        in memory of ha_myisammrg object (MERGE engine handler).
        That would lead to access to freed memory in
        recover_from_failed_open_table_attempt(), which would
        try to recover a MERGE table child (MyISAM table)
        using the TABLE_LIST of that child.
        But by the time recover_from_failed_open_table_attempt()
        is invoked, the ha_myisammrg object that owns this
        TABLE_LIST may be destroyed, and thus the TABLE_LIST
        memory freed.
      
        The fix is to ensure that TABLE_LIST elements
        that are added to the global table list (lex->query_tables)
        are always allocated in thd->mem_root, which is not
        destroyed until end of execution.
      
        Previously TABLE_LIST elements were allocated
        in ha_myisammrg::open() (i.e. when the TABLE
        object was created and added to the table cache);
        now they are allocated in ha_myisammrg::add_children_list()
        (i.e. right after "open" of the merge parent in
        open_tables()).
        We still create a list of children names
        in ha_myisammrg::open() to use as a basis
        for creating TABLE_LISTs, which avoids
        reading the merge handler data
        file on every execution.
     @ mysql-test/r/merge_recover.result
        Test results for Bug#46610.
     @ mysql-test/t/merge_recover-master.opt
        Option file for Bug#46610 test (need a new test because
        of that option, which is not tested anywhere else).
     @ mysql-test/t/merge_recover.test
        Add a test case for Bug#46610.
     @ sql/table.h
        MERGE child child_def_version is now moved from TABLE_LIST
        to MERGE engine specific data structure.
     @ storage/myisammrg/ha_myisammrg.cc
        Introduce an auxiliary structure to keep MERGE child name
        and definition version. A list of Mrg_child_def is created
        in ha_myisammrg::open() and reused in ha_myisammrg::add_children_list().
[16 Feb 2010 16:50] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:kostja@sun.com-20091211154405-c9yhiewr9o5d20rq) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 17:00] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:kostja@sun.com-20091208135725-r4s4o5rci1pnp2j8) (pib:16)
[17 Feb 2010 0:54] Paul DuBois
Noted in 6.0.14 changelog.

Setting report to Need Merge pending push of Celosia into release tree.
[6 Mar 2010 11:07] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 2:02] Paul DuBois
Noted in 5.5.3 changelog.
[13 Apr 2010 5:01] Paul DuBois
Correction: Not present in any 5.5.x release. 5.5.3 changelog entry removed.