Bug #47029 Server crash when reorganizing two partitions into one with same name as the 2nd
Submitted: 31 Aug 2009 19:42 Modified: 2 Dec 2009 7:35
Reporter: Keyur Joshi Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Partitions Severity:S1 (Critical)
Version:5.1.37-1/5.4.4 OS:Linux (32-bit Ubuntu 9.04 (kernel: 2.6.28-11-generic))
Assigned to: Mikael Ronström
Triage: Needs Triage: D1 (Critical)

[31 Aug 2009 19:42] Keyur Joshi
Description:
I have only seen this on a partitioned InnoDB table WITH subpartitions. The server crashed while trying to reorganize two partitions and naming the result after the second one. More details are in the "How to repeat" section.

The mysql.err file shows the following messages/stacktrace:

090831 15:36:12  InnoDB: Error; possible reasons:
InnoDB: 1) Table rename would cause two FOREIGN KEY constraints
InnoDB: to have the same internal name in case-insensitive comparison.
InnoDB: 2) table `test1`.090831 15:36:12 [ERROR] Invalid (old?) table or database name 'TestTable#P#p3#SP#p3sp1'
`#mysql50#TestTable#P#p3#SP#p3sp1` exists in the InnoDB internal data
InnoDB: dictionary though MySQL is trying to rename table `test1`.090831 15:36:12 [ERROR] Invalid (old?) table or database name 'TestTable#P#p3#SP#p3sp1#TMP#'
`#mysql50#TestTable#P#p3#SP#p3sp1#TMP#` to it.
InnoDB: Have you deleted the .frm file and not used DROP TABLE?
InnoDB: You can look for further help from
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/innodb-troubleshooting.html
InnoDB: If table `test1`.090831 15:36:12 [ERROR] Invalid (old?) table or database name 'TestTable#P#p3#SP#p3sp1'
`#mysql50#TestTable#P#p3#SP#p3sp1` is a temporary table #sql..., then it can be that
InnoDB: there are still queries running on the table, and it will be
InnoDB: dropped automatically when the queries end.
InnoDB: You can drop the orphaned table inside InnoDB by
InnoDB: creating an InnoDB table with the same name in another
InnoDB: database and copying the .frm file to the current database.
InnoDB: Then MySQL thinks the table exists, and DROP TABLE will
InnoDB: succeed.
InnoDB: Error: trying to use a corrupt
InnoDB: table handle. Magic n 0, table name090831 15:36:12 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=134217728
read_buffer_size=33554432
max_used_connections=1
max_threads=151
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1639361 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xc71a6c8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x62942398 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x22) [0x84ff862]
/usr/sbin/mysqld(handle_segfault+0x381) [0x8228641]
[0xb80c8400]
/usr/sbin/mysqld(ha_innobase::info(unsigned int)+0x54) [0x83c7c44]
/usr/sbin/mysqld(ha_partition::info(unsigned int)+0x22e) [0x831e6de]
/usr/sbin/mysqld(handler::get_dup_key(int)+0x5e) [0x8316d2e]
/usr/sbin/mysqld(handler::print_error(int, int)+0x118) [0x83197c8]
/usr/sbin/mysqld(fast_alter_partition_table(THD*, st_table*, Alter_info*, st_ha_create_information*, TABLE_LIST*, char*, char const*, unsigned int)+0xf1a) [0x816f36a]
/usr/sbin/mysqld(mysql_alter_table(THD*, char*, char*, st_ha_create_information*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool)+0x14df) [0x833210f]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x2c23) [0x823b633]
/usr/sbin/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x340) [0x8240390]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0xc38) [0x8240fd8]
/usr/sbin/mysqld(do_command(THD*)+0xdf) [0x8241ddf]
/usr/sbin/mysqld(handle_one_connection+0x253) [0x8232803]
/lib/tls/i686/cmov/libpthread.so.0 [0xb80a24ff]
/lib/tls/i686/cmov/libc.so.6(clone+0x5e) [0xb7e8949e]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xc8361b8 = alter table TestTable reorganize partition p2,p3 into ( partition p3 values less than (to_days('2009-03-01')))
thd->thread_id=1
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

How to repeat:
1. Create table:

create table TestTable ( 
    id smallint(6) not null default 0, 
    timestamp datetime not null, 
    primary key (id, timestamp) 
) engine = InnoDB 
partition by range (to_days(timestamp)) 
subpartition by hash(id) 
subpartitions 2
(
    partition p1 values less than (to_days('2009-01-01')), 
    partition p2 values less than (to_days('2009-02-01')), 
    partition p3 values less than (to_days('2009-03-01')), 
    partition max values less than maxvalue
);

2. Alter table:

alter table TestTable reorganize partition p2,p3 into ( partition p3 values less than (to_days('2009-03-01')));
[31 Aug 2009 20:03] Miguel Solorzano
iguel@luz:~/dbs/mysql-next$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.4.4-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table TestTable ( 
    ->     id smallint(6) not null default 0, 
    ->     timestamp datetime not null, 
    ->     primary key (id, timestamp) 
    -> ) engine = InnoDB 
    -> partition by range (to_days(timestamp)) 
    -> subpartition by hash(id) 
    -> subpartitions 2
    -> (
    ->     partition p1 values less than (to_days('2009-01-01')), 
    ->     partition p2 values less than (to_days('2009-02-01')), 
    ->     partition p3 values less than (to_days('2009-03-01')), 
    ->     partition max values less than maxvalue
    -> );
Query OK, 0 rows affected (0.14 sec)

mysql> 
mysql> alter table TestTable reorganize partition p2,p3 into ( partition p3 values less than
    -> (to_days('2009-03-01')));
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
[31 Aug 2009 20:15] Miguel Solorzano
backtrace

Attachment: 47029_backtrace.txt (text/plain), 4.78 KiB.

[31 Aug 2009 20:16] Miguel Solorzano
Thank you for the bug report.
[4 Sep 2009 14:14] Mikael Ronström
I found that there are two problems:
1) A deleted table which have been removed from the internal InnoDB
table SYS_TABLES still exists when the table is to be renamed.

2) The rename_table function returns Duplicate key error which later
gets a crash when MySQL calls print_error on the handler and this
in turn calls get_dup_key which soon enough crashes MySQL since
InnoDB assumes rightly that the handler is open when this is called.
So should use a different error code from rename_table.
[4 Sep 2009 20:56] Jimmy Yang
The problem is due to the counter "j" in ha_partition::rename_partitions() is not reset in a double loop, causing code exits earlier, and does not delete all the temp tables it supposes to. Thus later a dup insert of the same (name) temp table into innodb system table trigger the whole problem:

ha_partition::rename_partitions() 
{
677 unint j = 0;

702 do
703 {
          <================= j did not get reset when re-enter
708	do{
                ......
717		file->ha_delete_table(norm_name_buff)
                ......
723	} while (++j < no_subparts);
      .....
739 } while (++i < temp_partitions);
.....
}

(gdb) print temp_partitions
$4 = 2
(gdb) print no_subparts
$5 = 2

In this case, ha_delete_table() got called 3 times intead of 4 times. 3 Temp Table got deleted:

/test/TestTable#P#p2#SP#p2sp0, 
/test/TestTable#P#p2#SP#p2sp1,
/test/TestTable#P#p3#SP#p3sp0

However /test/TestTable#P#p3#SP#p3sp1 did not get deleted.

It is later causing dup key error in innodb, row/row0ins.c:1847 (row_ins_duplicate_error_in_clust). The error:

InnoDB: 2) table `test1`.090831 15:36:12 [ERROR] Invalid (old?) table or database name
'TestTable#P#p3#SP#p3sp1'
`#mysql50#TestTable#P#p3#SP#p3sp1` exists in the InnoDB internal data

call stack shows it comes from a table rename:

#0  row_ins_duplicate_error_in_clust (cursor=0x45c27e90, entry=0x2aaaaada54b8,
    thr=0x2aaaaada74f0, mtr=0x45c279c0) at row/row0ins.c:1847
#1  0x000000000089f3fb in row_ins_index_entry_low (mode=2,
    index=0x2aaaaad814b8, entry=0x2aaaaada54b8, ext_vec=0x2aaaaada9df8,
    n_ext_vec=0, thr=0x2aaaaada74f0) at row/row0ins.c:2039
...
#12 0x000000000083f1df in ha_innobase::rename_table (this=0x174ccc70,
    from=0x45c298c0 "./test/TestTable#P#p3#SP#p3sp1#TMP#",
    to=0x45c296c0 "./test/TestTable#P#p3#SP#p3sp1")

Fix will be reset the counter j in the double loop:

702 do
703 {
+	j = 0;  <=====================
708	do{

717		file->ha_delete_table(norm_name_buff)

723	} while (++j < no_subparts);

739 } while (++i < temp_partitions);

Thanks
Jimmy
[7 Sep 2009 8:31] Mikael Ronström
The bug was found to have 2 bugs in it. One in the MySQL partitioning
code and one in the InnoDB error handling code. Reassigned the bug to
myself for fixing the MySQL part of the bug.

Fix was found by Jimmy Yang and was a simple matter of not initialising
the j variable in the second loop of dropping partitions that have been
reorganised.
[7 Sep 2009 8:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82556

3105 Mikael Ronstrom	2009-09-07
      Fix to ensure that all subpartitions gets deleted before renaming starts, BUG#47029
      modified:
        mysql-test/r/partition_innodb.result
        mysql-test/t/partition_innodb.test
        sql/ha_partition.cc
[7 Sep 2009 8:41] Mattias Jonsson
Bug#45808 is waiting on this bug to analyze if it is the same problem.
[14 Sep 2009 16:03] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 9:05] Jon Stephens
Hi Mikael,

You say there are actually 2 issues in this bug report. However, it appears that only 1 of them is fixed. Should I go ahead and document/close this bug report? Is there a separate bug report for the InnoDB problem, or is that (also) no longer an issue due to this fix?

Thanks!
[16 Sep 2009 13:15] Mattias Jonsson
The fix Mikael did and is pushed within the scope of this bug fixed the crash. The other issue Mikael found will be handled in bug#45961.

So please go ahead and document/close this bug.
[16 Sep 2009 15:03] Jon Stephens
Documented bugfix in the 5.4.4 changelog as follows:

        When reorganizing partitions, not all affected subpartitions
        were removed prior to renaming. One way in which the issue was
        visible was that attempting to reorganize two partitions into a
        single partition having the same name as one of the originasl
        partitions could lead to a crash of the server.

Set Status = Documenting; waiting for 5.1 push.
[18 Sep 2009 8:22] Mattias Jonsson
Mikael pushed this to 51-bugteam 2009-09-07 12:23.
[18 Sep 2009 9:57] Jon Stephens
Set back to NDI, waiting for 5.1 version info to close.
[2 Oct 2009 1:43] Paul Dubois
Moved 5.4 changelog entry from 5.4.4 to 5.4.3.
[6 Oct 2009 8:59] Bugs System
Pushed into 5.1.40 (revid:joro@sun.com-20091006073316-lea2cpijh9r6on7c) (version source revid:mikael@mysql.com-20090907083754-w12rtyvcwy6j3k1o) (merge vers: 5.1.39) (pib:11)
[6 Oct 2009 9:39] Jon Stephens
Fix also documented in the 5.1.40 changelog.

Closed.
[18 Dec 2009 10:30] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:46] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:01] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:15] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)