Bug #49238 Creating/Dropping a temporary table while at 1023 transactions will cause assert
Submitted: 30 Nov 2009 22:02 Modified: 14 Oct 2010 13:54
Reporter: Harrison Fisk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.0,5.1.41 OS:Any
Assigned to: Jimmy Yang
Tags: 1023 transactions, temporary table
Triage: Triaged: D1 (Critical)

[30 Nov 2009 22:02] Harrison Fisk
Description:
When the 1023 transaction limit is reached (BUG #26590) internally it is handled as an out of disk space condition.  Due to BUG #43665 with running out of table space, if InnoDB believes it is out of disk space, it will assert, even though, in this case, it isn't out of disk space.

This is directly related to BUG #43665, but could have a different fix by changing the handling of the 1023 transaction limit.

Here is a resolved backtrace of the problem:

0x579865 handle_segfault + 725
0x6f30d2 row_drop_table_for_mysql + 3986
0x65354c ha_innobase::delete_table(char const*) + 252
0x5c6956 rm_temporary_table(db_type, char*) + 134
0x5c6a28 close_temporary(st_table*, bool) + 120
0x5c781e close_temporary_tables(THD*) + 1006
0x568c7e THD::cleanup() + 142
0x579d26 end_thread(THD*, bool) + 22
0x59c099 handle_one_connection + 1065
0x30bcc062f7 _end + -1183202073

There is also another mention in the comment [30 Nov 18:48] Shane Bester in Bug #43665.

How to repeat:
Open 1025 transactions and create/drop temporary InnoDB tables.

Suggested fix:
Don't treat the 1023 limit internally as out of disk space.

Or fix Bug #26590 or Bug #43665.
[8 Dec 2009 13:44] Harrison Fisk
Full error log from the crash in original bug submission:

091130 6:14:40InnoDB: Warning: cannot find a free slot for an undo log. Do you have too
InnoDB: many active transactions running concurrently?
091130 6:14:40InnoDB: Warning: cannot find a free slot for an undo log. Do you have too
InnoDB: many active transactions running concurrently?
091130 6:14:40InnoDB: Assertion failure in thread 1454528832 in file row0mysql.c line 3339
InnoDB: Failing assertion: err == DB_OUT_OF_FILE_SPACE
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html
InnoDB: about forcing recovery.
InnoDB: Thread 1344141632 stopped in file os0sync.c line 586
InnoDB: Thread 1533806912 stopped in file ./../include/sync0sync.ic line 111
InnoDB: Thread 1325476160 stopped in file ./../include/sync0sync.ic line 111
091130 6:14:40 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=1048576
max_used_connections=2101
max_connections=2100
threads_connected=2006
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 6467584 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x2ab82a4af6c0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x56b23fd0, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x579865
0x6f30d2
0x65354c
0x5c6956
0x5c6a28
0x5c781e
0x568c7e
0x579d26
0x59c099
0x30bcc062f7
[8 Dec 2009 16:39] Heikki Tuuri
Harrison, thank you for noticing this crashing bug. In CREATE TABLE, InnoDB creates a second background transaction to do the actual work. Apparently, I have forgotten to check that there is room for this additional transaction in the 1024 undo log slots!

Jimmy, please modify the code so that various functions in ha_innodb.cc that create a second transaction are prepared for a failure to create a transaction.
[10 Dec 2009 7:39] Jimmy Yang
Using a debugger, we could easily reproduce the error, the stacktrace in the bug report is reproduced with a drop table command:

During the operation, we are requesting a slot in trx_undo_seg_create with trx_rsegf_undo_find_free() with following stack:

trx_undo_seg_create 
trx_undo_create 
trx_undo_assign_undo 
trx_undo_report_row_operation 
btr_cur_del_mark_set_clust_rec 
....
que_eval_sql 
row_drop_table_for_mysql 
ha_innobase::delete_table
handler::ha_delete_table
rm_temporary_table 

If we cannot find the slot, we hit the assertion failure, which expects an out of space error:

abort
row_drop_table_for_mysql <== ut_a(err == DB_OUT_OF_FILE_SPACE);
ha_innobase::delete_table
handler::ha_delete_table
rm_temporary_table 
...

row_drop_table_for_mysql()
{
       ....
       if (err != DB_SUCCESS) {
                ut_a(err == DB_OUT_OF_FILE_SPACE); <== it does not expect DB_TOO_MANY_CONCURRENT_TRXS error

                err = DB_MUST_GET_MORE_FILE_SPACE;

                row_mysql_handle_errors(&err, trx, NULL, NULL);

                ut_error;
        ...
}

(gdb) up
#2  0x0000000000880935 in row_drop_table_for_mysql (
    name=0x44f20ad0 "mysqld.1/#sql2e02_2_0", trx=0x2aaaaad908b8, drop_db=0)
    at row/row0mysql.c:3249
3249                    ut_a(err == DB_OUT_OF_FILE_SPACE);
(gdb) p err
$5 = 47  <==== DB_TOO_MANY_CONCURRENT_TRXS

Apparently, we need to handle this DB_TOO_MANY_CONCURRENT_TRXS error here.

Thanks
Jimmy
[15 Dec 2009 3:12] Jimmy Yang
In row_drop_table_for_mysql(), when we drop a table, and delete its metadata, we had an assumption that the only possible failure is DB_OUT_OF_FILE_SPACE for the log allocation. So the code assert error == DB_OUT_OF_FILE_SPACE when an error happens:

row_drop_table_for_mysql()
{
       err = que_eval_sql();
       ...
       if (err != DB_SUCCESS) {
            ut_a(err == DB_OUT_OF_FILE_SPACE);

            ut_error;
       }

...
}

Apparently, such assumption breaks here, as we could possibly have DB_TOO_MANY_CONCURRENT_TRXS error as well as we run out of free slot for undo log in trx_undo_seg_create(). 

This is not the case in create table, where we handles all possible errors. This is a unique case in drop table since we assume drop table will not hit other problems other than running out of the log space. So we handle this special case here, and still keep the original assumption that only these two errors are possible.

The fix is simple and directly handle the DB_TOO_MANY_CONCURRENT_TRXS error. And since nothing material has been done up to here when the error occurs, we can directly exit.
[15 Dec 2009 9:54] Domas Mituzas
Does that mean that stale temporary table remains inside the server?
[16 Dec 2009 1:18] Jimmy Yang
> Does that mean that stale temporary table remains inside the server?

Yes, the drop table operation fails, so the table is not dropped. It still exists. It does not necessary mean it is stale though.
[12 Jan 2010 15:48] Jimmy Yang
Date: 2010-01-12 07:59:16 +0200 (Tue, 12 Jan 2010)
New Revision: 6421

Log:
branches/5.1: Fix bug #49238: Creating/Dropping a temporary table
while at 1023 transactions will cause assert. Handle possible
DB_TOO_MANY_CONCURRENT_TRXS when deleting metadata in
row_drop_table_for_mysql().
rb://220, approved by Marko
[22 Jan 2010 9:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/97825

3330 Sergey Vojtovich	2010-01-22
      Applying InnoDB snapshot, fixes BUG#49238.
      
      Detailed revision comments:
      
      r6421 | jyang | 2010-01-12 07:59:16 +0200 (Tue, 12 Jan 2010) | 8 lines
      branches/5.1: Fix bug #49238: Creating/Dropping a temporary table
      while at 1023 transactions will cause assert. Handle possible
      DB_TOO_MANY_CONCURRENT_TRXS when deleting metadata in
      row_drop_table_for_mysql().
      
      rb://220, approved by Marko
[22 Jan 2010 13:44] Sergey Vojtovich
Fixed in 5.1, built-in InnoDB.
[4 Feb 2010 10:20] Bugs System
Pushed into 5.1.44 (revid:joro@sun.com-20100204101444-2j32mhqroo0iiio6) (version source revid:dao-gang.qu@sun.com-20100125025505-zqa9v2mgdcfza0v6) (merge vers: 5.1.43) (pib:16)
[5 Feb 2010 11:49] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100204063540-9czpdmpixi3iw2yb) (version source revid:alik@sun.com-20100130201057-zm9nj1sy0xpz1ohp) (pib:16)
[5 Feb 2010 11:55] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100205113942-oqovjy0eoqbarn7i) (version source revid:alik@sun.com-20100204064210-ljwanqvrjs83s1gq) (merge vers: 6.0.14-alpha) (pib:16)
[5 Feb 2010 12:00] Bugs System
Pushed into 5.5.2-m2 (revid:alik@sun.com-20100203172258-1n5dsotny40yufxw) (version source revid:alik@sun.com-20100130182921-uva9w0cxpxqeylbf) (merge vers: 5.5.2-m2) (pib:16)
[10 Feb 2010 19:37] Paul Dubois
Noted in 5.1.44, 5.5.2, 6.0.14 changelogs.

Creating or dropping a TEMPORARY table with 1023 transactions active
caused an assertion failure. 

Setting report to Need Merge pending push to Celosia.
[10 Feb 2010 19:39] Paul Dubois
Remove "TEMPORARY" from preceding note.
[12 Mar 2010 14:20] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:34] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:51] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[12 Mar 2010 16:40] Paul Dubois
Fixed in earlier 5.1.x, 5.5.x.
[6 Apr 2010 8:01] Bugs System
Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:svoj@sun.com-20100401151005-c6re90vdvutln15d) (merge vers: 5.1.46) (pib:16)
[6 Apr 2010 13:43] Jon Stephens
Closed, per Paul's comments above.
[8 Apr 2010 14:56] Mark Callaghan
This crash isn't specific to create temporary table. It took out many servers doing an ALTER today.

0x578ddd handle_segfault + 733
0x6f003b row_drop_table_for_mysql + 3835
0x65569a _ZN11ha_innobase12delete_tableEPKc + 250
0x6437a4 _Z15ha_delete_tableP3THD7db_typePKcS3_b + 180
0x658c2e _Z14quick_rm_table7db_typePKcS1_ + 142
0x65f590 _Z17mysql_alter_tableP3THDPcS1_P24st_ha_create_informationP10TABLE_LISTP10Alter_infojP8st_orderb + 8976
0x59aeaa _Z21mysql_execute_commandP3THDPy + 23018
0x59b6d5 _Z11mysql_parseP3THDPKcjPS2_Py + 469
0x59c78d _Z16dispatch_command19enum_server_commandP3THDPcj + 3837
0x59da78 handle_one_connection + 2296
0x3c7ac0610a _end + 2003364562

The error log has:
* lots of messages about "cannot find a free slot for an undo log. Do you have too many active transactions running concurrently"
* one message like
Assertion failure ... in file row0mysql.c line 3339
Failing assertion: err == DB_OUT_OF_FILE_SPACE
[8 Apr 2010 15:17] Harrison Fisk
Is there any reason that this wasn't fixed in 5.0?  

The fix seems pretty small and safe.  It is a crashing bug so it seems like an ideal candidate to be done in 5.0.
[23 Apr 2010 23:15] Jimmy Yang
The issue is verified in 5.0 with debugger:

===================
Breakpoint 1, trx_undo_seg_create (rseg=0x2aaaaacafeb8,
    rseg_hdr=0x2aaaaaef4026 "ÿÿÿþ", type=2, id=0x416c0ff8,
    undo_page=0x416c0ff0, mtr=0x416c1050) at trx0undo.c:413
413             if (slot_no == ULINT_UNDEFINED) {
(gdb) jump 414
Continuing at 0x7bb33a.

Program received signal SIGSEGV, Segmentation fault.
0x00000000007870f0 in row_drop_table_for_mysql (name=0x416c2cf0 "test/aa",
    trx=0x2aaaaacbc8b8, drop_db=0) at row0mysql.c:3339
3339                    ut_a(err == DB_OUT_OF_FILE_SPACE);
(gdb) p err
$1 = 47

Program received signal SIGSEGV, Segmentation fault.
0x00000000007870f0 in row_drop_table_for_mysql (name=0x416c2cf0 "test/aa",
    trx=0x2aaaaacbc8b8, drop_db=0) at row0mysql.c:3339
3339                    ut_a(err == DB_OUT_OF_FILE_SPACE);
================================

error 47 is DB_TOO_MANY_CONCURRENT_TRXS.

Fix also verified. With the fix, no assertion failure, no server crash, following message printed:

100423 15:54:07InnoDB: Warning: cannot find a free slot for an undo log. Do you have too InnoDB: many active transactions running concurrently?

Thanks
Jimmy
[28 Apr 2010 4:17] James Day
The fix for this is in the InnoDB Plugin version 1.0.7 which was included with MySQL 5.1.46.
[29 Apr 2010 13:30] Heikki Tuuri
Ok to commit to 5.0. We want to remove crashing bugs like this from the 'old stable' version 5.0.
[29 Apr 2010 17:47] Jimmy Yang
Ported the fix to 5.0:

r7009 | jyang | 2010-04-29 10:44:56 -0700 (Thu, 29 Apr 2010) | 6 lines

branches/5.0: Port fix for bug #49238 (Creating/Dropping a temporary
table while at 1023 transactions will cause assert) from 5.1 to
branches/5.1. Separate action for return value DB_TOO_MANY_CONCURRENT_TRXS
from that of DB_MUST_GET_MORE_FILE_SPACE in row_drop_table_for_mysql().
[5 May 2010 15:24] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:14] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:42] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:10] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[15 Jun 2010 8:22] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:39] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[17 Jun 2010 12:21] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:08] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:49] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[7 Jul 2010 17:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113054

2878 Vasil Dimov	2010-07-07
      Merge the fix for Bug#49238 from SVN
      (without the unrelated whitespace changes):
      
        ------------------------------------------------------------------------
        r7009 | jyang | 2010-04-29 20:44:56 +0300 (Thu, 29 Apr 2010) | 6 lines
        
        branches/5.0: Port fix for bug #49238 (Creating/Dropping a temporary
        table while at 1023 transactions will cause assert) from 5.1 to
        branches/5.1. Separate action for return value DB_TOO_MANY_CONCURRENT_TRXS
        from that of DB_MUST_GET_MORE_FILE_SPACE in row_drop_table_for_mysql().
        
        
        ------------------------------------------------------------------------
[14 Jul 2010 7:15] Calvin Sun
The fix is committed into mysql-5.0-bugteam tree.
[23 Jul 2010 12:28] Bugs System
Pushed into mysql-trunk 5.5.6-m3 (revid:alik@sun.com-20100723121820-jryu2fuw3pc53q9w) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:18)
[23 Jul 2010 12:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100723121929-90e9zemk3jkr2ocy) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (pib:18)
[2 Aug 2010 7:50] Bugs System
Pushed into 5.0.92 (revid:georgi.kodinov@oracle.com-20100802074824-5201e4ppst9t3yqt) (version source revid:georgi.kodinov@oracle.com-20100721154924-kvviyf6df2vy7oug) (merge vers: 5.0.92) (pib:18)
[4 Aug 2010 16:57] John Russell
Added to the 5.0.92 change log.
[19 Aug 2010 15:42] Bugs System
Pushed into mysql-5.1 5.1.51 (revid:build@mysql.com-20100819151858-muaaor6jojb5ouzj) (version source revid:build@mysql.com-20100819151858-muaaor6jojb5ouzj) (merge vers: 5.1.51) (pib:20)
[14 Oct 2010 8:38] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (merge vers: 5.1.51-ndb-7.0.20) (pib:21)
[14 Oct 2010 8:53] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (merge vers: 5.1.51-ndb-6.3.39) (pib:21)
[14 Oct 2010 9:09] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (merge vers: 5.1.51-ndb-6.2.19) (pib:21)
[14 Oct 2010 13:54] Jon Stephens
Already documented in the 5.1.44 changelog; no additional changelog entries required. Set back to Closed state.