Bug #22819 SHOW INNODB STATUS crashes the server with an assertion failure under high load
Submitted: 29 Sep 2006 11:39 Modified: 18 Jun 2010 12:49
Reporter: Mark Leith Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:4.0.26,4.1.22 OS:Linux (Linux)
Assigned to: Sunny Bains
Tags: assertion failure, crash, innodb, show innodb status

[29 Sep 2006 11:39] Mark Leith
Description:
SHOW INNODB STATUS causes the server to crash under a high load - causing an assertion failure on !is_waiting.

InnoDB: !!!!!!!!!!!!!! UNIV_DEBUG switched on !!!!!!!!!!!!!!!
060928 14:36:05  InnoDB: Database was not shut down normally.
InnoDB: Starting recovery from log files...
InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 0 101898580
InnoDB: Doing recovery: scanned up to log sequence number 0 101921566
060928 14:36:05  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
060928 14:36:05  InnoDB: Flushing modified pages from the buffer pool...
060928 14:36:06  InnoDB: Started
./mysql-4.0.26/bin/mysqld: ready for connections.
Version: '4.0.26-debug'  socket: '/home/mark/mysql/mysql-4.0.26.sock'  port: 4026  Official MySQL-debug binary
060928 14:57:34InnoDB: Assertion failure in thread 1157110704 in file lock0lock.c line 4196
InnoDB: Failing assertion: !is_waiting
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=90
max_connections=100
threads_connected=54
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x45300018
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x44f8072c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8101df0
0xffffe420
(nil)
0x8261f7f
0x81a6b20
0x8173e2b
0x811037c
0x811395b
0x810f163
0x810ebbe
0x810e2da
0x4003faa7
0x40181c2e
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x8e268f8 = show innodb status
thd->thread_id=2034
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Pegasus:/home/mark/mysql # mysql-4.0.26/bin/resolve_stack_dump -s /tmp/mysqld.sym -n /tmp/mysqld.stack
0x8101df0 handle_segfault + 406
0xffffe420 _end + -138002624
(nil)
0x8261f7f lock_print_info + 504
0x81a6b20 srv_printf_innodb_monitor + 270
0x8173e2b _Z18innodb_show_statusP3THD + 209
0x811037c _Z21mysql_execute_commandv + 2188
0x811395b _Z11mysql_parseP3THDPcj + 181
0x810f163 _Z16dispatch_command19enum_server_commandP3THDPcj + 1433
0x810ebbe _Z10do_commandP3THD + 410
0x810e2da handle_one_connection + 602
0x4003faa7 _end + 936007111
0x40181c2e _end + 937326414 

How to repeat:
Found in testing for Bug#20358. 50 threads were loaded with a test program, each running the following set of statements in a loop:

1|create table if not exists csc11683 (id int not null auto_increment primary key,name varchar(200),index(name))engine=innodb|
1|insert into csc11683 (name) values ('?')|varchar ./words
1|show innodb status|
5|select id from csc11683 where name like '?%'|varchar ./words
5|explain select id, name from csc11683 where id=?|int
5|explain select name from csc11683 where name = '?'|varchar ./words
1|show innodb status|
1|create temporary table temp1(id int not null auto_increment primary key)engine=innodb|
1|insert into temp1 values ()|
1|drop temporary table temp1|
1|create table if not exists `test1`(id int)engine=innodb|
1|insert into test1 (select id from csc11683 order by rand() limit 5)|
5|explain select id, max(id), min(id) from test1 group by id|
1|create table ? (i int, j int, k varchar(10), key(i), key (j, k))engine=innodb|varchar ./words
1|show innodb status|
1|reconnect|

Each bind variable is fill from the words file in the program, for random hits. 

Trying to get a cut down test case for this, but logging it now so it does not get lost. The above loops run for a little while sometimes before a crash, or sometimes it can happen fairly quickly.

Full gdb bt full output to follow.

Suggested fix:
This has already been fixed within 4.1, as the !is_waiting assertion has been removed:

http://lists.mysql.com/internals/9391

Verified on current 4.1 and 5.0 that the test case runs fine.

Logging this for 4.0 however, as it's causing a crash.
[29 Sep 2006 11:40] Mark Leith
(gdb) bt full
#0  0xffffe410 in ?? ()
No symbol table info available.
#1  0x445a070c in ?? ()
No symbol table info available.
#2  0x0000000b in ?? ()
No symbol table info available.
#3  0x081a2217 in write_core (sig=1074014264) at stacktrace.c:220
No locals.
#4  0x08101e94 in handle_segfault (sig=1076452968) at mysqld.cc:1825
        thd = (class THD *) 0x8a33310
#5  <signal handler called>
No symbol table info available.
#6  0x082625cf in lock_table_queue_validate (table=0x40295e68) at lock0lock.c:4196
        lock = (ib_lock_t *) 0x89d59b0
        is_waiting = 1
#7  0x082639ac in lock_validate () at lock0lock.c:4429
        lock = (ib_lock_t *) 0x8c6eda8
        trx = (trx_t *) 0x8c6eb78
        limit = {high = 144913472, low = 144913472}
        space = 144913472
        page_no = 147254136
        i = 0
#8  0x08261f7f in lock_print_info (file=0x898c4b8) at lock0lock.c:4066
        lock = (ib_lock_t *) 0x0
        trx = (trx_t *) 0x0
        space = 144913472
        page_no = 144913472
        load_page_first = 1
        nth_trx = 38
        nth_lock = 0
        i = 0
        mtr = {state = 0, memo = {heap = 0x0, used = 1074026304, 
    data = "¸Ä\230\b\000\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000f\000\000\000\000\000\000\0006.3\b¸Ä\230\b\000\000\000\000\000\000\000\000~ì\232\027\205Êè@\000\000\000\000\000\000\000\000@W\004@¸Ä\230\b\000\000\000\000\000\000\000\000@W\004@¸Ä\230\b\000\000\000\000\000\000\000\000@W\004@¸Ä\230\b\000\000\000\000\000---Type <return> to continue, or q <return> to quit---
\000\000\000x\vZD¨8)\b\033\000\000\000\002\000\000\000d\000\000\000¼\vZD°\vZDb\020ZD`\020ZD", '\0' <repeats 12 times>, "¨\vZD¨8)\b\032\000\000\000è\211R\b¸\vZD¨8)\b\032\000\000\000è\211R\b´\vZD\b\213R\bà\2026\bh\a"..., base = {count = 136919208, start = 0x13, end = 0x85289e8}, list = {prev = 0x917fe38, next = 0x8528ab4}, 
    buf_end = 0, magic_n = 17}, log = {heap = 0x445a0d28, used = 136919208, 
    data = "\022\000\000\000è\211R\b8\rZD¨8)\b\022\000\000\000è\211R\b¨y\022\t¨\212R\b\000\000\000\000\020\000\000\000X\rZD¨8)\b\021\000\000\000è\211R\bh\rZD¨8)\b\021\000\000\000è\211R\bÈ® \b\234\212R\b\000\000\000\000\017\000\000\000\210\rZD¨8)\b\020\000\000\000è\211R\b\230\rZD¨8)\b\020\000\000\000è\211R\b\220\rZD\220\212R\b\000\000\000\000\016\000\000\000¸\rZD¨8)\b\017\000\000\000è\211R\bÈ\rZD¨8)\b\017\000\000\000è\211R\b\030\016ZD\204\212R\b\000\000\000\000\r\000\000\000è\rZD¨8)\b\016\000\000\000è\211R\b"..., base = {count = 137992132, start = 0x835b093, end = 0x269}, list = {prev = 0x0, next = 0x0}, buf_end = 0, magic_n = 1146752824}, 
  modifications = 136849326, n_log_recs = 137992132, log_mode = 1146756016, start_lsn = {high = 144913472, low = 1146752856}, end_lsn = {high = 1074903665, 
    low = 137992128}, magic_n = 1146752856}
#9  0x081a6b20 in srv_printf_innodb_monitor (file=0x898c4b8) at srv0srv.c:1475
        time_elapsed = 52.000999999999998
        current_time = 0
        n_reserved = 0
#10 0x08173e2b in innodb_show_status (thd=0x8a33310) at ha_innodb.cc:4824
        packet = (class String *) 0x8a337bc
        trx = (trx_t *) 0x0
        _db_func_ = 0x0
        _db_file_ = 0x8377088 "\"£\017\b^£\017\bþ"
        _db_level_ = 0
        _db_framep_ = (char **) 0x0
        flen = 149543992
        str = 0x9221318 "\003insert into csc11683 (name) values ('discriminants')"
        field_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x8ca07b0, last = 0x8ca07b0, elements = 1}, <No data fields>}
#11 0x0811037c in mysql_execute_command () at sql_parse.cc:1598
        res = 0
        thd = (class THD *) 0x8a33310
        lex = (LEX *) 0x8a33440
        tables = (TABLE_LIST *) 0x0
        select_lex = (SELECT_LEX *) 0x8a3344c
        slave_fake_lock = false
        fake_prev_lock = (MYSQL_LOCK *) 0x0
        _db_func_ = 0x4012a115 "\201Ãß.\v"
        _db_file_ = 0x48 <Address 0x48 out of bounds>
        _db_level_ = 1075701816
---Type <return> to continue, or q <return> to quit---
        _db_framep_ = (char **) 0x92e5f68
#12 0x0811395b in mysql_parse (thd=0x8a33310, inBuf=0x8a33440 "\001", length=18) at sql_parse.cc:3087
        lex = (LEX *) 0x8a33440
        _db_func_ = 0x2000 <Address 0x2000 out of bounds>
        _db_file_ = 0x12 <Address 0x12 out of bounds>
        _db_level_ = 144913168
        _db_framep_ = (char **) 0x445a1338
#13 0x0810f163 in dispatch_command (command=COM_QUERY, thd=0x8a33310, packet=0x8d37401 "show innodb status", packet_length=18) at sql_parse.cc:1089
        pos = 0x0
        net = (NET *) 0x8a3331c
        error = false
        slow_command = false
        _db_func_ = 0x2 <Address 0x2 out of bounds>
        _db_file_ = 0x445a1210 "\aP\b"
        _db_level_ = 0
        _db_framep_ = (char **) 0x393e20
        start_of_query = 8192
#14 0x0810ebbe in do_command (thd=0x8a33310) at sql_parse.cc:959
        packet = 0x8d37400 "\003show innodb status"
        old_timeout = 30
        packet_length = 19
        net = (NET *) 0x8a3331c
        command = COM_QUERY
        _db_func_ = 0x80e7542 "ÇC\034ôt\016\b\203Ä\020\213]üÉÃ\220U\211å\203ì\024ÿ5L=9\bè\rVýÿ\203Ä\bÿu\bPèÐ\230\037"
        _db_file_ = 0x8a33aa8 "\bÑç\b"
        _db_level_ = 8192
        _db_framep_ = (char **) 0x1000
#15 0x0810e2da in handle_one_connection (arg=0x0) at sql_parse.cc:743
        error = 1075701652
        net = (NET *) 0x8a3331c
        thd = (class THD *) 0x8a33310
        launch_time = 0
        set = {__val = {0 <repeats 32 times>}}
#16 0x4003faa7 in pthread_create () from /lib/tls/libpthread.so.0
---Type <return> to continue, or q <return> to quit---
No symbol table info available.
#17 0x40181c2e in clone () from /lib/tls/libc.so.6
No symbol table info available.
[29 Sep 2006 11:49] Heikki Tuuri
Hmm... maybe we have too strict a debug assertion there.
--Heikki
[30 Sep 2006 15:31] Shane Bester
testcase....

Attachment: testcase.c (text/x-csrc), 23.08 KiB.

[30 Sep 2006 15:42] Shane Bester
the testcase.c crashed my 4.1BK also.

Attachment: stack.4.1.22.txt (text/plain), 4.03 KiB.

[2 Oct 2006 6:49] Shane Bester
currently, i'm only able to get the debug build of 4.1 to crash. non-debug seems not to crash.
[16 Oct 2006 13:36] Elliot Murphy
crash is P1
[8 Nov 2006 15:18] Heikki Tuuri
Hi!

The assertion is checked in the debug version. The assertion simply is too strong. It is well possible that there is a waiting lock request in the queue (for example, on record R1) while there is a granted lock request behind it (for example, on the gap before R1).

Sunny, please remove the is_waiting code from the function below in 5.0 and 5.1.

Regards,

Heikki

lock0lock.c in 5.0:
/*************************************************************************
Validates the lock queue on a table. */

ibool
lock_table_queue_validate(
/*======================*/
                                /* out: TRUE if ok */
        dict_table_t*   table)  /* in: table */
{
        lock_t* lock;
        ibool   is_waiting;

#ifdef UNIV_SYNC_DEBUG
        ut_ad(mutex_own(&kernel_mutex));
#endif /* UNIV_SYNC_DEBUG */

        is_waiting = FALSE;

        lock = UT_LIST_GET_FIRST(table->locks);

        while (lock) {
                ut_a(((lock->trx)->conc_state == TRX_ACTIVE)
                     || ((lock->trx)->conc_state == TRX_PREPARED)
                     || ((lock->trx)->conc_state == TRX_COMMITTED_IN_MEMORY));

                if (!lock_get_wait(lock)) {

                        ut_a(!is_waiting);

                        ut_a(!lock_table_other_has_incompatible(lock->trx, 0,
                                                table, lock_get_mode(lock)));
                } else {
                        is_waiting = TRUE;

                        ut_a(lock_table_has_to_wait_in_queue(lock));
                }

                lock = UT_LIST_GET_NEXT(un_member.tab_lock.locks, lock);
        }

        return(TRUE);
}
[12 Jul 2007 17:55] Timothy Smith
Queued to 5.0- and 5.1-maint team tree(s)
[19 Jul 2007 15:48] Bugs System
Pushed into 5.1.21-beta
[19 Jul 2007 15:50] Bugs System
Pushed into 5.0.48
[22 Jul 2007 18:12] Paul Dubois
Noted in 5.0.48, 5.1.21 changelogs.

SHOW INNODB STATUS caused an assertion failure under high load.
[5 May 2010 15:22] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:27] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 6:04] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:32] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 7:00] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:49] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:08] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:54] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:35] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)