Bug #40786 Maria crash in _lf_alloc_new
Submitted: 17 Nov 2008 14:21
Modified: 27 Dec 2008 8:41
Reporter: Philip Stoev
Status: Can't repeat
Category: MySQL Server: Maria storage engine
Severity: S1 (Critical)
Version: 6.0.8
OS: Any
Assigned to: Philip Stoev
CPU Architecture: Any

[17 Nov 2008 14:21] Philip Stoev
Description:
While running a concurrent workload (sysbench with 2000 concurrent users), Maria crashed with the following backtrace:

#0  0x0000000000a909d4 in _lf_alloc_new (pins=0x2074580) at lf_alloc-pin.c:513
#1  0x0000000000a8f91f in lf_hash_insert (hash=0x11e6780, pins=0x4c686c20, data=0xffffffffffffffff) at lf_hash.c:373
#2  0x0000000000a11334 in trnman_new_trn (wt=0x2aa02f8) at trnman.c:344
#3  0x0000000000a23f1c in ha_maria::external_lock (this=0x2aaab07e9498, thd=0x2a9eed0, lock_type=0) at ha_maria.cc:2189
#4  0x000000000073c1aa in handler::ha_external_lock (this=0x2074580, thd=0x4c686c20, lock_type=-1) at handler.cc:5298
#5  0x00000000006422ab in mysql_lock_tables (thd=0x2a9eed0, tables=<value optimized out>, count=<value optimized out>, flags=<value optimized out>,
    need_reopen=0x4c686e27) at lock.cc:411
#6  0x000000000068d832 in lock_tables (thd=0x2a9eed0, tables=0x0, count=<value optimized out>, flags=0, need_reopen=0x4c686e27) at sql_base.cc:4465
#7  0x0000000000694295 in open_and_lock_tables_derived (thd=0x2a9eed0, tables=0xab1fab8, derived=true, flags=0) at sql_base.cc:4166
#8  0x0000000000656b77 in execute_sqlcom_select (thd=0x2a9eed0, all_tables=0xab1fab8) at mysql_priv.h:1620
#9  0x000000000065a9de in mysql_execute_command (thd=0x2a9eed0) at sql_parse.cc:2066
#10 0x00000000006d4029 in Prepared_statement::execute (this=0xab1e5a0, expanded_query=<value optimized out>, open_cursor=false) at sql_prepare.cc:3578
#11 0x00000000006d6ffc in Prepared_statement::execute_loop (this=0xab1e5a0, expanded_query=0x4c688470, open_cursor=false, packet=<value optimized out>,
    packet_end=<value optimized out>) at sql_prepare.cc:3245
#12 0x00000000006d76a5 in mysql_stmt_execute (thd=0x2a9eed0, packet_arg=0x2aa2341 "\001", packet_length=15) at sql_prepare.cc:2467
#13 0x000000000065febe in dispatch_command (command=COM_STMT_EXECUTE, thd=0x2a9eed0, packet=0x2aa2341 "\001", packet_length=12) at sql_parse.cc:960
#14 0x0000000000652a72 in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1156
#15 0x00002b9c2c2d0143 in start_thread () from /lib64/libpthread.so.0
#16 0x00002b9c2cb4674d in clone () from /lib64/libc.so.6

How to repeat:
If this is repeatable, a test case will be provided.
[17 Nov 2008 14:43] Philip Stoev
To reproduce, please use "classic" sysbench 0.4.8 from http://sourceforge.net/projects/sysbench and run:

./sysbench \
  --test=oltp \
  --mysql-host=127.0.0.1 \
  --mysql-port=9306 \
  --mysql-user=sb_user \
  --mysql-db=sb_db \
  --mysql-table-engine=maria \
  --oltp-table-size=100000 \
  --mysql-engine-trx=yes prepare

./sysbench \
  --test=oltp \
  --mysql-host=127.0.0.1 \
  --mysql-port=9306 \
  --mysql-user=sb_user \
  --mysql-db=sb_db \
  --mysql-table-engine=maria \
  --max-requests=0 \
  --max-time=1800 \
  --num-threads=200 \
  --mysql-engine-trx=yes run

Note that 200 concurrent users are being simulated here.
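Since the original crash was seen with 2000 users while the commands above use 200 threads, it may help to make the thread count a parameter. The following is only a sketch: `sysbench_run_cmd` is a hypothetical helper name, not part of sysbench, and it simply prints the same "run" command line as above with a different `--num-threads` value.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysbench): print the "run" command line
# from this report with a configurable thread count (default: 200).
sysbench_run_cmd() {
  threads="${1:-200}"
  # printf repeats the '%s ' format for every argument, giving one line
  printf '%s ' ./sysbench \
    --test=oltp \
    --mysql-host=127.0.0.1 \
    --mysql-port=9306 \
    --mysql-user=sb_user \
    --mysql-db=sb_db \
    --mysql-table-engine=maria \
    --max-requests=0 \
    --max-time=1800 \
    "--num-threads=${threads}" \
    --mysql-engine-trx=yes run
  printf '\n'
}

# Print the command for 2000 threads so it can be reviewed before running it
sysbench_run_cmd 2000
```

The helper only prints the command; the server's max-connections setting would need to be at least as high as the thread count chosen.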
[19 Nov 2008 15:32] Guilhem Bichot
I installed the latest sysbench as indicated, then ran the indicated command lines.
In sysbench's output, I saw a few:
ALERT: failed to execute mysql_stmt_execute(): Err1062 Duplicate entry '5038' for key 'PRIMARY'
FATAL: database error, exiting...

but there was no crash, and sysbench ended by printing "Done".
I tested mysql-maria and mysql-6.0-maria, starting the server with:
./mtr maria --start-and-exit --mysqld=--max-connections=3000 --gdb
and creating the database sb_db with a CREATE DATABASE statement. I used user "root" instead of "sb_user" for the sysbench command lines.
Additionally, for mysql-maria, the error log contains a few occurrences of:
Warning: found too many locks at write_wait: enter write_lock

I repeated the same test for 6.0-maria but without --gdb, still no problem.
The table at the end of the test looks like this:
-rw-rw---- 1 guilhem users 7618560 2008-11-19 16:29 sbtest.MAD
-rw-rw---- 1 guilhem users 1826816 2008-11-19 16:29 sbtest.MAI
and contains 100000 records.

Philip, I believe I need access to your problematic machine. It could be a 64-bit issue (I use 32-bit Linux), an OS issue, or a compiler issue (some atomic-operations code in Maria is sensitive to gcc bugs). If I can't repeat it, I won't be able to fix it.
I was using the latest bzr trees. Compared to 6.0.8 there are a few more revisions; maybe it would be worth retrying with the latest 6.0-maria?
[19 Nov 2008 15:59] Guilhem Bichot
Please also let me know what build script or build options you used (debug build, compiler options, etc.).
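Part of the requested information (OS, CPU architecture, compiler version) can be gathered with standard tools. The sketch below assumes a Unix-like system with gcc as the compiler; `build_env_report` is a hypothetical helper name, and the build script or configure options still have to be supplied by hand.

```shell
#!/bin/sh
# Sketch: collect OS, CPU architecture and compiler details for a bug report.
# Assumes gcc was the compiler used; adjust if the server was built differently.
build_env_report() {
  echo "os: $(uname -s) $(uname -r)"
  echo "arch: $(uname -m)"
  if command -v gcc >/dev/null 2>&1; then
    # First line of "gcc --version" carries the version string
    echo "compiler: $(gcc --version | head -n 1)"
  else
    echo "compiler: gcc not found in PATH"
  fi
}

build_env_report
```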
[20 Dec 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[27 Dec 2008 8:41] Philip Stoev
No longer repeatable with 6.0.9