Bug #23176 Falcon hangs on update key sysbench test with 256 threads
Submitted: 11 Oct 2006 14:44 Modified: 24 Oct 2006 21:33
Reporter: Patrick Galbraith Email Updates:
Status: Closed Impact on me:
Category:MySQL Server: Falcon storage engine Severity:S3 (Non-critical)
Version: OS:Any
Assigned to: CPU Architecture:Any

[11 Oct 2006 14:44] Patrick Galbraith
Running the latest sysbench (I will upgrade on fluffy):

/usr/local/bin/sysbench --test=oltp --mysql-engine-trx=yes --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=update_key --max-requests=0 --oltp-dist-type=special --max-time=600 --mysql-table-engine=falcon --mysql-user=root --num-threads=4 --max-time=180 --mysql-port=5555 --mysql-socket=/tmp/falc.sock --mysql-db=sbtest  run

This test, in the previous version of sysbench to fail and give an error of "Err1020 Record has changed since last read" while running updates.

Alexey produced a new version of sysbench which retries the failed update. It now fails with the same error, but hangs the server. If you try to run:

mysql> update sbtest set k=k+1 where id = 1;

The window hangs.

Looking at it with gdb, this is what I saw:

(gdb) info threads
  12 Thread 1074006352 (LWP 7808)  0x000000390fe0cd28 in do_sigwait () from /lib64/libpthread.so.0
  11 Thread 1074272592 (LWP 7852)  0x000000390fe0be0b in __read_nocancel () from /lib64/libpthread.so.0
  10 Thread 1084762448 (LWP 7853)  0x000000390fe09807 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9 Thread 1095252304 (LWP 7854)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8 Thread 1105742160 (LWP 7855)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 1116232016 (LWP 7856)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 1126721872 (LWP 7857)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5 Thread 1126988112 (LWP 7876)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 1127254352 (LWP 7877)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3 Thread 1127520592 (LWP 7878)  0x000000390fe0ba08 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
  2 Thread 1127786832 (LWP 7879)  0x000000390fe09616 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  1 Thread 47663575081168 (LWP 7807)  0x000000390e5c4c12 in __select_nocancel () from /lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 1127520592 (LWP 7878))]#0  0x000000390fe0ba08 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
(gdb) where
#0  0x000000390fe0ba08 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1  0x000000390fe08309 in _L_mutex_lock_62 () from /lib64/libpthread.so.0
#2  0x000000390fe080ec in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000000087530b in Mutex::lock (this=0x2aaaaab10618) at Mutex.cpp:78
#4  0x00000000008a8c66 in SerialLog::flush (this=0x2aaaaab105a0, forceNewWindow=false, commitBlockNumber=231335) at SerialLog.cpp:370
#5  0x00000000008be91c in SRLRollback::append (this=0x2aaaaab10cc8, transId=1065, updateTransaction=true) at SRLRollback.cpp:48
#6  0x00000000008514bf in Dbb::rollback (this=0x2b598a763bd0, transId=1065, updateTransaction=true) at Dbb.cpp:843
#7  0x0000000000847437 in Database::rollback (this=0x2aaaaaaab048, transaction=0x2b598a7977b8) at Database.cpp:1978
#8  0x0000000000829ea2 in Transaction::rollback (this=0x2b598a7977b8) at Transaction.cpp:227
#9  0x00000000008416dc in Connection::rollback (this=0x2b598a795e78) at Connection.cpp:277
#10 0x0000000000812a9d in StorageConnection::rollback (this=0x2b598a7800b0) at StorageConnection.cpp:265
#11 0x000000000080ff61 in NfsStorageTable::rollback (thd=0x109cb20, all=false) at ha_falcon.cpp:641
#12 0x00000000007550ea in ha_rollback_trans (thd=0x109cb20, all=false) at handler.cc:795
#13 0x0000000000755707 in ha_autocommit_or_rollback (thd=0x109cb20, error=1) at handler.cc:852
#14 0x00000000006f7f02 in mysql_update (thd=0x109cb20, table_list=0x10a99c0, fields=@0x109d0e8, values=@0x109d480, conds=0x10aa190, order_num=0,
    order=0x0, limit=18446744073709551615, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:670
#15 0x000000000067ba0b in mysql_execute_command (thd=0x109cb20) at sql_parse.cc:3320
#16 0x0000000000681d1c in mysql_parse (thd=0x109cb20, inBuf=0x10a9900 "UPDATE sbtest set k=k+1 where id=5012", length=37) at sql_parse.cc:6062
#17 0x000000000068273b in dispatch_command (command=COM_QUERY, thd=0x109cb20, packet=0x10a16d1 "UPDATE sbtest set k=k+1 where id=5012",
    packet_length=38) at sql_parse.cc:1828
#18 0x0000000000683cc5 in do_command (thd=0x109cb20) at sql_parse.cc:1612
#19 0x00000000006840a1 in handle_one_connection (arg=0x109cb20) at sql_parse.cc:1227
#20 0x000000390fe062ea in start_thread () from /lib64/libpthread.so.0
#21 0x000000390e5cb73d in clone () from /lib64/libc.so.6
#22 0x0000000000000000 in ?? ()

How to repeat:
1. Start mysqld 
2. Run sysbench prepare:

sysbench --test=oltp --oltp-table-size=1000000 --mysql-engine-trx=yes --mysql-socket=/tmp/patg_falc.sock --mysql-port=5555 --mysql-user=root  --mysql-db=sbtest prepare

(note, you might not need to specify port and socket, so omit if you start on standard port/socket)

3. Run sysbench update_key test

sysbench --test=oltp --mysql-engine-trx=yes --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=update_key --max-requests=0 --oltp-dist-type=special --max-time=600 --mysql-table-engine=falcon --mysql-user=root --num-threads=4 --max-time=180 --mysql-port=5555 --mysql-socket=/tmp/falc.sock --mysql-db=sbtest  run

I will make sure sysbench is up to date on fluffy.
[11 Oct 2006 14:47] Patrick Galbraith
One other important note:

When you start mysql, please start with --default-storage-engine=falcon.

Often when running sysbench prepare then a subsequent test, it runs great because the table was created as myisam.
[11 Oct 2006 22:25] Jim Starkey
It was a simple deadlock in the Serial Log when logging a rollback.  The commit code had already been fixed.

Sysbench is now running, but reporting index cleanup failures.  I'm still trying to figure what's happening there.
[12 Oct 2006 0:55] Patrick Galbraith
Test passes now:

[patg@buffy mysql-5.1-falcon]$ /usr/local/bin/sysbench --test=oltp --mysql-engine-trx=yes --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=update_key --max-requests=0 --oltp-dist-type=special --max-time=600 --mysql-table-engine=falcon --mysql-user=root --num-threads=4 --max-time=180 --mysql-port=5555 --mysql-socket=/tmp/falc.sock --mysql-db=sbtest run
sysbench v0.4.8:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 4

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
Time limit exceeded, exiting...
(last message repeated 3 times)

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           166711
        other:                           0
        total:                           166711
    transactions:                        161274 (895.82 per sec.)
    deadlocks:                           5437   (30.20 per sec.)
    read/write requests:                 166711 (926.02 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          180.0303s
    total number of events:              161274
    total time taken by event execution: 719.2444
    per-request statistics:
         min:                            0.0003s
         avg:                            0.0045s
         max:                            31.4631s
         approx.  95 percentile:         0.0277s

Threads fairness:
    events (avg/stddev):           40318.5000/427.53
    execution time (avg/stddev):   179.8111/0.01
[12 Oct 2006 15:37] Patrick Galbraith
Ok, sorry to change the synopsis a little bit and re-open, but I get a failure at 256 threads with this test.

(gdb) bt
#0  0x000000390fe0ce1d in raise () from /lib64/libpthread.so.0
#1  0x000000000085a958 in Error::debugBreak () at Error.cpp:93
#2  0x000000000085aa71 in Error::error (string=0xc0b008 "assertion failed at line %d in file %s\n") at Error.cpp:70
#3  0x000000000085ab07 in Error::assertionFailed (fileName=0xc10040 "Record.cpp", line=122) at Error.cpp:77
#4  0x000000000088e2ed in ~Record (this=0x2b9b55266c90) at Record.cpp:122
#5  0x000000000088cc50 in Record::release (this=0x2b9b55266c90) at Record.cpp:499
#6  0x00000000008174f6 in StorageTable::clearRecord (this=0x2aaaaad904d0) at StorageTable.cpp:289
#7  0x00000000008103bf in NfsStorageTable::external_lock (this=0x1e31b80, thd=0x127bc70, lock_type=2) at ha_falcon.cpp:1120
#8  0x0000000000750d5e in handler::ha_external_lock (this=0x1e31b80, thd=0x127bc70, lock_type=2) at handler.cc:3592
#9  0x0000000000657604 in unlock_external (thd=0x127bc70, table=0x1ed1198, count=1) at lock.cc:641
#10 0x00000000006578a3 in mysql_unlock_tables (thd=0x127bc70, sql_lock=0x1ed1170) at lock.cc:271
#11 0x00000000006f7fcf in mysql_update (thd=0x127bc70, table_list=0x23a69a0, fields=@0x23a3da8, values=@0x23a4140, conds=0x23a75b0, order_num=0,
    order=0x0, limit=18446744073709551614, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:676
#12 0x000000000067ba9b in mysql_execute_command (thd=0x127bc70) at sql_parse.cc:3320
#13 0x00000000006f0796 in Prepared_statement::execute (this=0x23a37e0, expanded_query=0x43ae6590, open_cursor=false) at sql_prepare.cc:2965
#14 0x00000000006f0b4f in mysql_stmt_execute (thd=0x127bc70, packet_arg=0x12808a1 "\002", packet_length=18) at sql_prepare.cc:2291
#15 0x000000000068266f in dispatch_command (command=COM_STMT_EXECUTE, thd=0x127bc70, packet=0x12808a1 "\002", packet_length=18) at sql_parse.cc:1789
#16 0x0000000000683d55 in do_command (thd=0x127bc70) at sql_parse.cc:1612
#17 0x0000000000684131 in handle_one_connection (arg=0x127bc70) at sql_parse.cc:1227
#18 0x000000390fe062ea in start_thread () from /lib64/libpthread.so.0
#19 0x000000390e5cb73d in clone () from /lib64/libc.so.6
#20 0x0000000000000000 in ?? ()

Steps to reproduce:
1. Make sure to start mysqld as normal, but with --max-connections=256

2. prepare
sysbench --test=oltp --oltp-table-size=1000000 --mysql-engine-trx=yes --mysql-socket=/tmp/falc.sock --mysql-port=5555 --mysql-user=root --mysql-table-engine=falcon --mysql-db=sbtest prepare

3. run
sysbench --test=oltp --mysql-engine-trx=yes --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=update_key --max-requests=0 --oltp-dist-type=special --max-time=600 --mysql-table-engine=falcon --mysql-user=root --num-threads=256 --max-time=180 --mysql-port=5555 --mysql-socket=/tmp/falc.sock --mysql-db=sbtest run
[24 Oct 2006 21:33] Patrick Galbraith