Bug #15868 InnoDB hangs under 200 threads making INSERTs: 'thread thrashing'
Submitted: 19 Dec 2005 20:51 Modified: 31 Aug 2007 1:23
Reporter: Vadim Tkachenko Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:4.1, 5.0 OS:Linux (Linux)
Assigned to: Paul DuBois CPU Architecture:Any

[19 Dec 2005 20:51] Vadim Tkachenko
Description:
InnoDB hangs and crashed under Sysbench Insert benchmarks with 2048 threads.

there is crash in code

        if (sync_array_print_long_waits()) {
                fatal_cnt++;
                if (fatal_cnt > 5) {

                        fprintf(stderr,
"InnoDB: Error: semaphore wait has lasted > %lu seconds\n"
"InnoDB: We intentionally crash the server, because it appears to be hung.\n",
                                srv_fatal_semaphore_wait_threshold);

                        ut_error;
                }
        } else {
                fatal_cnt = 0;
        }

How to repeat:
sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --
oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=2048  --mysql-db=sbtest run

Suggested fix:
Obviously there is a problem with sync_array.
I'd test new version of Osku's patch, I'm waiting for fix of patch.
[20 Dec 2005 8:41] Heikki Tuuri
Vadim,

please post what mysqld prints to the .err log.

Heikki
[20 Dec 2005 21:08] Vadim Tkachenko
Full .err file takes 32MB I cannot post it.
Here is the end of output:

InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
051220 22:05:19InnoDB: Assertion failure in thread 2921647024 in file srv0srv.c line 2080
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
InnoDB: Thread 2642688944 stopped in file sync0arr.c line 126
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=2048
max_connections=3000
threads_connected=2048
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 584168 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xae24c74c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81622ec
0xb75a8d28
(nil)
0xb74d9e8a
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Reso
lved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
[20 Dec 2005 21:53] Heikki Tuuri
Vadim,

attach it as gzipped.

Heikki
[21 Dec 2005 7:26] Vadim Tkachenko
Here is link to gzipped error log:
http://mysql.apachephp.com/files/err.log.crash.gz
[4 Jan 2006 16:27] Heikki Tuuri
Assigning this to Osku.
[16 Jan 2006 9:47] Osku Salerma
I tried testing this with my latest patches to sync array on a dual-CPU Xeon 3.2 GHz / 6 GB memory box, but I have problems running the sysbench commands given in this bug. First, 2048 threads are too much for some reason, it fails with weird "out of memory" errors. Limiting the threads to 250 fails with this:

ALERT: failed to execute mysql_stmt_execute(): Err1062 Duplicate entry '0' for key 1
FATAL: database error, exiting...
[16 Jan 2006 10:04] Vadim Tkachenko
Osku,

you can try thread_stack=128KB for mysql
and ulimit -s 128 for sysbench to increase memory.

and you should prepare benchmark before run by command:
sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --mysql-db=sbtest prepare
[16 Jan 2006 10:38] Osku Salerma
Vadim,

I did run the prepare command (it refuses to run if you haven't). Though I used the exact same command as for run, only substituting "prepare" for "run".
[16 Jan 2006 11:33] Vadim Tkachenko
Osku,

please try this sysbench:
http://mysql.apachephp.com/files/sysbench-0.4.4-sp.tar.gz
[17 Jan 2006 11:47] Osku Salerma
sysbench 0.4.4 starts with 250 threads, but eventually the server dies with:

Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: start of release lock
Warning: found too many locks at write: after releasing lock
Warning: found too many locks at write: thr_unlock
Warning: found too many locks at write: start of release lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
safe_mutex: Trying to lock unitialized mutex at os0sync.c, line 602
mysqld got signal 6;

I'm not sure if InnoDB should even be using the safe_mutex functions from MySQL...I will investigate this.
[17 Jan 2006 13:43] Vadim Tkachenko
Osku,

You are using debug version, I did not test it.
With debug version you will have absolutely diffferent results.
[17 Jan 2006 16:44] Heikki Tuuri
Osku,

can you run sysbench inside gdb, so that we would know where that uninitialized mutex happens?

Hmm... there might a a bug in InnoDB in creating trx objects or trx_undo objects with many concurrent transactions. Maybe in some case an InnoDB mutex object is not initialized though InnoDB uses it, or InnoDB uses a freed mutex object.

Regards,

Heikki
[19 Jan 2006 13:44] Heikki Tuuri
Hi!

The unitilialized mutex error was probably due to the sync0arr.c bug in ...signal_object() that I fixed last night. It tried to set an event in a garbage pointer which involves locking an OS mutex at a random location in memory.

Vadim's original crash, however, should be studied. I have never tested InnoDB with 2000 threads.

Regards,

Heikki
[15 Feb 2006 15:45] Ady Wicaksono
I have the same cases, i use only 200 thread... and yes... this bug is ignoring me since I upgrade MySQL4 to MySQL5.....
I use MySQL 5.0.18 Max ... RedHat 9 Kernel Linux 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux. RAM 2 Gbytes.

I Run 

# ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=test --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=200  --mysql-db=test --mysql-user=root  --mysql-password=simbah65infokom run
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 200

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

From processlist, many INSERT/UPDATE hang

mysql> show processlist;
+-----+------+-----------------+------+---------+------+--------+----------------------------------+
| Id  | User | Host            | db   | Command | Time | State  | Info                             |
+-----+------+-----------------+------+---------+------+--------+----------------------------------+
|   3 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   4 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   5 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   6 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   7 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   8 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|   9 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  10 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  11 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  12 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  13 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  14 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  15 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  16 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  17 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  18 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  19 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  20 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  21 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  22 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  23 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  24 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  25 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  26 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  27 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  28 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  29 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |
|  30 | root | localhost       | test | Execute |  365 | update | INSERT INTO test values(?,?,?,?) |

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 1598035, signal count 95217
--Thread 2250521152 has waited at btr0cur.c line 359 for 220.00 seconds the semaphore:
X-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2232825664 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2217884992 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2250324672 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2223582912 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2227516608 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
Last time write locked in file btr0cur.c line 359
--Thread 2249538752 has waited at btr0cur.c line 366 for 220.00 seconds the semaphore:
S-lock on RW-latch at 0x4615e4bc created in file dict0dict.c line 3654
a writer (thread id 2252289472) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file btr0cur.c line 366
[15 Feb 2006 16:16] Ady Wicaksono
FYI, I just tried in MySQL 4.1.18, and found that MySQL 4.1.18 is also affected by this bug
[15 Feb 2006 16:27] Ady Wicaksono
Below tho innodb status for MySQL 4.1.18 testing, after a long time hang 

InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
060215 23:22:02InnoDB: Assertion failure in thread 1788512832 in file srv0srv.c line 1899
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=402653184
read_buffer_size=12578816
max_used_connections=202
max_connections=910
threads_connected=202
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1782208 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
 where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x6a9a82ec, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x815b8f0
0x4004a618
(nil)
0x420de407
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace
. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
InnoDB: Thread 1780124352 stopped in file os0sync.c line 501
[15 Feb 2006 16:35] Ady Wicaksono
I also need to inform that, hang on MySQL 5 happened faster than MySQL 4, MySQL is just inserting a million of rows and then hang, but MySQL 4 could inserting billion of rows and then hang
[15 Feb 2006 19:47] Heikki Tuuri
Ady,

please attach the COMPLETE UNEDITED .err log to this bug report. Preferably from the 5.0.xx server.

Regards,

Heikki
[16 Feb 2006 3:17] Ady Wicaksono
Hi Heikki

Please find the log here http://202.147.198.67/bugSysBenchMySQL.txt.gz

BTW, why i can't upload file on "files" menu?

Thx
[16 Feb 2006 13:13] Heikki Tuuri
Ady,

in  both your and Vadim's reports the 'main thread' of InnoDB seems to have died or hung after a very small number of INSERTs, in your case only 200 000! The main thread is not waiting for any semaphore. It is just stuck somewhere.

You are using prepared statements (with the notation ?,?,?,?) in the test. Do you get a hang also with normal statements?

Regards,

Heikki
[16 Feb 2006 13:32] Heikki Tuuri
Ady,

a thread is apparently doing a B-tree page split:

--Thread 2252289472 has waited at fsp0fsp.c line 686 for 256.00 seconds the sem\
aphore:
X-lock on RW-latch at 0x40140410 created in file buf0buf.c line 485
a writer (thread id 2252289472) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1
Last time read locked in file buf0flu.c line 576
Last time write locked in file fsp0fsp.c line 686

But the associated buffer pool page most probably is latched by the main thread which is doing a flush:

Pending writes: LRU 0, flush list 9, single page 0

Main thread process no. 10176, id 1788045632, state: flushing buffer pool pages

The question is what makes the main thread vulnerable in the flush operation under a high concurrent load. Could it use so much stack that the stack gets overwritten by another thread.

Have you set thread stacks in some specific way?

Regards,

Heikki
[16 Feb 2006 13:37] Heikki Tuuri
I am raising the priority and editing the synopsis to reflect this better.
[17 Feb 2006 3:45] Ady Wicaksono
Heikki

Question:

You are using prepared statements (with the notation ?,?,?,?) in the test. Do
you get a hang also with normal statements?

Answer:

I haven't test. But, when i upgrade MySQL4 to MySQL5, my web application get a hang on high load. 

I'll try to test later using Apache Benchmark (ab). However it's simply a PHP 4 page that doing insert on MySQL directly using php_mysql module.

When there're huge HTTP request hit this PHP page, it's equivalent with huge insert on MySQL. Since PHP has no database pooling, it will make many connection on MySQL and each connection will run insert into bla bla on table.

Finally, my friend suggest me to use SQLRelay (sqlrelay.sourceforge.net) so database connection will be limited, and insert into bla bla on table will be limited too without sacrifice HTTP request.

Question:
Have you set thread stacks in some specific way?

Answer:
No. MySQL thread stack size should be 196kbyte as default right?

Maybe useful, below is kernel configuration about max threads
# cat /proc/sys/kernel/threads-max
14336

Here is my.cnf:

========== cut here =============
[client]
port            = 3306
socket          = /var/lib/mysql/mysql.sock

# The MySQL server
[mysqld]
port               = 3306
#bind-address      = 127.0.0.1
socket             = /var/lib/mysql/mysql.sock
max_connections    = 910
max_connect_errors = 9999999
skip-locking
key_buffer         = 384M
max_allowed_packet = 16M
table_cache        = 2049
sort_buffer_size   = 12M
read_buffer_size   = 12M
myisam_sort_buffer_size = 64M
thread_cache       = 48
query_cache_size   = 32M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8

innodb_data_home_dir =
innodb_data_file_path = /mysql1/data1:2000M;/mysql1/data2:2000M;/mysql1/data3:2000M;/mysql1/data4:2000M;/mysql1/data5:1000M;/m
ysql2/data6:2000M;/mysql2/data7:2000M;/mysql2/data8:2000M;/mysql2/data9:2000M;/mysql2/data10:1000M;/mysql0/data11:2000M;/mysql
0/data12:2000M;/mysql0/data13:2000M;/mysql0/data14:2000M;

# ------ OLD ---
innodb_log_group_home_dir = /mysql1
innodb_log_arch_dir = /mysql1
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 40M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50

[mysqldump]
quick
max_allowed_packet = 16M
add-drop-table=FALSE
========== cut here =============
[17 Feb 2006 22:39] Heikki Tuuri
Ady,

when you get more hangs, please post the complete unedited .err log here. They can be a valuable clue about what is wrong.

I will try to test this on a 4 x Opteron/Linux next week.

Regards,

Heikki
[20 Feb 2006 11:12] Ady Wicaksono
Heikki,

I'm preparing an SSH account for you in my Server to test if you interesting,
please contact me via email at adiwicaksono@yahoo.com.

What about your test at opteron linux?
[28 Feb 2006 17:17] Heikki Tuuri
Hi!

I have now tested this on a single-processor computer, with 240 threads. But no hang observed with the latest MySQL-5.0 tree with the compile-pentium-debug build. I ran both inside gdb and without it. I do get the warnings that Osku described.

My next step is to test on a 4 x AMD64 Linux.

Regards,

Heikki

heikki@127:~/sysbench-0.4.4-sp/sysbench$ ./sysbench --test=oltp --oltp-table-size=1000000 --oltp-table-name=sbtest --mysql-db=sbtest prepare
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Creating table 'sbtest'...
Creating 1000000 records in table 'sbtest'...
heikki@127:~/sysbench-0.4.4-sp/sysbench$ ./sysbench --test=oltp --oltp-table-size=1000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=240 --mysql-db=sbtest run
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 240

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

heikki@127:~/sysbench-0.4.4-sp/sysbench$ uname -a
Linux 127 2.6.8-2-686 #1 Tue Aug 16 13:22:48 UTC 2005 i686 GNU/Linux

heikki@127:~/mysql-5.0/sql$ ./mysqld
060228 18:57:39 [Warning] Changed limits: max_open_files: 1024  max_connections: 1014  table_cache: 64
060228 18:57:39  InnoDB: Started; log sequence number 6 1313589783
060228 18:57:39 [Warning] mysql.user table is not updated to new password format; Disabling new password usage until mysql_fix_privilege_tables is run
060228 18:57:39 [Note] ./mysqld: ready for connections.
Version: '5.0.19-debug'  socket: '/home/heikki/bugsocket'  port: 3307  Source distribution
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock
Warning: found too many locks at write: second write lock
Warning: found too many locks at write: enter write_lock

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 98328845; in additional pool allocated 3380992
Buffer pool size   4096
Free buffers       1482
Database pages     2613
Modified db pages  448
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 14708, created 29603, written 31373
0.00 reads/s, 0.57 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 3132, id 1275108272, state: making checkpoint
Number of rows inserted 2017486, updated 0, deleted 0, read 1000963
26.65 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
[28 Feb 2006 18:08] Heikki Tuuri
Hi!

I tested 5.0.18 built with -O3 -g, with 900 threads on a 4 X AMD64 computer. But I could not get any hang. I cannot test with more threads now because the open files limit of 1024 would be exceeded.

I am putting this bug report to status 'Can't repeat'. The problems with 2048 threads may have been due to mysqld exhausting the resources of the OS.

Regards,

Heikki

bash-3.00$ uname -a
Linux staso03 2.6.9-22.ELsmp #1 SMP Mon Sep 19 18:00:54 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux

bash-3.00$ ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=900 --mysql-db=sbtest run
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 900

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

bash-3.00$ ./mysqld
InnoDB: The first specified data file /tmp/ibdata did not exist:
InnoDB: a new database to be created!
060228  9:26:24  InnoDB: Setting file /tmp/ibdata size to 12 MB
InnoDB: Database physically writes the file full: wait...
060228  9:26:25  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 5 MB
InnoDB: Database physically writes the file full: wait...
060228  9:26:25  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to 5 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
060228  9:26:26  InnoDB: Started; log sequence number 0 0
060228  9:26:26 [Note] ./mysqld: ready for connections.
Version: '5.0.18'  socket: '/home/htuuri/mysocket'  port: 3307  Source distribution
060228  9:32:33 [Note] ./mysqld: Normal shutdown

060228  9:32:33  InnoDB: Starting shutdown...
060228  9:32:36  InnoDB: Shutdown completed; log sequence number 0 43655
060228  9:32:36 [Note] ./mysqld: Shutdown complete

bash-3.00$ ./mysqld
060228  9:33:26 [Warning] Changed limits: max_open_files: 1024  max_connections: 1014  table_cache: 64
060228  9:33:26  InnoDB: Started; log sequence number 0 43655
060228  9:33:26 [Note] ./mysqld: ready for connections.
Version: '5.0.18'  socket: '/home/htuuri/mysocket'  port: 3307  Source distribution
060228  9:36:01 [ERROR] Error in accept: Too many open files

| 2600 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2601 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2602 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2603 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2604 | sbtest | localhost | sbtest | Execute |    6 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2605 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2606 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2607 | sbtest | localhost | sbtest | Execute |    4 | update | INSERT INTO sbtest values(?,?,?,?) |
| 2608 | sbtest | localhost | sbtest | Execute |    2 | update | INSERT INTO sbtest values(?,?,?,?) |
+------+--------+-----------+--------+---------+------+--------+------------------------------------+
901 rows in set (0.01 sec)

mysql> show variables like 'innodb%';
+---------------------------------+----------------------------+
| Variable_name                   | Value                      |
+---------------------------------+----------------------------+
| innodb_additional_mem_pool_size | 10485760                   |
| innodb_autoextend_increment     | 8                          |
| innodb_buffer_pool_awe_mem_mb   | 0                          |
| innodb_buffer_pool_size         | 67108864                   |
| innodb_checksums                | ON                         |
| innodb_commit_concurrency       | 0                          |
| innodb_concurrency_tickets      | 500                        |
| innodb_data_file_path           | /tmp/ibdata:12M:autoextend |
| innodb_data_home_dir            |                            |
| innodb_doublewrite              | ON                         |
| innodb_fast_shutdown            | 1                          |
| innodb_file_io_threads          | 4                          |
| innodb_file_per_table           | OFF                        |
| innodb_flush_log_at_trx_commit  | 1                          |
| innodb_flush_method             |                            |
| innodb_force_recovery           | 0                          |
| innodb_lock_wait_timeout        | 50                         |
| innodb_locks_unsafe_for_binlog  | OFF                        |
| innodb_log_arch_dir             |                            |
| innodb_log_archive              | OFF                        |
| innodb_log_buffer_size          | 8388608                    |
| innodb_log_file_size            | 5242880                    |
| innodb_log_files_in_group       | 2                          |
| innodb_log_group_home_dir       | ./                         |
| innodb_max_dirty_pages_pct      | 90                         |
| innodb_max_purge_lag            | 0                          |
| innodb_mirrored_log_groups      | 1                          |
| innodb_open_files               | 300                        |
| innodb_support_xa               | ON                         |
| innodb_sync_spin_loops          | 20                         |
| innodb_table_locks              | ON                         |
| innodb_thread_concurrency       | 20                         |
| innodb_thread_sleep_delay       | 10000                      |
+---------------------------------+----------------------------+
33 rows in set (0.00 sec)

mysql>
[1 Mar 2006 7:22] Vadim Tkachenko
Heikki,

please note I reported about 2048 threads.
for example MyISAM can easy handle ~3000 threads.

to increase limit of open files you could use
system wide ulimit -n 10000
[1 Mar 2006 20:06] Heikki Tuuri
Vadim,

I am not the sysadmin of the 4 X AMD64 computer. Was the bug visible only with 2048 threads, and not, for example, 990 threads? I tested now 990 threads with no problems.

srv_max_n_threads is the size of the sync array, and it is set to 10 000 or bigger in srv0start.c. The value 2048 should not be any hard limit.

I will next work on the sync arr optimization bug #15815. If this bug was caused by the OS choking under too many thread switches, then fixing #15815 may solve also this bug.

Regards,

Heikki
[1 Mar 2006 20:17] Vadim Tkachenko
Heikki,

I tested with 1024 threads.
InnoDB works fine under such workload.
[2 Mar 2006 12:41] Ady Wicaksono
Vadim

Why i can't create more than 254 thread?

# ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 1024
pipe size          (512 bytes, -p) 8
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 7168
virtual memory        (kbytes, -v) unlimited
# ulimit -n 5012
# ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=400 --mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.1:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 400

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
FATAL: Thread #254 creation failed errno = 12 (Cannot allocate memory)
[2 Mar 2006 12:43] Ady Wicaksono
Heikki,

You use kernel 2.6, is it kernel issue or what? Maybe you have idea

I use kernel 2.4 SMP
[2 Mar 2006 12:56] Vadim Tkachenko
Ady,

You have too high stack size (8M for stack is giant)
Please change
ulimit -s 192
[2 Mar 2006 13:30] Ady Wicaksono
I try with 600 threads and MySQL got hang

# ulimit -s 192
# ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=600 --mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.1:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 600

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

-------------- mysql processlist full of these INSERT that wait for more than 624   seconds :( ---------- 
| 10722 | sbtest | localhost:60789 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10723 | sbtest | localhost:60790 | test | Execute |  628 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10724 | sbtest | localhost:60791 | test | Execute |  628 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10725 | sbtest | localhost:60792 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10726 | sbtest | localhost:60793 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10727 | sbtest | localhost:60794 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10728 | sbtest | localhost:60795 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10729 | sbtest | localhost:60796 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10730 | sbtest | localhost:60797 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10731 | sbtest | localhost:60798 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10732 | sbtest | localhost:60799 | test | Execute |  629 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10733 | sbtest | localhost:60800 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10734 | sbtest | localhost:60801 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |
| 10735 | sbtest | localhost:60802 | test | Execute |  624 | update | INSERT INTO sbtest (k,c,pad) values(?,?,?) |

-------------- mysql processlist ----------

-------------- Last lines of mysql.err ----------
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
060302 20:25:40InnoDB: Assertion failure in thread 1779657152 in file srv0srv.c line 2092
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=10485760
read_buffer_size=520192
max_used_connections=901
max_connections=910
threads_connected=602
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 938432 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x6a13626c, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81916c0
0x40048618
(nil)
0x420de407
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trac
e. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
060302 20:25:40  mysqld restarted
060302 20:25:40  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060302 20:25:41  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 6 389178624.
InnoDB: Doing recovery: scanned up to log sequence number 6 389238551
060302 20:25:41  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 9
2 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 0, file name
060302 20:25:41  InnoDB: Started; log sequence number 6 389238551
060302 20:25:41 [Note] /usr/sbin/mysqld-max: ready for connections.
Version: '5.0.18-max'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Edition - Experimental (GPL)
-------------- Last lines of mysql.err ----------

Heikki, i think this bug is not can't repeat, you should try with kernel 2.4 SMP

# uname -a
Linux db 2.4.20-8smp #1 SMP Thu Mar 13 17:45:54 EST 2003 i686 i686 i386 GNU/Linux

Thanks
[2 Mar 2006 14:20] Ady Wicaksono
Vadim,

May i know, u try this sysbench on what Linux kernel version?
[2 Mar 2006 14:53] Vadim Tkachenko
Ady,

RedHat ES 3,
2.4.21-15.ELsmp
quadxeon
[3 Mar 2006 3:55] Ady Wicaksono
Heikki,

I try to test this problem on the same machine, Dell PE 2850, Intel XEON 3.0Hgz HyperThread, 2 Gbytes Memory but using Kernel 2.6 

The problem is the same :(

# uname -a
Linux backup 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386 GNU/Linux
# MySQL version 5.0.15-max

# ulimit -s 192
# ulimit -n 4096
# ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=2048 --mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 2048

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

..........

On other side, these're MySQL processlist
------------------------
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6155 to server version: 5.0.15-max

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show processlist;
+------+------+-----------------+------+---------+------+--------------+------------------------------------+
| Id   | User | Host            | db   | Command | Time | State        | Info                               |
+------+------+-----------------+------+---------+------+--------------+------------------------------------+
| 4106 | root | localhost:46749 | test | Execute |  142 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4107 | root | localhost:46750 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4108 | root | localhost:46751 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4109 | root | localhost:46752 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4110 | root | localhost:46753 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4111 | root | localhost:46754 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4112 | root | localhost:46755 | test | Execute |    9 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4113 | root | localhost:46756 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4114 | root | localhost:46757 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4115 | root | localhost:46758 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4116 | root | localhost:46759 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4117 | root | localhost:46760 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4118 | root | localhost:46761 | test | Execute |   90 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4119 | root | localhost:46762 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4120 | root | localhost:46763 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4121 | root | localhost:46764 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4122 | root | localhost:46765 | test | Execute |  226 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4123 | root | localhost:46766 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4124 | root | localhost:46767 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4125 | root | localhost:46768 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4126 | root | localhost:46769 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4127 | root | localhost:46770 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4128 | root | localhost:46771 | test | Execute |  227 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4129 | root | localhost:46772 | test | Execute |  226 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4130 | root | localhost:46773 | test | Execute |  187 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4131 | root | localhost:46774 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4132 | root | localhost:46775 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4133 | root | localhost:46776 | test | Execute |  229 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4134 | root | localhost:46777 | test | Execute |  144 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4135 | root | localhost:46778 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4136 | root | localhost:46779 | test | Execute |  226 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4137 | root | localhost:46780 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
| 4138 | root | localhost:46781 | test | Execute |  228 | update       | INSERT INTO sbtest values(?,?,?,?) |
---------------------------------

Heiki, this bug drives me crazy :), I use it on application with huge insert/update, currently i use db pooling (SQLRelay) to decrease the impact of this bug.
[3 Mar 2006 4:08] Ady Wicaksono
For more information, last test on kernel 2.6 

1. mysql user could use up to 2.8 Gbytes Memory
2. Here is the mysql process
mysql    21360  130 25.8 2067072 536888 pts/0  Sl   17:36  37:52 /usr/sbin/mysqld-max --basedir=/ --datadir=/var/lib/mysql --
3. After a few minutes.... i decided to stop sysbench, but MySQL not crash, it simply hang, but very annoying
4. I run sysbench again but with 600 thread, and it's work fine...

 # ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=600 --mysql-db=test run --mysql-password=simbah65infokom --mysql-user=root --mysql-host=127.0.0.1
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 600

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

No INSERT process hang.
[3 Mar 2006 17:40] Heikki Tuuri
Ady,

does it hang with 990 threads? With 1500 threads?

Regards,

Heikki
[4 Mar 2006 4:34] Ady Wicaksono
I try with 990 threads, and below the result.

$ uname -a
Linux db 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 i686 i686 i386 GNU/Linux

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 1
cpu MHz         : 2993.085
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5931.00

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 1
cpu MHz         : 2993.085
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5980.16

$  gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,java,f95,ada --enable-java-awt=gtk --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --host=i386-redhat-linux
Thread model: posix
gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)

$ ps aux|grep mysqld-max
mysql    21360  5.0 23.4 1859492 485968 ?      Sl   Mar03  75:08 /usr/sbin/mysqld-max --basedir=/ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/lib/mysql/backuplistener.pid --skip-locking --port=3306 --socket=/var/lib/mysql/mysql.sock

$ free
             total       used       free     shared    buffers     cached
Mem:       2074700    1568536     506164          0     118524     806772
-/+ buffers/cache:     643240    1431460
Swap:      4192924         44    4192880

# ulimit -s 192
# ulimit -n 4096
# ./sysbench --test=oltp --oltp-table-size=100000000  --oltp-table-name=sbtest --oltp-test-mode=nontrx --oltp-nontrx-mode=insert --max-requests=0 --num-threads=990 --mysql-db=test run --mysql-password=sbtest --mysql-user=sbtest --mysql-host=127.0.0.1
sysbench v0.4.4-sp:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql

Running the test with following options:
Number of threads: 990

Doing OLTP test.
Running non-transactional test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Threads started!

No INSERT Hang, but the performance of insert is poor, look at the data below, as a end user, 990 threads should be equivalent to successfull 990 insert/seconds, but it's not happened here. I don't know the explanation

mysql> select NOW(); select count(*) from sbtest;
+---------------------+
| NOW()               |
+---------------------+
| 2006-03-04 18:27:15 |
+---------------------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
|    17038 |
+----------+
1 row in set (34.57 sec)

mysql> select NOW(); select count(*) from sbtest;
+---------------------+
| NOW()               |
+---------------------+
| 2006-03-04 18:27:52 |
+---------------------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
|    17488 |
+----------+
1 row in set (1 min 37.25 sec)

I'll try later with 1500 thread
[7 Mar 2006 15:44] Ady Wicaksono
Heikki, 

I cancel to test with 1500 thread, 

This bug maybe relatted to http://bugs.mysql.com/bug.php?id=15815,

We decide to downgrade to MySQL 4.1.x, which more stable on concurrent insert

Thank you.
[10 Mar 2006 8:20] Jeff Freund
From the descriptions here, I believe that this issue just happened for us.  Machine has been 100% stable in production for ~2 months before this incident.

We are running 5.0.18-standard-log on 2xDual Core Opteron, with 
2.6.9-22.0.1.106.unsupportedsmp #1 SMP x86_64 x86_64 x86_64 GNU/Linux.  Max allowed connections=400.

InnoDB appeared to just freeze up.   Alive check for database (select 1;) ran as usual - anything accessing innodb tables was stuck. Typical traffic on the db averages a few hundred up to pecks of about 1500 queries per second, split between selects/inserts/updates.
[10 Mar 2006 8:54] Ady Wicaksono
Hi Jeff,

For a temporary solutions while you want using MySQL 5.0.x with InnoDB
you may use SQLRelay (sqlrelay.sourceforge.net) to limit connection ~ limit threads of MySQL. It quite usefull.

However, i prefer to downgrade my MySQL from 5.0.x to MySQL 4.
[10 Mar 2006 15:23] Heikki Tuuri
Jeff,

can you please post the complete, unedited .err log from the hung server?

Did you run prepared statements?

It could be something different from the hang problem in this bug report. This bug report only contains hangs that are triggered by an artificial workload of several hundred threads in sysbench. I have not seen any report from a real-world application.

Regards,

Heikki
[10 Mar 2006 16:53] Jeff Freund
Thanks Heikki -- We do use Prepared Statements.

There actually were no entries in the .err log before/while it was frozen up.  Below I have included the contents of the .err log upon restart.

Also, I have the results of SHOW INNODB STATUS while it was "frozen" - however it is too big to post in the comments here, and I can't attach a file as I am not the original submitter.  Please advise on how to get you that information.

Thanks.
Jeff

.err
---------------------------------------------
060309 23:31:25 [Note] /usr/sbin/mysqld: Normal shutdown

060309 23:31:25 [Note] Slave I/O thread killed while reading event
060309 23:31:25 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000011', position 33467209
060309 23:38:06  mysqld started
060309 23:38:06  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
060309 23:38:07  InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 39 3064686203.
InnoDB: Doing recovery: scanned up to log sequence number 39 3064686203
InnoDB: Transaction 0 341644458 was in the XA prepared state.
InnoDB: 15 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 14 row operations to undo
InnoDB: Trx id counter is 0 341645824
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 1464524, file name mysql-bin.000011
InnoDB: Last MySQL binlog file position 0 1070123614, file name /db_logs/mysql/repl/mysql-bin.000086
InnoDB: Starting in background the rollback of uncommitted transactions
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645461, 1 rows to undo
060309 23:38:08  InnoDB: Started; log sequence number 39 3064686203
060309 23:38:08 [Note] Recovering after a crash using /db_logs/mysql/repl/mysql-bin

InnoDB: Rolling back of trx id 0 341645461 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645428, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645428 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645339, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645339 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645162, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645162 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645122, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645122 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645120, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645120 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645088, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645088 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645056, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645056 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645049, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645049 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341645030, 1 rows to undo

InnoDB: Rolling back of trx id 0 341645030 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341644587, 1 rows to undo

InnoDB: Rolling back of trx id 0 341644587 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341644570, 1 rows to undo

InnoDB: Rolling back of trx id 0 341644570 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341644522, 1 rows to undo

InnoDB: Rolling back of trx id 0 341644522 completed
060309 23:38:08  InnoDB: Rolling back trx with id 0 341644491, 1 rows to undo

InnoDB: Rolling back of trx id 0 341644491 completed
060309 23:38:08  InnoDB: Rollback of non-prepared transactions completed
060309 23:38:18 [Note] Starting crash recovery...
060309 23:38:18  InnoDB: Starting recovery for XA transactions...
060309 23:38:18  InnoDB: Transaction 0 341644458 in prepared state after recovery
060309 23:38:18  InnoDB: Transaction contains changes to 22 rows
060309 23:38:18  InnoDB: 1 transactions in prepared state after recovery
060309 23:38:18 [Note] Found 1 prepared transaction(s) in InnoDB
060309 23:38:18 [Note] Crash recovery finished.
060309 23:38:18 [ERROR] /usr/sbin/mysqld: Table './mysql/user' is marked as crashed and should be repaired
060309 23:38:18 [Warning] Checking table:   './mysql/user'
060309 23:38:18 [ERROR] 1 client is using or hasn't closed the table properly
060309 23:38:18 [ERROR] /usr/sbin/mysqld: Table './mysql/db' is marked as crashed and should be repaired
060309 23:38:18 [Warning] Checking table:   './mysql/db'
060309 23:38:18 [ERROR] 1 client is using or hasn't closed the table properly
060309 23:38:18 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.18-standard-log'  socket: '/db_data/mysql/mysql.sock'  port: 3306  MySQL Community Edition - Standard (GPL)
060309 23:38:18 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000011' at position 1464524, relay log './cms-mysql01-relay-bin.001929' position: 235
060309 23:38:18 [Note] Slave I/O thread: connected to master 'repl@cms-mysql02:3306',  replication started in log 'mysql-bin.000011' at position 33467209
[10 Mar 2006 17:32] Heikki Tuuri
Jeff,

please send the output to heikki.tuuri@oracle.com

If the output is > 1 MB, then gzip it.

Regards,

Heikki
[10 Mar 2006 22:26] Heikki Tuuri
Jeff,

your hang report looks different from the ones by Vadim and Ady.

---TRANSACTION 0 341644491, ACTIVE 2908 sec, process no 15722, OS thread id 1179335008, thread declared inside InnoDB 498
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1216, undo log entries 1
MySQL thread id 334757, query id 395178082
<deleted>
---TRANSACTION 0 341644458, ACTIVE (PREPARED) 2908 sec, process no 15722, OS thread id 1177528672
6 lock struct(s), heap size 1216, undo log entries 22
MySQL thread id 334694, query id 395178059 <deleted>
commit
Trx read view will not see trx with id >= 0 341644469, sees < 0 341644381

The threads are not waiting for an InnoDB semaphore, but are probably waiting for a MySQL mutex. There was a similar bug report in the Forums a few weeks ago. I think also that was associated with prepared statements.

You may be hitting this bug:

http://bugs.mysql.com/bug.php?id=18116

Or you may want to open a new bug report.

Regards,

Heikki
[10 Mar 2006 22:42] Heikki Tuuri
It was Anda Zhao who reported a hang that looks similar to Jeff's:

http://forums.mysql.com/read.php?22,72478,75227#msg-75227
[29 Mar 2006 20:34] Heikki Tuuri
Hmm... Vadim's original hang report looks similar to Ady's.

----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 107048906; in additional pool allocated 17150464
Buffer pool size   4096
Free buffers       0
Database pages     4095
Modified db pages  56
Pending reads 0
Pending writes: LRU 0, flush list 2, single page 0
Pages read 14376, created 29335, written 29457
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 13784, id 2909039536, state: flushing buffer pool pages

It is the main thread that is stuck flushing buffer pool pages.

The main thread:

--Thread 2909039536 has waited at buf0flu.c line 586 for 11.00 seconds the sema\
phore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2661153712) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

In Vadim's output we see the main thread waiting for a semaphore for a few seconds, then waking up, and then ending up waiting again! This looks like a livelock of the main thread. The question is why the write I/O thread does not handle the pending writes.

In buf0flu.c, the main thread does call buf_flush_buffered_writes() before waiting for the S-latch:

                if (!locked) {
                        buf_flush_buffered_writes();

                        rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE);
                }
[29 Mar 2006 20:40] Heikki Tuuri
The bug may be an InnoDB race condition in fil0fil.c or os0file.c. For some reason a few pending write requests are never handled but are lost.
[7 Apr 2006 22:19] Heikki Tuuri
Hi!

I analyzed Vadim's .err log in more detail. We see that the master thread, which is doing a buffer pool flush, becomes starved when it tries to get an S-latch on a certain page. Many other threads acquire an X-latch to the same page meanwhile.

buf0flu.c:
                mutex_exit(&(buf_pool->mutex));

                if (!locked) {
                        buf_flush_buffered_writes();

                        rw_lock_s_lock_gen(&(block->lock), BUF_IO_WRITE); ### waits here!
                }

Looks like this is not a genuine hang. All the way until the assertion, there are threads that have just waited for < 10 seconds for a semaphore. But some threads have waited for > 600 seconds, that is why InnoDB gets an assertion failure and crashes mysqld.

Thus, this problem probably will be fixed when #15815 is fixed.

Regards,

Heikki

--Thread 2909039536 has waited at buf0flu.c line 586 for 11.00 seconds the sema\
phore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2661153712) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

--Thread 2909039536 has waited at buf0flu.c line 586 for 4.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2877311920) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

--Thread 2909039536 has waited at buf0flu.c line 586 for 1.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2810477488) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

--Thread 2909039536 has waited at buf0flu.c line 586 for 0.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2654129072) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

--Thread 2909039536 has waited at buf0flu.c line 586 for 5.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2659748784) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28

--Thread 2909039536 has waited at buf0flu.c line 586 for 5.00 seconds the semap\
hore:
S-lock on RW-latch at 0xb1f13c58 created in file buf0buf.c line 485
a writer (thread id 2505006000) has reserved it in mode  exclusive
number of readers 0, waiters flag 1
Last time read locked in file ../include/btr0btr.ic line 28
Last time write locked in file ../include/btr0btr.ic line 28
[7 Apr 2006 23:06] Heikki Tuuri
A workaround for this bug may be setting in my.cnf innodb_thread_concurrency=8 or even =1.

The default is 8 in 4.1.x. In 5.0, the default is not to restrict it at all. That may explain why these hang bugs started to flow in with 5.0. It is our old friend, 'thread thrashing' that in an extreme form may look like a hang.
[8 Apr 2006 5:50] Heikki Tuuri
Assigning this to Marko.

Patch for 5.0: in mysqld.cc:

  {"innodb_thread_concurrency", OPT_INNODB_THREAD_CONCURRENCY,
   "Helps in performance tuning in heavily concurrent environments. "
   "Sets the maximum number of threads allowed inside InnoDB. Value 0"
   " will disable the thread throttling.",
   (gptr*) &srv_thread_concurrency, (gptr*) &srv_thread_concurrency,
   0, GET_LONG, REQUIRED_ARG, 0, 0, 1000, 0, 1, 0},

set the default back to 8:

  {"innodb_thread_concurrency", OPT_INNODB_THREAD_CONCURRENCY,
   "Helps in performance tuning in heavily concurrent environments. "
   "Sets the maximum number of threads allowed inside InnoDB. Value 0"
   " will disable the thread throttling.",
   (gptr*) &srv_thread_concurrency, (gptr*) &srv_thread_concurrency,
   8, GET_LONG, REQUIRED_ARG, 0, 0, 1000, 0, 1, 0},

In MySQL-5.1 we choose to keep the default unrestricted because it contains Osku's improvements to the sync array. It may be more resilient to thread thrashing.

Regards,

Heikki
[8 Apr 2006 10:51] Heikki Tuuri
Hi!

Since this is a change to MySQL code, I assign this to Alexander Ivanov. The default should be changed to 8 in 5.0, but NOT in 5.1.

Best regards,

Heikki
[10 Apr 2006 12:45] Heikki Tuuri
Sorry, the new version should be:

{"innodb_thread_concurrency", OPT_INNODB_THREAD_CONCURRENCY,
   "Helps in performance tuning in heavily concurrent environments. "
   "Sets the maximum number of threads allowed inside InnoDB. Value 0"
   " will disable the thread throttling.",
   (gptr*) &srv_thread_concurrency, (gptr*) &srv_thread_concurrency,
   0, GET_LONG, REQUIRED_ARG, 8, 0, 1000, 0, 1, 0},

I had put the '8' in the wrong place!
[11 Apr 2006 8:04] Alexander Ivanov
Fixed in 5.0.21.
[17 Apr 2006 17:08] Mike Hillyer
Documented in 5.0.21 functionality change section of changelog:

   <listitem>
        <para>
          Default for <literal>INNODB_THREAD_CONCURRENCY</literal>
          changed to <literal>8</literal>. (Bug #15868)
        </para>
      </listitem>
[10 May 2006 10:01] Marko Mäkelä
Please document the change also at http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html. Please also document this earlier change:

2006/01/18 15:20:56+03:00 aivanov (at) mysql.com 
  Applied patch recieved from Marko:
   Make innodb_thread_concurrency 0 by default, and extend the usable
   range from 0 to 1000 (0 disables the thread throttling).
[17 May 2006 20:04] Heikki Tuuri
Paul,

value 0 disables thread throttling.

The default was no throttling in 5.0.x - 5.0.20.

In 5.0.21 the default is 8 again.

Regards,

Heikki
[22 May 2006 18:19] Paul DuBois
Default in 5.0: 0 through 5.0.20, 8 thereafter.
Default in 5.1: 20.
[25 May 2006 8:28] Saran Chari
Jeff,

We're hitting what seems to be the same problem - your error log output looks exactly like ours when we try and restart the server.  I'd be very interested to know if either the workaround of setting innodb_thread_concurrency=8 OR the bug fix for bug #18116 worked for you.  If you read this, could you please contact me offline at saran@flixster.com?

Thanks,
Saran
[25 May 2006 17:18] Paul DuBois
Setting back to In Progress and assigning to Elliot pending
incorporation of changes to 5.1 tree.
[26 May 2006 14:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6906
[26 May 2006 21:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6921
[9 Jun 2006 5:24] Heikki Tuuri
Paul,

Elliot's patch is now in 5.1.11. The default is 8 there. Please document.

Regards,

Heikki
[9 Jun 2006 15:48] Paul DuBois
Noted in 5.1.11 changelog and in 5.1 InnoDB chapter.
[8 Nov 2006 14:47] Heikki Tuuri
The fix to http://bugs.mysql.com/bug.php?id=22868 probably fixes this.
[24 Aug 2007 9:11] Jon Stephens
See also Bug #30505.
[28 Aug 2007 21:34] James Day
Jon, Timothy,

Earlier Marko wrote "In 5.1, the default seems to be innodb_thread_concurrency=20, as you can see in sql/mysqld.cc. I think it should be 8, just like in the most recent 5.0 version. Also, the minimum there is 1, which is wrong, because the value 0 is interpreted as "infinite".
Furthermore, the error (duplicate label OPT_INNODB_THREAD_CONCURRENCY) I reported a while ago still has not been fixed."

Now, the default was changed to 8 but it looks as though if you try to specify 0/unlimited it's changed to 1 so you can't actually set unlimited at the moment.

Looks as though there might be a bug for MySQL to fix?

For those following along: setting high values of innodb_thread_concurrency - hundreds to 1000 - can produce best performance for some workloads and systems. For others, low values in the 4-8 range are best. You need to experiment to see which is best for you particular load. Don't stick to just low values. If high values are wrong you may see a semaphore wait timeout error from InnoDB. That's your sigbal that the setting is too high and you should go lower.
[29 Aug 2007 12:36] Heikki Tuuri
This is the definition in the current 5.1 ha_inno0db.cc:

static MYSQL_SYSVAR_ULONG(thread_concurrency, srv_thread_concurrency,
  PLUGIN_VAR_RQCMDARG,
  "Helps in performance tuning in heavily concurrent environments. Sets the max\
imum number of threads allowed inside InnoDB. Value 0 will disable the thread t\
hrottling.",
  NULL, NULL, 8, 0, 1000, 0);
[30 Aug 2007 18:26] Paul DuBois
The 5.1 manual now specifies that 0 disables concurrency (no concurrency checking), and that the default value is 20 before 5.1.11 and 8 from 5.1.11 on.
[30 Aug 2007 20:12] James Day
Two bits of trivia in the docs change: "differnt" and "1000. you can".

Sorry, I was wrong with my earlier comment - I was looking at an old source versions while checking the history of this and must have missed switching back. Thanks for the correction Heikki, Paul.
[31 Aug 2007 1:23] Paul DuBois
Trivia corrected. :-)