MySQL Bugs: #34399: MySQL hangs with connection pooling option enabled

Bug #34399	MySQL hangs with connection pooling option enabled
Submitted:	7 Feb 2008 22:56	Modified:	24 Apr 2009 15:31
Reporter:	luojia chen	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Documentation	Severity:	S1 (Critical)
Version:	6.0.5	OS:	Any (Solaris Express Community Edition snv_81 SPARC, Linux)
Assigned to:	Paul DuBois	CPU Architecture:	Any

Description:
System:
HW: SunFire v880  
    8x750MHZ
    32GB RAM

Test workload: sysbench
MySQL is configured as bellow:
PARAM="--port=3306 \
--socket=/tmp/mysql.sock \
--user=root \
--datadir=$FSPATH \
--basedir=$BASEPATH \
--log-error=/usr/local/mysql/data/global_error.txt \
--max_connections=12000 \
--max_connect_errors=10 \
--table_cache=2048 \
--max_prepared_stmt_count=32764 \
--max_allowed_packet=1048576 \
--binlog_cache_size=1048576 \
--max_heap_table_size=67108864 \
--sort_buffer_size=65536 \
--join_buffer_size=1048576 \
--thread_cache=8 \
--thread_concurrency=16 \
--thread_stack=64K \
--query_cache_size=0 \
--query_cache_limit=2M \
--ft_min_word_len=4 \
--default_table_type=Innodb \
--transaction_isolation=REPEATABLE-READ \
--tmp_table_size=64M \
--skip-locking \
--server-id=1 \
--thread_handling=pool-of-threads \
--thread_pool_size=10 \
--innodb_data_home_dir=$FSPATH \
--innodb_data_file_path=ibdata1:100M:autoextend \
--innodb_log_group_home_dir=$FSPATH \
--innodb_buffer_pool_size=1G\
--innodb_additional_mem_pool_size=20M \
--innodb_log_file_size=500M \
--innodb_log_buffer_size=8M \
--innodb_flush_log_at_trx_commit=1 \
--innodb_lock_wait_timeout=300 \
--innodb_locks_unsafe_for_binlog=1 \
--innodb_max_dirty_pages_pct=90 \
--innodb_thread_concurrency=0" 

When run the read-write test, the mysql server can't be reached, and error message in the "/usr/local/mysql/data/global_error.txt" shows "port-dissociate: No such file or directory"

How to repeat:
1. Generate the load
Load 1M row database using the following command:
sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=sbtest --max-requests=0 prepare

2. Run the Test
Running sysbench OLTP read-write test case with 1000 user connections using the following command:
sysbench --test=oltp --oltp-read-only=off --oltp-dist-type=special --num-threads=1000 --oltp-table-size=1000000 --mysql-db=sbtest --max-requests=0 --max-time=600 run

While running the sysbench with the above command, I couldn't reach the MySQL server from mysql client - while running #mysql, mysql> wasn't shown up)

Thank you for the report.

Please provide output of SHOW VARIABLES LIKE 'max_connections';

Please also provide output of `ulimit -a` and error log file.

#ulimit -a
fime(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
memory(kbytes) unlimited

mysql>show variables like 'max_connections'
max_connections      12000

uploaded the error log file: bug-data-34399.tar.gz

max_connections      12000
That's too much for Solaris.

http://dev.mysql.com/doc/refman/5.0/en/too-many-connections.html

Try to use not more then 1024.

It depends on the Solaris thread handling.

My system has 32G memory. While not setting thread_handling as "pool-of-threads", MySQL runs well without this problem (used the same settings(max-connections=12000)

If I reduced the "max-connections=1200", the same problem happened

The same problem was observed while reducing:
max_connections = 512

MySQL run well without such problem if not setting "thread_handling" as "pool-of-threads" in the same sysbench rw test:
sysbench --test=oltp --oltp-read-only=off --oltp-dist-type=special --num-thread=512 --oltp-table-size=1000000 --mysql-db=sbtest --max-requests=0 --max-time=600 run

Thank you for the feedback.

I can not repeat described behavior. Although nearest machine I found with SunOS sol10-sparc-c 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire-V245 Solaris

I can repeat the same problem on my 2-way Sun Blade 1000 server.

I have put my system(Sun Fire v880) with test app/scripts set up ready to access on Internet. If it is possible for you, would you check it my system. If so, I can send to you the login in seperate email

Jenny,

thank you for the feedback and offer.

If you could open connection to your server for time enough for me to verify the report (probably 1-3 working days, depend from other issues I am loaded with) and for developers to fix (unknown) it would be possible for us to use your server.

Hi, Sveta,

Please let me know your emails, so that I would send to you the login in seperate email.

Thanks

Jenny,

you can add comment to bug report privately. But I just sent email from my address, so you can use it if you wish.

Thank you for the provided information.

Verified as described on your machine.

any update on this bug ? whether this will be fixed or what is the problem ?

By the way I can see this behavior on RHEL too with simple mysqlslap

Bug #39822 reported on Linux was marked as duplicate of this one.

Bug #43753 reported on Linux was marked as duplicate of this one.

This is not a bug per se, but a case for which the thread pool implementation will not work.

If you have a load that consists of transactions with multiple DDL operations, _and_ these set an exclusive lock (i.e INSERT/UPDATE), _and_ you have less threads available in the pool than the number of sessions that try to execute these transactions, you may run into this problem.

A simple example, using two sessions (S1 and S2), and one thread in the pool (T1) :

S1 :  BEGIN;
T1 : executes and returns
S1 :  UPDATE t SET a=1 WHERE id = 7;
T1 : executes and returns
S2 :  BEGIN
T1 : executes and returns
S2 :  UPDATE t SET a=1 WHERE id = 7;
T1 : executes, but runs into a lock an holds
S1 : COMMIT
T1 : still executing S2's UPDATE, so the COMMIT is queued
... (hang for 50 seconds) ...
T1 : Lock timeout for the S2's UPDATE, return with an error
S2 : Gets an error
T1 : Execute S1 COMMIT and return
S1 : get an OK.

The default lock timeout value for innodb is 50 seconds. 

Increasing the number of worker threads will reduce the probability of the problem, but you will need at least the same number of worker threads as the number of running sessions to avoid it completely.

I'm setting this report to Documenting, so the doc crew can assess if we need to clarify the documentation on this point.

The operation of the thread pool is described here:
http://dev.mysql.com/doc/refman/6.0/en/connection-threads.html

There is some discussion about conditions under which pooling will not work well, but I agree that it doesn't cover the situation described by Magne. I'll look into updating the docs.

Reassigning to myself as docs bug.

Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

Addition to
http://dev.mysql.com/doc/refman/6.0/en/connection-threads.html:

Thread pooling works best if the normal workload consists
mostly of lightweight queries, with no more than a few
simultaneous transactions or long-running statements.

Thread pooling is less suitable when you have N threads in
the pool but it is possible that all or most of the threads
will be tied up servicing transactions or long-running
statements simultaneously. If this happens, the workload from
other clients will be stalled. Configuring the server to use
a pool larger than the maximum number of simultaneous
transactions or long-running statements will help in this
case.

The following examples illustrate when a too-small thread
pool might cause problems. Assume a thread pool size of 1 for
simplicity, but the same issues can occur for a larger pool
and larger number of clients.

* One client issues a statement that takes a long time to
process. This ties up the service thread. If other
clients issue statements, they are queued and must wait
for the statement currently executing to finish.

* One client acquires an exclusive lock on a table. If
another client attempts to use the table, its statement
blocks and the service thread remains occupied waiting
for the block to end. Any further statements from the
original client cannot execute, either, because there is
no free thread available.

In the following scenario, conflicting lock requests
cause deadlock.

Session 1:

LOCK TABLES t1 WRITE;

The service thread executes the statement and becomes
free.

Session 2:

LOCK TABLES t1 WRITE;

The service thread starts to execute the statement but
blocks waiting for the exiting lock on t1 to be released.
The thread does not become free.

Session 1:

INSERT INTO t1 VALUES(1,2);

The statement cannot execute and is queued because the
service thread remains allocated to session 2. At this
point, the clients are deadlocked and neither cannot
continue.

A similar situation can occur in the absence of explicit
locks if simultaneous transactions attempt to access
tables such that one transaction blocks another. Assume
that t1 is an InnoDB table.

Session 1:

START TRANSACTION;
SELECT * FROM t1 WHERE id = 7 FOR UPDATE;

The service thread executes each of the statements in
turn and becomes free. The SELECT
statement acquires an exclusive lock for the selected row
or rows.

Session 2:

START TRANSACTION;
UPDATE t1 SET a = a+1 WHERE id = 7;

The service thread executes the first statement and
attempts to execute the second. However, the UPDATE
blocks because it must lock the same rows already locked
by session 1, so the service thread waits and does not
become free.

Session 1:

COMMIT;

The statement cannot execute and is queued because the
service thread remains allocated to the session 2
statement. At this point, the transactions are
deadlocked. Eventually, InnoDB times out waiting for the
lock and the session 2 UPDATE fails with an error. The
service thread becomes free and executes the COMMIT for
session 1.