Bug #44100 Server crashes if max_connections decreased below certain level
Submitted: 6 Apr 2009 0:06 Modified: 24 Jan 2013 15:01
Reporter: Elena Stepanova
Status: Closed
Category: MySQL Server: General Severity: S2 (Serious)
Version: 5.0.72, 5.0.79, 5.1.30, 5.1.34, 6.0.10 OS: Linux
Assigned to: CPU Architecture: Any

[6 Apr 2009 0:06] Elena Stepanova
Description:
If the max_connections variable is set to a value lower than the number of currently open connections, the server crashes soon afterwards.

The following conditions seem to be required to crash the server:

- if the current number of open connections is Copn and the max_delayed_threads value is Dmax, the new max_connections value Cmaxnew must be less than approximately (Copn - Dmax - 11);
- some of the open connections must be busy, for example executing a query. Apparently the number of idle connections must be less than the new max_connections value (I am not sure this is strictly necessary).
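
The first condition is plain arithmetic, so here is a minimal sketch of it. The function name and the `slack` parameter are made up for illustration, and the constant 11 is only the approximate figure observed above, not a value taken from the server source.

```python
def approx_crash_threshold(open_connections, max_delayed_threads, slack=11):
    """Return the approximate value below which a new max_connections
    setting was observed to crash the server: Copn - Dmax - 11.

    Hypothetical helper; `slack` is the empirical constant from the report.
    """
    return open_connections - max_delayed_threads - slack
```

With the values used by the attached test (14 open connections, max_delayed_threads set to 1) this gives 2, which is the value the test sets max_connections to. That is right at the boundary rather than strictly below it, consistent with the condition being approximate.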

The problem affects both debug and release (non-debug) builds.

The debug version crashes quite reliably with the following stack:
(5.1.34-enterprise-commercial-advanced-debug-log)  

Program terminated with signal 6, Aborted.
#0  0x00002b42a7b7dea3 in pthread_kill () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002b42a7b7dea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000008aaf48 in my_write_core (sig=6) at stacktrace.c:311
#2  0x00000000005caf97 in handle_segfault (sig=6) at mysqld.cc:2537
#3  <signal handler called>
#4  0x00002b42a8162b95 in raise () from /lib64/libc.so.6
#5  0x00002b42a8163f90 in abort () from /lib64/libc.so.6
#6  0x00002b42a815c256 in __assert_fail () from /lib64/libc.so.6
#7  0x000000000089d864 in queue_insert (queue=0xd32420,
    element=0x43a4eff0 "zґыI") at queues.c:212
#8  0x00000000008abe2a in thr_alarm (alrm=0x43a4f030,
    sec=<value optimized out>, alarm_data=0x43a4eff0) at thr_alarm.c:218
#9  0x00000000005bf4e3 in my_real_read (net=0x15734d8, complen=0x43a4f090)
    at net_serv.cc:830
#10 0x00000000005bf9cd in my_net_read (net=0x3da1) at net_serv.cc:997
#11 0x00000000005e1993 in do_command (thd=0x15733f0) at sql_parse.cc:800
#12 0x00000000005d2a63 in handle_one_connection (arg=<value optimized out>)
    at sql_connect.cc:1116
#13 0x00002b42a7b79143 in start_thread () from /lib64/libpthread.so.0
#14 0x00002b42a81f274d in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

The release version crashes less reliably, although still often, and the stack may differ. Two examples follow:
(5.1.34-enterprise-commercial-advanced-log)

Program terminated with signal 11, Segmentation fault.
#0  0x00002b92cca4aea3 in pthread_kill () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002b92cca4aea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005dbf55 in handle_segfault (sig=11) at mysqld.cc:2537
#2  <signal handler called>
#3  0x00002b92cd06b435 in malloc_consolidate () from /lib64/libc.so.6
#4  0x00002b92cd06c813 in _int_free () from /lib64/libc.so.6
#5  0x00002b92cd06c95c in free () from /lib64/libc.so.6
#6  0x00000000005de782 in one_thread_per_connection_end (thd=0x1386870,
    put_in_cache=true) at mysqld.cc:1832
#7  0x00000000005e5b19 in handle_one_connection (arg=<value optimized out>)
    at sql_connect.cc:1123
#8  0x00002b92cca46143 in start_thread () from /lib64/libpthread.so.0
#9  0x00002b92cd0bf74d in clone () from /lib64/libc.so.6
#10 0x0000000000000000 in ?? ()

Program terminated with signal 6, Aborted.
#0  0x00002af91466eea3 in pthread_kill () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00002af91466eea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x00000000005dbf55 in handle_segfault (sig=6) at mysqld.cc:2537
#2  <signal handler called>
#3  0x00002af914c53b95 in raise () from /lib64/libc.so.6
#4  0x00002af914c54f90 in abort () from /lib64/libc.so.6
#5  0x00002af914c8a35b in __libc_message () from /lib64/libc.so.6
#6  0x00002af914c8f34e in malloc_printerr () from /lib64/libc.so.6
#7  0x00002af914c912d4 in _int_malloc () from /lib64/libc.so.6
#8  0x00002af914c92d36 in malloc () from /lib64/libc.so.6
#9  0x00000000008915f2 in my_malloc (size=8040, my_flags=16115)
    at my_malloc.c:35
#10 0x0000000000891ff3 in init_alloc_root (mem_root=0x438c6f40,
    block_size=<value optimized out>, pre_alloc_size=8024) at my_alloc.c:63
#11 0x0000000000585ced in init_sql_alloc (mem_root=0x3edf, block_size=16115,
    pre_alloc=6) at thr_malloc.cc:58
#12 0x00000000006272b6 in open_tables (thd=0x1326ed0, start=0x438c6fe8,
    counter=0x438c6ff8, flags=4294967295) at sql_base.cc:4478
#13 0x0000000000627b20 in open_and_lock_tables_derived (thd=0x1326ed0,
    tables=0x133c040, derived=true) at sql_base.cc:4990
#14 0x00000000005e940a in execute_sqlcom_select (thd=0x1326ed0,
    all_tables=0x133c040) at mysql_priv.h:1551
#15 0x00000000005efad8 in mysql_execute_command (thd=0x1326ed0)
    at sql_parse.cc:2205
#16 0x00000000005f1a77 in mysql_parse (thd=0x1326ed0,
    inBuf=0x133be00 "select \"SERVER STILL ALIVE, no crash\" from connection_test.t", length=60, found_semicolon=0x438c90f0) at sql_parse.cc:5903
#17 0x00000000005f28e3 in dispatch_command (command=COM_QUERY, thd=0x1326ed0,
    packet=0x1333dd1 "select \"SERVER STILL ALIVE, no crash\" from connection_test.t", packet_length=<value optimized out>) at sql_parse.cc:1217
#18 0x00000000005f31a6 in do_command (thd=0x1326ed0) at sql_parse.cc:858
#19 0x00000000005e5c66 in handle_one_connection (arg=<value optimized out>)
    at sql_connect.cc:1116
#20 0x00002af91466a143 in start_thread () from /lib64/libpthread.so.0
#21 0x00002af914ce374d in clone () from /lib64/libc.so.6
#22 0x0000000000000000 in ?? ()

A workaround could be to check the process list before reducing the value or, if the value has to be changed anyway, to reduce it gradually, in steps smaller than the threshold described above.

How to repeat:
perl ./mysql-test-run.pl --mysqld=--innodb --do-test=connections_crash

The attached test uses the smallest values I could find. The absolute values are not that important.
The test does the following.
It sets max_connections to 20 and max_delayed_threads to 1, and opens 14 connections.
Then it starts a transaction with a query on an InnoDB table in one connection, and makes 12 connections wait for a lock with an update query on the same table. While those connections are waiting, the 14th connection sets max_connections to 2, and then the first connection commits the insert transaction. The server crashes either when the other connections finish their queries or when they disconnect.

An InnoDB table is not needed to crash the server; the test uses it only as an easy way to get a long transaction.

Suggested fix:
There is logic that produces the warning
Warning: thr_alarm queue is full
It appears in similar situations, but under slightly different scenarios. Possibly the conditions that trigger it could be extended to cover the case that causes the crash.
[6 Apr 2009 0:09] Elena Stepanova
test to reproduce bug 44100

Attachment: connections_crash.test (application/octet-stream, text), 1.91 KiB.

[7 Dec 2012 13:38] Sveta Smirnova
There is a typo in the test case. The corrected test case will be attached in a few seconds.
[7 Dec 2012 13:38] Sveta Smirnova
correct test case

Attachment: connections_crash.test (application/octet-stream, text), 1.94 KiB.

[7 Dec 2012 13:39] Sveta Smirnova
Still reproducible in versions 5.0.97 and 5.1.67; not reproducible in versions 5.5.30 and 5.7.1.
[24 Jan 2013 15:01] Paul DuBois
Noted in 5.1.69, 5.5.31, 5.6.11, 5.7.1 changelogs.

Setting max_connections to a value less than the current number of
open connections caused the server to exit.
[20 Apr 2013 13:46] Laurynas Biveinis
5.1 is not pushed ATM.

5.5$ bzr log -r 4173 -n0
------------------------------------------------------------
revno: 4173 [merge]
committer: Venkata Sidagam <venkata.sidagam@oracle.com>
branch nick: 5.5
timestamp: Thu 2013-01-24 14:13:42 +0530
message:
  Bug #11752803  SERVER CRASHES IF MAX_CONNECTIONS DECREASED BELOW 
                 CERTAIN LEVEL
  
  Merging from 5.1 to 5.5
    ------------------------------------------------------------
    revno: 2661.830.84
    committer: Venkata Sidagam <venkata.sidagam@oracle.com>
    branch nick: 5.1
    timestamp: Thu 2013-01-24 14:02:54 +0530
    message:
      Bug #11752803  SERVER CRASHES IF MAX_CONNECTIONS DECREASED BELOW 
                     CERTAIN LEVEL
            
      Problem description: mysqld crashes when we update the max_connections 
      variable to lesser value than the number of currently open connections.
            
      Analysis: The "alarm_queue.max_elements" size will be decided at the 
      server start time and it will get modified if we change max_connections 
      value. In the current scenario the value of "alarm_queue.max_elements" 
      is decremented when the max_connections is set to 2. When updating the  
      "alarm_queue.max_elements" value we are not updating "max_used_alarms" 
      value. Hence, instead of getting the warning "thr_alarm queue is full" 
      it is ending up in asserting the server at the time of inserting new 
      elements in the queue.
            
      Fix: the fix is to dynamically increase the size of the alarm_queue.
      In order to do that, queue_insert_safe() should be used instead of
      queue_insert().

Revno 4668 in 5.6.
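
To illustrate the difference the commit message above describes between an unchecked insert and a "safe" insert that grows the queue, here is a toy model in Python. It is only a sketch: the class, its method names and the `auto_extend` parameter are invented for the example and do not reproduce the actual C signatures or return codes of queue_insert() and queue_insert_safe() in the server source.

```python
import heapq


class ToyAlarmQueue:
    """Toy bounded priority queue; a stand-in, not MySQL's QUEUE structure."""

    def __init__(self, max_elements, auto_extend=0):
        self.max_elements = max_elements  # current capacity
        self.auto_extend = auto_extend    # growth step; 0 disables growth
        self._heap = []                   # pending expiry times, smallest first

    def __len__(self):
        return len(self._heap)

    def shrink_capacity(self, max_elements):
        # Models the capacity being lowered while more entries are still
        # pending than the new capacity allows.
        self.max_elements = max_elements

    def insert(self, expire_time):
        # Unchecked insert: the caller is trusted to have left room.
        # The exception stands in for the debug-build assertion.
        if len(self._heap) >= self.max_elements:
            raise OverflowError("insert into a full queue")
        heapq.heappush(self._heap, expire_time)

    def insert_safe(self, expire_time):
        # Checked insert: grow the capacity when full, if growth is enabled.
        if len(self._heap) >= self.max_elements:
            if self.auto_extend == 0:
                return False  # full and not allowed to grow
            self.max_elements = len(self._heap) + self.auto_extend
        heapq.heappush(self._heap, expire_time)
        return True
```

In this model, filling a queue of capacity 4, calling shrink_capacity(2) and then calling insert() raises, which corresponds to the assertion in the debug stack, while insert_safe() with a non-zero auto_extend enlarges the capacity and accepts the new element.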