Bug #75960 Crash on tmp tables (mysqld got signal 11 ; Using temporary; Using filesort)
Submitted: 18 Feb 2015 21:09 Modified: 29 Mar 2015 4:37
Reporter: Alexander Kralin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:5.6.14 OS:Linux (CentOS 5.9 x86_64)
Assigned to: CPU Architecture:Any

[18 Feb 2015 21:09] Alexander Kralin
Description:
Hello!

A couple pretty similar crashes has been observed recently on one of our production servers. Please see details below. Looks like in this scenario, server tried to initialize temporary table and if a temporary was not initialized, the whole server crashed. It may be better to return error to one particular query instead of crashing the whole mysqld process.

Crash example on a server running CentOS 5.9, ext4 filesystem, /tmp is a dedicated partition on FusionIO ioScale2 SSD.

Error log:

14:00:02 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=125
max_threads=2500
thread_count=37
connection_count=35
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5162313 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x2b28c4cf1650
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2b28dc4910a0 thread_stack 0x40000
/usr/sbin/mysqld-3307(my_print_stacktrace+0x35)[0x90f5e5]
/usr/sbin/mysqld-3307(handle_fatal_signal+0x3e8)[0x671e78]
/lib64/libpthread.so.0[0x2b2772224ca0]
/usr/sbin/mysqld-3307(_ZNK7handler22ha_statistic_incrementEM17system_status_vary+0x8)[0x589708]
/usr/sbin/mysqld-3307(_ZN9ha_myisam9write_rowEPh+0x20)[0xaee660]
/usr/sbin/mysqld-3307(_ZN7handler12ha_write_rowEPh+0x106)[0x591aa6]
/usr/sbin/mysqld-3307[0x6cd831]
/usr/sbin/mysqld-3307[0x6d1b72]
/usr/sbin/mysqld-3307(_Z10sub_selectP4JOINP13st_join_tableb+0x162)[0x6d2202]
/usr/sbin/mysqld-3307[0x6d1b72]
/usr/sbin/mysqld-3307(_Z10sub_selectP4JOINP13st_join_tableb+0x162)[0x6d2202]
/usr/sbin/mysqld-3307(_ZN4JOIN4execEv+0x26c)[0x6ceacc]
/usr/sbin/mysqld-3307[0x715049]
/usr/sbin/mysqld-3307(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x71636c]
/usr/sbin/mysqld-3307(_Z13handle_selectP3THDP13select_resultm+0x175)[0x716575]
/usr/sbin/mysqld-3307[0x6f02a9]
/usr/sbin/mysqld-3307(_Z21mysql_execute_commandP3THD+0x34ae)[0x6f53de]
/usr/sbin/mysqld-3307(_Z11mysql_parseP3THDPcjP12Parser_state+0x318)[0x6f8fa8]
/usr/sbin/mysqld-3307(_Z16dispatch_command19enum_server_commandP3THDPcj+0x11b6)[0x6fa9d6]
/usr/sbin/mysqld-3307(_Z10do_commandP3THD+0xd7)[0x6fc287]
/usr/sbin/mysqld-3307(_Z24do_handle_one_connectionP3THD+0x116)[0x6c3c86]
/usr/sbin/mysqld-3307(handle_one_connection+0x45)[0x6c3d65]
/usr/sbin/mysqld-3307(pfs_spawn_thread+0x126)[0xadcad6]
/lib64/libpthread.so.0[0x2b277221c83d]
/lib64/libc.so.6(clone+0x6d)[0x2b27734dff8d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (28223e20): select connector2_.id as id11_, connector2_.name as name11_, connector2_.flags as flags11_, connector2_.class_name as class4_11_, connector2_.description as descript5_11_, connector2_.domain_fk as domain6_11_, connector2_.post_query as post7_11_, connector2_.starting_connector_msg as starting8_11_, connector2_.url as url11_ from pd_campaign pdcampaign0_ inner join pd_campaign_connectors connectors1_ on pdcampaign0_.id=connectors1_.pd_campaign_fk inner join connector connector2_ on connectors1_.connector_fk=connector2_.id where pdcampaign0_.id=10359600000647 order by connector2_.id ASC
Connection ID (thread ID): 251076
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
150217 06:02:03 mysqld_safe Number of processes running now: 0
150217 06:02:03 mysqld_safe mysqld restarted

How to repeat:
I was able to reproduce pretty similar (but not exactly the same) behavior simply be switching /tmp to read-only mode. I.e. 

# chmod 0555 tmp
# mysql -e "select ..."
# and we got a backtrace :)

20:13:41 UTC - mysqld got signal 11 ;
...
Thread pointer: 0x2b5ff80444a0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2b5ff55780a0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x90f5e5]
/usr/sbin/mysqld(handle_fatal_signal+0x3e8)[0x671e78]
/lib64/libpthread.so.0[0x2b5ed2ad8ca0]
/usr/sbin/mysqld(mi_extra+0x2b)[0xb088fb]
/usr/sbin/mysqld(_ZN13QEP_tmp_table8end_sendEv+0x49)[0x6d1e89]
/usr/sbin/mysqld(_Z13sub_select_opP4JOINP13st_join_tableb+0x75)[0x6d25a5]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x284)[0x6ceae4]
/usr/sbin/mysqld[0x715049]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x71636c]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x716575]
/usr/sbin/mysqld[0x6f02a9]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x34ae)[0x6f53de]
...  

Suggested fix:
It would be nice to write a message in logs and return error to one particular query in case of an error with a temporary file instead of crashing the whole servers.
[18 Feb 2015 21:10] Alexander Kralin
Backtrace from core file:

(gdb) bt
#0  0x00002b2772221e32 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000671e9a in handle_fatal_signal (sig=11) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/signal_handler.cc:248
#2  <signal handler called>
#3  handler::ha_statistic_increment (this=0x2ab90cb0, offset=&system_status_var::ha_write_count) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/handler.cc:2387
#4  0x0000000000aee660 in ha_myisam::write_row (this=0x2ab90cb0, buf=0x2e306010 "\202|-G\bl\t") at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/storage/myisam/ha_myisam.cc:814
#5  0x0000000000591aa6 in handler::ha_write_row (this=0x2ab90cb0, buf=0x2e306010 "\202|-G\bl\t") at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/handler.cc:7205
#6  0x00000000006cd831 in end_write (join=0x207ec030, join_tab=0x207f5240, end_of_records=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:2994
#7  0x00000000006d1b72 in evaluate_join_record (join=0x207ec030, join_tab=0x207f4f40) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:1598
#8  0x00000000006d2202 in sub_select (join=0x207ec030, join_tab=0x207f4f40, end_of_records=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:1273
#9  0x00000000006d1b72 in evaluate_join_record (join=0x207ec030, join_tab=0x207f4c40) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:1598
#10 0x00000000006d2202 in sub_select (join=0x207ec030, join_tab=0x207f4c40, end_of_records=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:1273
#11 0x00000000006ceacc in do_select (this=0x207ec030) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:930
#12 JOIN::exec (this=0x207ec030) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:191
#13 0x0000000000715049 in mysql_execute_select (thd=0x2b28c4cf1650, select_lex=0x2b28c4cf3e70, free_join=true) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_select.cc:1100
#14 0x000000000071636c in mysql_select (thd=0x2b28c4cf1650, tables=<value optimized out>, wild_num=<value optimized out>, fields=<value optimized out>, conds=<value optimized out>, order=<value optimized out>, group=0x2b28c4cf3f70,
    having=0x0, select_options=2148272640, result=0x2b28e06ebe18, unit=0x2b28c4cf3828, select_lex=0x2b28c4cf3e70) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_select.cc:1221
#15 0x0000000000716575 in handle_select (thd=0x2b28c4cf1650, result=0x2b28e06ebe18, setup_tables_done_option=0) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_select.cc:110
#16 0x00000000006f02a9 in execute_sqlcom_select (thd=0x2b28c4cf1650, all_tables=0x28224f70) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:5094
#17 0x00000000006f53de in mysql_execute_command (thd=0x2b28c4cf1650) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:2642
#18 0x00000000006f8fa8 in mysql_parse (thd=0x2b28c4cf1650,
    rawbuf=0x28223e20 "select connector2_.id as id11_, connector2_.name as name11_, connector2_.flags as flags11_, connector2_.class_name as class4_11_, connector2_.description as descript5_11_, connector2_.domain_fk as dom"...,
    length=<value optimized out>, parser_state=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:6235
#19 0x00000000006fa9d6 in dispatch_command (command=COM_QUERY, thd=0x2b28c4cf1650, packet=0x2b28c5a90361 "", packet_length=591)
    at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:1334
#20 0x00000000006fc287 in do_command (thd=0x2b28c4cf1650) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:1036
#21 0x00000000006c3c86 in do_handle_one_connection (thd_arg=0x2b28c4cf1650) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_connect.cc:982
#22 0x00000000006c3d65 in handle_one_connection (arg=0x2b28c7386bf0) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_connect.cc:898
#23 0x0000000000adcad6 in pfs_spawn_thread (arg=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/storage/perfschema/pfs.cc:1858
#24 0x00002b277221c83d in start_thread () from /lib64/libpthread.so.0
#25 0x00002b27734dff8d in clone () from /lib64/libc.so.6
[18 Feb 2015 21:10] Alexander Kralin
(gdb) fr 3
#3  handler::ha_statistic_increment (this=0x2ab90cb0, offset=&system_status_var::ha_write_count) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/handler.cc:2387
2387    /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/handler.cc: No such file or directory.
        in /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/handler.cc
(gdb) p table
$1 = (TABLE *) 0x0

(gdb) fr 6
#6  0x00000000006cd831 in end_write (join=0x207ec030, join_tab=0x207f5240, end_of_records=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc:2994
2994    /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc: No such file or directory.
        in /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_executor.cc
(gdb) p join_tab->table->s.path
$2 = {str = 0x24ca1630 "/tmp/#sql_61d0_0", length = 16}

(gdb) fr 18
#18 0x00000000006f8fa8 in mysql_parse (thd=0x2b28c4cf1650,
    rawbuf=0x28223e20 "select connector2_.id as id11_, connector2_.name as name11_, connector2_.flags as flags11_, connector2_.class_name as class4_11_, connector2_.description as descript5_11_, connector2_.domain_fk as dom"...,
    length=<value optimized out>, parser_state=<value optimized out>) at /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc:6235
6235    /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc: No such file or directory.
        in /export/home/pb2/build/sb_0-10188268-1378799579.95/rpm/BUILD/mysql-5.6.14/mysql-5.6.14/sql/sql_parse.cc
(gdb) set print elements 1000
(gdb) p thd->query_string
$3 = {string = {
    str = 0x28223e20 "select connector2_.id as id11_, connector2_.name as name11_, connector2_.flags as flags11_, connector2_.class_name as class4_11_, connector2_.description as descript5_11_, connector2_.domain_fk as domain6_11_, connector2_.post_query as post7_11_, connector2_.starting_connector_msg as starting8_11_, connector2_.url as url11_ from pd_campaign pdcampaign0_ inner join pd_campaign_connectors connectors1_ on pdcampaign0_.id=connectors1_.pd_campaign_fk inner join connector connector2_ on connectors1_.connector_fk=connector2_.id where pdcampaign0_.id=10359600000647 order by connector2_.id ASC", length = 591}, cs = 0x12b4bc0}
(gdb) p thd->db
$5 = 0x2b28c67b0cd0 "PARTITION_DB_103596"
(gdb) p thd->start_time
$2 = {tv_sec = 1424181601, tv_usec = 983553}
# 2015-02-17 06:00:01

Explain plain of the query which provokes this crash:

explain select connector2_.id as id11_, connector2_.name as name11_, connector2_.flags as flags11_, connector2_.class_name as class4_11_, connector2_.description as descript5_11_, connector2_.domain_fk as domain6_11_, connector2_.post_query as post7_11_, connector2_.starting_connector_msg as starting8_11_, connector2_.url as url11_ from pd_campaign pdcampaign0_ inner join pd_campaign_connectors connectors1_ on pdcampaign0_.id=connectors1_.pd_campaign_fk inner join connector connector2_ on connectors1_.connector_fk=connector2_.id where pdcampaign0_.id=10359600000647 order by connector2_.id ASC;
+----+-------------+--------------+--------+---------------------------------------------+---------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table        | type   | possible_keys                               | key     | key_len | ref                                           | rows | Extra                                        |
+----+-------------+--------------+--------+---------------------------------------------+---------+---------+-----------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pdcampaign0_ | const  | PRIMARY                                     | PRIMARY | 8       | const                                         |    1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | connectors1_ | ref    | PRIMARY,fk_pd_campaign_connectors_connector | PRIMARY | 8       | const                                         |    1 | Using index                                  |
|  1 | SIMPLE      | connector2_  | eq_ref | PRIMARY                                     | PRIMARY | 8       | PARTITION_DB_103596.connectors1_.connector_fk |    1 | NULL                                         |
+----+-------------+--------------+--------+---------------------------------------------+---------+---------+-----------------------------------------------+------+----------------------------------------------+
3 rows in set (0.00 sec)
[18 Feb 2015 21:44] MySQL Verification Team
Please try 5.6.23.  This stack trace seems like a known bug listed in change history.

http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-23.html
"For failure to create a temporary table due to being out of file descriptors, the server exited rather than returning an error. (Bug #18948649)"

Note the "out of file descriptors" could also be something else, like out of disk space.
[19 Feb 2015 0:14] Alexander Kralin
Shane, thank for the quick review. Going to try 5.6.23 .
[20 Feb 2015 21:57] Alexander Kralin
Hello!

Just verified behavior with 5.6.23. This version does not crash if /tmp inaccessible and returns error as expected.

ERROR 1 (HY000) at line 1: Can't create/write to file '/database/3307/tmp/#sql_432d_0.MYI' (Errcode: 13 - Permission denied)

Looks like we can close the feature request.

Thank you for the help!
[29 Mar 2015 4:37] MySQL Verification Team
closing, as 5.6.23 solves the issue as expected.