Bug #76346 | innodb_thread_concurrency=1 and queries using intrinsic temp tables, causes hang | ||
---|---|---|---|
Submitted: | 16 Mar 2015 23:04 | Modified: | 12 Jun 2015 9:34 |
Reporter: | Roel Van de Paar | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.7.6-m16, 5.7.8 | OS: | Any |
Assigned to: | Naga Satyanarayana Bodapati | CPU Architecture: | Any |
Tags: | query shutdown hang |
[16 Mar 2015 23:04]
Roel Van de Paar
[16 Mar 2015 23:09]
Roel Van de Paar
Error log
Attachment: mysqld.err (application/octet-stream, text), 35.10 KiB.
[16 Mar 2015 23:10]
Roel Van de Paar
Ah, looks like some sort of long semaphore wait issue (ref error log) 2015-03-16T22:51:57.845878Z 0 [Note] Event Scheduler: Purging the queue. 3 events 2015-03-16T23:02:36.908074Z 0 [Warning] InnoDB: A long semaphore wait: --Thread 140046824883968 has waited at trx0purge.cc line 153 for 324.00 seconds the semaphore: Mutex at 0x7f5f3f67b7a0 created file trx0rseg.cc line 211, lock var 1 Last time reserved in file /bzr/mysql-5.7.6-m16_dbg/storage/innobase/trx/trx0trx.cc line 1528
[16 Mar 2015 23:10]
Roel Van de Paar
(!) Note that this run was only a single threaded run (i.e. a single client/thread) (!)
[16 Mar 2015 23:11]
Roel Van de Paar
gdb trace, taken early on in the wait
Attachment: gdb_std.txt (text/plain), 33.73 KiB.
[16 Mar 2015 23:11]
Roel Van de Paar
gdb full trace, taken somewhat later
Attachment: gdb_full.txt (text/plain), 81.65 KiB.
[16 Mar 2015 23:12]
Roel Van de Paar
gdb trace, taken later again (diffs from previous gdb_std.txt)
Attachment: gdb_std_later.txt (text/plain), 33.43 KiB.
[19 Mar 2015 22:58]
Roel Van de Paar
Guys, I think this issue is really serious and should get high prio attention. It's happening all over the 5.7.6-m16 build from what I can see. Only takes 1-3 trials to have it happen.
[19 Mar 2015 23:56]
Roel Van de Paar
Stacks from another example
Attachment: gdb_2_std.txt (text/plain), 30.69 KiB.
[19 Mar 2015 23:57]
Roel Van de Paar
Full stacks from the other example mentioned in last comment
Attachment: gdb_2_full.txt (text/plain), 31.71 KiB.
[19 Mar 2015 23:59]
Roel Van de Paar
Added another example. Error log not showing any long semaphore waits yet (13 min lockup); ------------ 2015-03-19T23:44:03.203802Z 0 [Note] Shutting down slave threads 2015-03-19T23:44:05.203885Z 0 [Note] Forcefully disconnecting 2 remaining clients 2015-03-19T23:44:05.203896Z 0 [Warning] /sda/MS-mysql-5.7.6-m16-linux-x86_64-debug/bin/mysqld: Forcing close of thread 6 user: 'root' 2015-03-19T23:44:05.203920Z 0 [Note] Event Scheduler: Killing the scheduler thread, thread id 1 2015-03-19T23:44:05.203927Z 0 [Note] Event Scheduler: Waiting for the scheduler thread to reply 2015-03-19T23:44:05.203989Z 0 [Note] Event Scheduler: Stopped 2015-03-19T23:44:05.203998Z 0 [Note] Event Scheduler: Purging the queue. 0 events ------------ <EOF> There were other errors in the log, from executing various testcase sql's. Server was started clean/with freshly init data dir (--init)
[20 Mar 2015 0:00]
Roel Van de Paar
Again the trial was single threaded. Notice the mutex issues in the gdb stacks.
[20 Mar 2015 8:50]
MySQL Verification Team
threads hanging around in ut_delay suggest similar cause to bug #76402
[20 Mar 2015 9:28]
Roel Van de Paar
Agreed re: ut_delay. I removed the options discussed there, but the issue is still present, though apparently less regular (TBD). Attaching new (and indeed different looking) stacks from a hang.
[20 Mar 2015 9:29]
Roel Van de Paar
gdb_std.txt
Attachment: gdb_std.txt (text/plain), 23.31 KiB.
[20 Mar 2015 9:29]
Roel Van de Paar
gdb_std.txt
Attachment: gdb_std.txt (text/plain), 23.31 KiB.
[20 Mar 2015 9:29]
Roel Van de Paar
gdb_full.txt
Attachment: gdb_full.txt (text/plain), 99.59 KiB.
[20 Mar 2015 10:16]
Roel Van de Paar
The 'long semaphore wait' seems unrelated.
[20 Mar 2015 10:17]
Roel Van de Paar
Another set of stacks, similar to last set (after options were removed) gdb_std.txt
Attachment: gdb_std.txt (text/plain), 23.32 KiB.
[20 Mar 2015 10:17]
Roel Van de Paar
matching gdb_full.txt
Attachment: gdb_full.txt (text/plain), 99.99 KiB.
[23 Mar 2015 22:08]
Roel Van de Paar
First testcase; DROP DATABASE test;CREATE DATABASE test;USE test; SET @@session.max_heap_table_size=0; SET @@global.innodb_thread_concurrency=1; SELECT name,n_cols,file_format,row_format FROM information_schema.innodb_sys_tables; select convert_tz(0,'a','a'); Client will hang on last query. Pressing CTRL+C will not help; mysql> select convert_tz(0,'a','a'); ^C^C -- query aborted ^C^C -- query aborted ^C^C -- query aborted ^C^C -- query aborted Press CTRL+Z to exit and kill the PID. When this hang is occurring, one can check status with another client: mysql> show processlist; +----+------+-----------+------+---------+------+-------------+------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +----+------+-----------+------+---------+------+-------------+------------------------------+ | 2 | root | localhost | test | Query | 540 | System lock | select convert_tz(0,'a','a') | | 29 | root | localhost | test | Query | 0 | starting | show processlist | +----+------+-----------+------+---------+------+-------------+------------------------------+ 2 rows in set (0.00 sec) And, trying to shutdown server in this case will hang also. Looking at my previous stack in this issue (http://bugs.mysql.com/file.php?id=22521&bug_id=76346) I notice: #16 0x00000000010a0bd4 in dispatch_command (command=COM_QUERY, thd=0x7f0a9001a000, packet=0x7f0a90512011 "set @@session.time_zone='UTC';", packet_length=30) at /bzr/mysql-5.7.6-m16_dbg/sql/sql_parse.cc:1249 So this would have been another occurrence, likely of the same problem. Issue looks clearly related to timezones, maybe related to nano (see stacks). Testcase is 100% reproducible so will not add additional stacks for this testcase, devs can break in with gdb directly.
[23 Mar 2015 22:10]
Roel Van de Paar
#16 0x00000000010a0bd4 in dispatch_command (command=COM_QUERY, thd=0x7fd14f41a000, packet=0x7fd14f446011 "SET @@time_zone = '6';", packet_length=22) at /bzr/mysql-5.7.6-m16_dbg/sql/sql_parse.cc:1249 From another previously uploaded stack. Again, timezone related.
[23 Mar 2015 22:18]
Roel Van de Paar
Serious regression in 5.7.6-m16. The issue does NOT reproduce in 5.7.5-m15. 5.7.5-m15 output is immediate; mysql> select convert_tz(0,'a','a'); +-----------------------+ | convert_tz(0,'a','a') | +-----------------------+ | NULL | +-----------------------+ 1 row in set (0.00 sec)
[24 Mar 2015 9:33]
Roel Van de Paar
Another testcase DROP DATABASE test;CREATE DATABASE test;USE test; SET @@global.innodb_thread_concurrency=1; SELECT table_catalog, table_schema, table_name, column_name FROM information_schema.columns WHERE table_catalog IS NULL OR table_catalog <> 'def'; HELP '%function_2';
[24 Mar 2015 9:34]
Roel Van de Paar
Last testcase seems to indicate it's not TZ related.
[24 Mar 2015 9:40]
MySQL Verification Team
what seems important is that some sql queries cause innodb tables in the mysql database to be accessed!
[24 Mar 2015 11:02]
MySQL Verification Team
Hello Roel, Thank you for the report and test case. Observed similar behavior with 5.7.8(debug/release builds). Thanks, Umesh
[24 Mar 2015 11:04]
MySQL Verification Team
// 5.7.8 - pmp [umshastr@hod03]~/bugs: ./pmp.sh 10 __io_getevents_0_4,os_aio_linux_collect,os_aio_linux_handle,fil_aio_wait,io_handler_thread,start_thread,clone 3 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,srv_worker_thread,start_thread,clone 1 sigwait,signal_hand,pfs_spawn_thread,start_thread,clone 1 sigwaitinfo,timer_notify_thread_func,pfs_spawn_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,srv_purge_coordinator_suspend,srv_purge_coordinator_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,buf_resize_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,buf_dump_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,my_cond_wait,inline_mysql_cond_wait,compress_gtid_table,pfs_spawn_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_monitor_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_error_monitor_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,pc_sleep_if_needed,buf_flush_page_cleaner_coordinator,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,lock_wait_timeout_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,dict_stats_thread,start_thread,clone 1 poll,vio_io_wait,vio_socket_io_wait,vio_ssl_read,net_read_raw_loop,net_read_packet_header,net_read_packet,my_net_read,Protocol_classic::read_packet,Protocol_classic::get_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone 1 poll,Mysqld_socket_listener::listen_for_connection_event,Connection_acceptor<Mysqld_socket_listener>::connection_event_loop,mysqld_main,main 1 nanosleep,os_thread_sleep,srv_master_sleep,srv_master_thread,start_thread,clone 1 nanosleep,os_thread_sleep,srv_conc_enter_innodb_with_atomics,srv_conc_enter_innodb,innobase_srv_conc_enter_innodb,ha_innobase::index_read,ha_innobase::index_first,ha_innobase::rnd_next,handler::ha_rnd_next,rr_sequential,search_topics,mysqld_help,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone 1 [umshastr@hod03]~/bugs: ./pmp.sh 10 __io_getevents_0_4,os_aio_linux_collect,os_aio_linux_handle,fil_aio_wait,io_handler_thread,start_thread,clone 3 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,srv_worker_thread,start_thread,clone 1 sigwaitinfo,timer_notify_thread_func,pfs_spawn_thread,start_thread,clone 1 pthread_join,my_thread_join,mysqld_main,main 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,srv_purge_coordinator_suspend,srv_purge_coordinator_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,buf_resize_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,os_event::wait,os_event::wait_low,os_event_wait_low,buf_dump_thread,start_thread,clone 1 pthread_cond_wait@@GLIBC_2.3.2,native_cond_wait,my_cond_wait,inline_mysql_cond_wait,Global_THD_manager::wait_till_no_thd,close_connections,signal_hand,pfs_spawn_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_monitor_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,srv_error_monitor_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,pc_sleep_if_needed,buf_flush_page_cleaner_coordinator,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,lock_wait_timeout_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,ib_wqueue_timedwait,fts_optimize_thread,start_thread,clone 1 pthread_cond_timedwait@@GLIBC_2.3.2,os_event::timed_wait,os_event::wait_time_low,os_event_wait_time_low,dict_stats_thread,start_thread,clone 1 nanosleep,os_thread_sleep,srv_master_sleep,srv_master_thread,start_thread,clone 1 nanosleep,os_thread_sleep,srv_conc_enter_innodb_with_atomics,srv_conc_enter_innodb,innobase_srv_conc_enter_innodb,ha_innobase::index_read,ha_innobase::index_first,ha_innobase::rnd_next,handler::ha_rnd_next,rr_sequential,search_topics,mysqld_help,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone 1 nanosleep,os_thread_sleep,srv_conc_enter_innodb_with_atomics,srv_conc_enter_innodb,innobase_srv_conc_enter_innodb,ha_innobase::index_read,ha_innobase::index_first,ha_innobase::rnd_next,handler::ha_rnd_next,rr_sequential,join_init_read_record,sub_select,do_select,JOIN::exec,handle_query,execute_sqlcom_select,mysql_execute_command,mysql_parse,dispatch_command,do_command,handle_connection,pfs_spawn_thread,start_thread,clone 1
[8 Apr 2015 19:02]
Daniel Price
Posted by developer: Fixed as of the upcoming 5.7.8, 5.8.0 releases, and here's the changelog entry: When "innodb_thread_concurrency=1", queries on intrinsic temporary tables caused other sessions to hang. Queries on intrinsic temporary tables should not increment the number of active threads. Thank you for the bug report.
[12 Jun 2015 9:34]
Roel Van de Paar
For clarity, this issue is still present in the RC (which is 5.7.7)