Bug #50589 | Server hang on a query evaluated using a temporary table | | |
---|---|---|---|
Submitted: | 25 Jan 2010 12:37 | Modified: | 8 Jul 2010 16:11 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Locking | Severity: | S2 (Serious) |
Version: | next-mr, next-mr-4284 | OS: | Any |
Assigned to: | Konstantin Osipov | CPU Architecture: | Any |
[25 Jan 2010 12:37]
Philip Stoev
[25 Jan 2010 12:45]
Philip Stoev
To reproduce:

Step 1. With the RQG, issue the following command:

```
perl runall.pl --grammar=conf/WL5004_sql.yy --gendata=conf/WL5004_data.zz --basedir=/build/bzr/mysql-next-mr --mysqld=--enable-performance-schema --mysqld=--secure-file-priv=/tmp --queries=100K
```

This provides the background load on the server. This grammar contains a lot of DDL, but smaller-scale hangs have been observed even with a simple, DML-only grammar. Logging to table is turned on by default. The RQG will print various messages, and the workload itself may cause a crash or a deadlock; you can disregard all of that output.

Step 2. While Step 1 is running, issue the following query from a mysql client:

```sql
SELECT * FROM (INFORMATION_SCHEMA.`GLOBAL_VARIABLES` AS table1
  RIGHT OUTER JOIN performance_schema.`EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME` AS table2
  ON (table2.`MAX_TIMER_WAIT` = table1.`VARIABLE_VALUE`))
WHERE ((table2.`SUM_TIMER_WAIT` >= table2.`SUM_TIMER_WAIT` OR table1.`VARIABLE_VALUE` > 0)
   OR table2.`MAX_TIMER_WAIT` > table2.`EVENT_NAME`)
ORDER BY table2.`THREAD_ID`, table2.`SUM_TIMER_WAIT` LIMIT 169;
```

This query will take up to 40 seconds.

Step 3. While Step 2 is running, issue SHOW PROCESSLIST from a separate mysql client. You will observe that you either cannot connect at all, or that SHOW PROCESSLIST takes up to 20 seconds, for a command that generally executes instantaneously.
[25 Jan 2010 12:59]
Philip Stoev
There is a query that causes the hang and uses performance_schema tables only:

```sql
SELECT * FROM (performance_schema.`EVENTS_WAITS_SUMMARY_BY_THREAD_BY_EVENT_NAME` AS table1
  RIGHT JOIN performance_schema.`EVENTS_WAITS_HISTORY` AS table2
  ON (table2.`FLAGS` = table1.`MAX_TIMER_WAIT`))
WHERE ((NOT table1.`AVG_TIMER_WAIT` <= table1.`AVG_TIMER_WAIT`
        AND table2.`OBJECT_SCHEMA` <> '2005-03-11 08:29:55')
   OR table2.`THREAD_ID` >= table1.`SUM_TIMER_WAIT`)
ORDER BY table2.`FLAGS`, table2.`SPINS`, table1.`EVENT_NAME` LIMIT 151;
```
[25 Jan 2010 13:50]
Philip Stoev
It turns out performance_schema is not required at all. If the participating tables are cloned with CREATE ... LIKE and INSERT ... SELECT, the issue is reproducible with MyISAM as well. Also, the background load can run with even --threads=1 and the issue will still show up.
[25 Jan 2010 14:15]
Philip Stoev
bug50589.dump
Attachment: bug50589.dump (application/octet-stream, text), 50.92 KiB.
[25 Jan 2010 14:17]
Philip Stoev
To reproduce on a tree that does not contain performance_schema, load the attached SQL dump into a separate database and then use the following SELECT:

```sql
SELECT * FROM (t1 AS table1
  RIGHT OUTER JOIN t2 AS table2
  ON (table2.`MAX_TIMER_WAIT` = table1.`VARIABLE_VALUE`))
WHERE ((table2.`SUM_TIMER_WAIT` >= table2.`SUM_TIMER_WAIT` OR table1.`VARIABLE_VALUE` > 0)
   OR table2.`MAX_TIMER_WAIT` > table2.`EVENT_NAME`)
ORDER BY table2.`THREAD_ID`, table2.`SUM_TIMER_WAIT` LIMIT 169;
```
[26 Jan 2010 9:08]
Philip Stoev
The longest wait occurs here:

```
mysql> select * from EVENTS_WAITS_HISTORY_LONG where thread_id = 31 order by timer_wait desc limit 1\G
*************************** 1. row ***************************
            THREAD_ID: 31
             EVENT_ID: 174
           EVENT_NAME: wait/io/file/myisam/dfile
               SOURCE: mi_delete_table.c:69
          TIMER_START: 23409590088000
            TIMER_END: 32260511409200
           TIMER_WAIT: 8850921321200
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: /build/bzr/mysql-next-mr/mysql-test/var/tmp/#sql_1e0f_1.MYD
          OBJECT_TYPE: FILE
OBJECT_INSTANCE_BEGIN: 139737476289808
     NESTING_EVENT_ID: NULL
            OPERATION: delete
      NUMBER_OF_BYTES: NULL
                FLAGS: 0
1 row in set (0.03 sec)
```

Strace shows the following:

```
[pid 7748] 11:00:04.561035 unlink("/build/bzr/mysql-next-mr/mysql-test/var/tmp/#sql_1e0f_1.MYD" <unfinished ...>
[pid 7748] 11:00:13.736838 <... unlink resumed> ) = 0 <9.175728>
```

In other words, removing the temporary file produced by the query takes many seconds, during which time locks are held that block the rest of the server.
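The effect measured above, a file removal that is normally instantaneous but can stall for seconds, is easy to observe in isolation. The following sketch (illustrative only; the file name suffix and size are arbitrary choices, and real timings depend entirely on the filesystem) times an `unlink()` the same way the strace output does:

```python
import os
import tempfile
import time

# Create a moderately large temporary file, then time its removal.
# On a local filesystem the unlink is fast; on NFS or a heavily
# loaded disk it can take seconds, which is exactly the window
# during which the server in this bug held THR_LOCK_open.
fd, path = tempfile.mkstemp(suffix=".MYD")
try:
    os.write(fd, b"\0" * (16 * 1024 * 1024))  # 16 MiB of data
finally:
    os.close(fd)

start = time.monotonic()
os.unlink(path)
elapsed = time.monotonic() - start
print(f"unlink took {elapsed:.6f}s")
```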
[26 Jan 2010 11:33]
Jon Olav Hauglid
The IS query is trying to close() a temp file used in the join. While doing this, it holds the THR_LOCK_open mutex. While this happens, there is a chance that a statement (for example ALTER TABLE) from the RQG workload will try to open a table (my_open()). This will cause it to block on THR_LOCK_open while holding LOCK_open. In turn, this can cause SHOW PROCESSLIST to block on LOCK_open when it tries to write the statement to the general query log. I'm only able to reproduce this if I put the temp files on an NFS share, so that accessing the temp files becomes very slow.
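The three-way blocking chain described above can be reproduced with two ordinary mutexes. The sketch below is a generic illustration, not server code; the mutex and function names are borrowed from the report purely for readability, and the sleep durations are arbitrary:

```python
import threading
import time

THR_LOCK_open = threading.Lock()
LOCK_open = threading.Lock()
events = []

def is_query():
    # Holds THR_LOCK_open while doing a slow temp-file close (simulated).
    with THR_LOCK_open:
        events.append("IS query: slow close() of temp file")
        time.sleep(0.5)

def alter_table():
    # Takes LOCK_open, then blocks on THR_LOCK_open (as in my_open()).
    with LOCK_open:
        events.append("ALTER TABLE: holds LOCK_open, waits on THR_LOCK_open")
        with THR_LOCK_open:
            pass

def show_processlist():
    # Blocks on LOCK_open (as when writing to the general query log).
    with LOCK_open:
        events.append("SHOW PROCESSLIST: finally acquired LOCK_open")

threads = [threading.Thread(target=f)
           for f in (is_query, alter_table, show_processlist)]
t0 = time.monotonic()
threads[0].start(); time.sleep(0.1)  # IS query grabs THR_LOCK_open first
threads[1].start(); time.sleep(0.1)  # ALTER grabs LOCK_open, then stalls
threads[2].start()                   # SHOW PROCESSLIST stalls on LOCK_open
for t in threads:
    t.join()
stall = time.monotonic() - t0
print(f"SHOW PROCESSLIST was held up for ~{stall:.2f}s by one slow file operation")
```

The point of the sketch is that the innocent third thread never touches the slow file at all; it is delayed purely by the lock chain.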
[26 Jan 2010 12:57]
Philip Stoev
If logging to CSV is enabled, essentially all statements are affected, including SHOW PROCESSLIST, since every statement has to open the CSV log file and write to it. If logging to CSV is not enabled, DML and DDL statements may be affected.
[16 Feb 2010 11:29]
Konstantin Osipov
WL#3983
[15 Jun 2010 15:13]
Konstantin Osipov
Queued into trunk-runtime:

```
3054 Konstantin Osipov 2010-06-11
WL#5419 "LOCK_open scalability: make tdc_refresh_version an atomic counter"

Split the large LOCK_open section in open_table().
Do not call open_table_from_share() under LOCK_open.
Remove thd->version.

This fixes
  Bug#50589 "Server hang on a query evaluated using a temporary table"
  Bug#51557 "LOCK_open and kernel_mutex are not happy together"
  Bug#49463 "LOCK_table and innodb are not nice when handler instances
             are created"

This patch has effect on storage engines that rely on the ha_open()
PSEA method being called under LOCK_open. In particular:
1) NDB is broken and left unfixed. NDB relies on LOCK_open being kept
   as part of ha_open(), since it uses auto-discovery. While previously
   the NDB open code was race-prone, now it simply fails on asserts.
2) The HEAP engine had a race in ha_heap::open() where a share for the
   same table could be added twice to the list of shares, or a dangling
   reference to a share could be stored in a HEAP handler. This patch
   aims to address this problem by 'pinning' the newly created share in
   the internal HEAP engine share list until at least one handler
   instance is created using that share.

@ include/heap.h
  Add members to HP_CREATE_INFO. Declare heap_release_share().
@ sql/lock.cc
  Remove thd->version, use thd->open_tables->s->version instead.
@ sql/repl_failsafe.cc
  Remove thd->version.
@ sql/sql_base.cc
  - close_thread_table(): move handler cleanup code outside the
    critical section protected by LOCK_open.
  - Remove thd->version.
  - Split the large critical section in open_table() that opens a new
    table from share and is protected by LOCK_open into 2 critical
    sections, thus reducing the critical path.
  - Make check_if_table_exists() acquire LOCK_open internally.
  - Use thd->open_tables->s->version instead of thd->refresh_version
    to make sure that all tables in thd->open_tables are in the same
    refresh series.
@ sql/sql_base.h
  Add declaration for check_if_table_exists().
@ sql/sql_class.cc
  Remove init_open_tables_state(), it's now equal to
  reset_open_tables_state().
@ sql/sql_class.h
  Remove thd->version, THD::init_open_tables_state().
@ sql/sql_plugin.cc
  Use table->m_needs_reopen to mark the table as stale rather than
  manipulate thd->version, which is no more.
@ sql/sql_udf.cc
  Use table->m_needs_reopen to mark the table as stale rather than
  manipulate thd->version, which is no more.
@ sql/table.h
  Remove an unused variable.
@ sql/tztime.cc
  Use table->m_needs_reopen to mark the table as stale rather than
  manipulate thd->version, which is no more.
@ storage/heap/CMakeLists.txt
  Add heap tests to cmake build files.
@ storage/heap/ha_heap.cc
  Fix a race where ha_heap::ha_open() could insert two HP_SHARE objects
  into the internal share list, store a dangling reference to a share
  in a ha_heap instance, or wrongly set implicit_emptied.
@ storage/heap/hp_create.c
  Optionally pin a newly created share in the list of shares by
  increasing its open_count. This is necessary to make sure that a
  newly created share doesn't disappear while a HP_INFO object is
  being created to reference it.
@ storage/heap/hp_open.c
  When adding a new HP_INFO object to the list of objects in the heap
  share, make sure the open_count is not increased twice.
@ storage/heap/hp_test1.c
  Adjust the test to new function signatures.
@ storage/heap/hp_test2.c
  Adjust the test to new function signatures.
```
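The core idea of the patch, doing the expensive open outside the mutex and taking the lock only to check and publish the result, can be sketched as follows. This is a generic illustration of the pattern, not the actual sql_base.cc code; the function and cache names are invented for the example:

```python
import threading

LOCK_open = threading.Lock()
table_cache = {}

def expensive_open_from_share(name):
    # Stand-in for open_table_from_share(): reading the .frm,
    # creating a handler instance, loading index statistics, etc.
    return {"name": name, "handler": object()}

def open_table_old(name):
    # Old pattern: the entire open happens under LOCK_open, so one
    # slow file operation stalls every other session on the mutex.
    with LOCK_open:
        if name not in table_cache:
            table_cache[name] = expensive_open_from_share(name)
        return table_cache[name]

def open_table_new(name):
    # New pattern: check the cache under the lock, do the expensive
    # work unlocked, then re-acquire the lock to publish the result.
    with LOCK_open:
        cached = table_cache.get(name)
    if cached is not None:
        return cached
    table = expensive_open_from_share(name)  # no mutex held here
    with LOCK_open:
        # Another session may have won the race; keep its copy.
        return table_cache.setdefault(name, table)

print(open_table_new("t1")["name"])
```

The trade-off is that two sessions may occasionally perform the expensive open concurrently for the same table, with one result discarded, in exchange for a much shorter critical path.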
[22 Jun 2010 13:07]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100622130139-u05awgya93zvbsop) (version source revid:marko.makela@oracle.com-20100603095032-v5ptkkzt1bhz0m1d) (merge vers: 5.1.48) (pib:16)
[22 Jun 2010 13:09]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100622130623-r7yhm89fz9n5t9nb) (version source revid:alik@sun.com-20100622130528-187gd949sa9b6pa6) (pib:16)
[8 Jul 2010 16:11]
Paul DuBois
Noted in 5.5.5 changelog. Previously, the server held a global mutex while performing file operations such as deleting an .frm or data file, or reading table metadata from an .frm file or index statistics from a data file. Now the mutex is not held for these operations. Instead, the server uses metadata locks.
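The shift described in the changelog note, from one global mutex to locks scoped to individual objects, can be pictured as a map of named locks. This sketch is illustrative only; the real MDL subsystem in the server is far richer (lock types, wait-for graphs, deadlock detection), and the class and method names here are invented:

```python
import threading
from collections import defaultdict

class PerObjectLocks:
    """A map of named locks: file operations on different tables no
    longer serialize on a single global mutex (illustrative sketch)."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the map only
        self._locks = defaultdict(threading.Lock)

    def acquire(self, name):
        with self._guard:                         # held only briefly
            lock = self._locks[name]
        lock.acquire()                            # slow work happens here,
        return lock                               # blocking only this object

mdl = PerObjectLocks()
a = mdl.acquire("db1.t1")
b = mdl.acquire("db1.t2")  # does not block: a different object
a.release()
b.release()
print("independent tables locked concurrently")
```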