Bug #15758 | Holding adaptive search latch in innobase_query_caching_of_table_permitted() | ||
---|---|---|---|
Submitted: | 14 Dec 2005 20:05 | Modified: | 11 Apr 2006 3:04 |
Reporter: | Heikki Tuuri | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.0.16 | OS: | |
Assigned to: | Alexander Ivanov | CPU Architecture: | Any |
[14 Dec 2005 20:05]
Heikki Tuuri
[14 Dec 2005 20:17]
Heikki Tuuri
Jan, please look at this. Heikki
[15 Dec 2005 12:03]
Jan Lindström
I have not been able to repeat his problem with a information given. I have asked from the user if he can add following line of code to ha_innodb.cc to line 987: ut_a(trx->has_search_latch == FALSE); This would crash the server and we would get a stack trace. However, while waiting for answer, I would like to know why following is in the code: if (trx->has_search_latch) { ut_print_timestamp(stderr); sql_print_error("The calling thread is holding the adaptive " "search, latch though calling " "innobase_query_caching_of_table_permitted."); ut_a(trx->has_search_latch == FALSE); } innobase_release_stat_resources(trx); Note that innobase_release_stat_resources will release adaptive search latch if a transaction holds that. Regards, Jan
[15 Dec 2005 12:38]
Heikki Tuuri
Jan, if trx is holding the latch, we might end up in a deadlock. The question is when in execution does MySQL call that function, and why we have not released the statement level resources before that. The comment in ha_innodb.cc says: " /********************************************************************** The MySQL query cache uses this to check from InnoDB if the query cache at the moment is allowed to operate on an InnoDB table. The SQL query must be a non-locking SELECT. The query cache is allowed to operate on certain query only if this function returns TRUE for all tables in the query. If thd is not in the autocommit state, this function also starts a new transaction for thd if there is no active trx yet, and assigns a consistent read view to it if there is no read view yet. Why a deadlock of threads is not possible: the query cache calls this function at the start of a SELECT processing. Then the calling thread cannot be holding any InnoDB semaphores. The calling thread is holding the query cache mutex, and this function will reserver the InnoDB kernel mutex. Thus, the 'rank' in sync0sync.h of the MySQL query cache mutex is above the InnoDB kernel mutex. */ my_bool innobase_query_caching_of_table_permitted( /*======================================*/ " If MySQL only calls this at the start of new SQL statement, why InnoDB has not released the 'statement level resources' before that. InnoDB releases them in ha_innobase::external_lock() when the lock count becomes zero. And it also releases them in ::start_stmt(). Regards, Heikki
[16 Dec 2005 6:20]
Jan Lindström
Additional information received from the user. Last query processed was select count(*) with a subquery. Replication is used in user's installation. Regards, Jan
[20 Dec 2005 9:55]
Jan Lindström
I have not been able to repeat this problem. Additional information requested from the user. Regards, Jan
[22 Dec 2005 5:33]
Heikki Tuuri
Jan, you can add diagnostic code: trx_print() to the error print in upcoming 5.0.18. When some user gets the error again, we will know the query and other interesting detail. Regards, Heikki
[16 Jan 2006 6:19]
Oleksandr Byelkin
Fixed status of the bug after re-assignment.
[1 Mar 2006 12:31]
Konstantin Osipov
Heikki, I'm sorry I can't provide an estimation. I am getting an email whenever this bug is updated. I'm working hard to get close to it.
[2 Mar 2006 19:00]
Chad MILLER
Claimed from Bug-Priority meeting direction.
[28 Mar 2006 3:31]
Pera Kajonpotisuwan
I'm running mysql 5.0.15 and also having the same problem as you guys mentioned. I'm thinking to upgrade to newer version, so could you guys have any comments on this? Please kindly recommend me. PK. pera@321load.com
[28 Mar 2006 16:13]
Gui Gros
I upgraded to 5.0.19 and I am having this problem too, I had to go back to 5.0.18 because the server was crashing. I'll try to make some test and find a way to reproduce. here is what I got in my error log 060324 11:35:42060324 11:35:42 [ERROR] The calling thread is holding the adaptive search, latch though calling innobase_query_caching_of_table_permitted. TRANSACTION 0 3706224, ACTIVE 0 sec, OS thread id 12084 mysql tables in use 2, locked 0 , holds adaptive hash latch MySQL thread id 5, query id 35213 localhost 127.0.0.1 adminuser1 Sending data select count(*) from ( select distinct users.id, userslogins.login,users.BusinessName, users.firstname,users.lastname,users.id as UserId,sites.name as Site,userslogins.id as LoginId from ( ( ( iislogs left join userslogins on iislogs.LoginId = userslogins.Id ) left join users on userslogins.UserId=users.id ) left join sites on sites.id=users.siteid ) where users.id>0 order by users.siteid,users.businessname ASC ) as t 060324 11:35:42InnoDB: Assertion failure in thread 12084 in file .\ha_innodb.cpp line 985 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html InnoDB: about forcing recovery. InnoDB: Thread 27356 stopped in file .\os\os0sync.c line 487 InnoDB: Thread 27320 stopped in file .\sync\sync0arr.c line 126 InnoDB: Thread 26468 stopped in file .\read\read0read.c line 203 060324 11:40:33 [Note] C:\mysql\bin\mysqld.exe: Normal shutdown 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 91 user: 'root' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 90 user: 'adminuser1' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 89 user: 'adminuser1' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 88 user: 'root' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 84 user: 'root' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 71 user: 'BarterService' 060324 11:40:35 [Warning] C:\mysql\bin\mysqld.exe: Forcing close of thread 64 user: 'BarterService' 060324 11:42:30 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... 060324 11:42:30 InnoDB: Starting log scan based on checkpoint at InnoDB: log sequence number 0 3705393139. InnoDB: Doing recovery: scanned up to log sequence number 0 3705393139 InnoDB: Last MySQL binlog file position 0 1355035, file name .\BCIserver1-bin.000252 060324 11:42:31 InnoDB: Started; log sequence number 0 3705393139 060324 11:42:31 [Note] Recovering after a crash using BCIserver1-bin 060324 11:42:31 [Note] Starting crash recovery... 060324 11:42:31 [Note] Crash recovery finished. 060324 11:42:32 [Note] C:\mysql\bin\mysqld.exe: ready for connections.
[29 Mar 2006 12:01]
Heikki Tuuri
Reassigning this to Jan Lindström. Please remove the assertion ut_error that you added to ha_innodb.cc in 5.0.19 which crashes the mysqld server if it prints a warning about the adaptive latch. It was not our intention to crash the server, just to print diagnostic information. Regards, Heikki
[29 Mar 2006 12:19]
Jan Lindström
Applied to 5.0 and 5.1 trees on InnoDB internal svn. Regards, Jan
[29 Mar 2006 15:57]
Heikki Tuuri
Hi! In the latest printout we see SELECT COUNT(*) being performed on a subquery named as 't'. It may be that the query cache code tries to cache BOTH the result of the subquery 't', AND the SELECT COUNT(*) on 't'. This would explain the warnings. InnoDB may acquire the adaptive hash index latch in in the subquery 't', and then print a warning when the query cache for the second time asks about cacheability. We should test inside gdb in which phases MySQL calls query cache routines. Then we should check from semaphore ranks (also the query cache mutex has a rank) if a deadlock of threads is possible. InnoDB should release the adaptive hash index latch soon enough that no higher rank semaphore can be reserved while holding it. select count(*) from ( select distinct users.id, userslogins.login,users.BusinessName, users.firstname,users.lastname,users.id as UserId,sites.name as Site,userslogins.id as LoginId from ( ( ( iislogs left join userslogins on iislogs.LoginId = userslogins.Id ) left join users on userslogins.UserId=users.id ) left join sites on sites.id=users.siteid ) where users.id>0 order by users.siteid,users.businessname ASC ) as t Regards, Heikki
[30 Mar 2006 11:49]
Heikki Tuuri
Assigning this to myself, since I may ne able to find the bug today.
[30 Mar 2006 15:04]
Heikki Tuuri
The reason for the COUNT(*) FROM (SELECT ...) AS t problem is probably solved. MySQL for some reason runs the query in the derived table first, and only AFTER that calls the query cache routines. InnoDB assumes that MySQL first calls the query cache routine and after that runs the query. InnoDB can acquire the adaptive hash latch when running the derived table query, and then print a warning when MySQL calls innobase_query_caching_of_table_permitted(). (gdb) bt #0 row_search_for_mysql (buf=0x89a4840 "\001\002", mode=0, prebuilt=0x42b31068, match_mode=1, direction=1) at row0sel.c:3059 #1 0x08219b22 in ha_innobase::general_fetch (this=0x89a4718, buf=0x0, direction=0, match_mode=0) at ha_innodb.cc:4158 #2 0x081b8900 in join_read_next_same (info=0x0) at sql_select.cc:10204 #3 0x081b77a6 in sub_select (join=0x89acb60, join_tab=0x899c5dc, end_of_records=64) at sql_select.cc:9660 #4 0x081b7993 in evaluate_join_record (join=0x89acb60, join_tab=0x899c478, error=1, report_error=0x0) at sql_select.cc:9770 #5 0x081b7857 in sub_select (join=0x89acb60, join_tab=0x899c478, end_of_records=64) at sql_select.cc:9655 #6 0x081b7392 in do_select (join=0x89acb60, fields=0x0, table=0x89aeb48, procedure=0x0) at sql_select.cc:9418 #7 0x081ab343 in JOIN::exec (this=0x89acb60) at sql_select.cc:1342 #8 0x081ac86a in mysql_select (thd=0x89800c8, rref_pointer_array=0x899aeb8, tables=0x899b2e8, wild_num=0, fields=@0x0, conds=0x899b788, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2424588801, result=0x899bc58, unit=0x899aef8, select_lex=0x899ad98) at sql_select.cc:1885 #9 0x08253590 in mysql_derived_filling (thd=0x89800c8, lex=0x8980104, orig_table_list=0x0) at sql_derived.cc:256 #10 0x082530d4 in mysql_handle_derived (lex=0x8980104, processor=0x82534a0 <mysql_derived_filling>) at sql_derived.cc:57 #11 0x0819ec2b in open_and_lock_tables (thd=0x89800c8, tables=0x899ba68) at sql_class.h:1492 #12 0x08175cba in mysql_execute_command (thd=0x89800c8) at sql_parse.cc:2461 #13 0x0817e610 in mysql_parse (thd=0x89800c8, inBuf=0x899aa20 "select count(*) from (select distinct t1.a, t2.b from t t1, t t2 where t1.a = t2.b) as k", length=144179460) at sql_parse.cc:5628 #14 0x08174483 in dispatch_command (command=144179400, thd=0x89800c8, packet=0x89929f1 "select count(*) from (select distinct t1.a, t2.b from t t1, t t2 where t1.a = t2.b) as k", packet_length=144288288) at sql_parse.cc:1713 #15 0x08173fcd in do_command (thd=0x89800c8) at sql_parse.cc:1514 #16 0x081733a2 in handle_one_connection (arg=0x89800c8) at sql_parse.cc:1158 #17 0x40042b63 in start_thread () from /lib/tls/libpthread.so.0 #18 0x4024c18a in clone () from /lib/tls/libc.so.6 (gdb) c Continuing. (gdb) bt #0 innobase_query_caching_of_table_permitted (thd=0x89800c8, full_name=0x899cef8 "", full_name_len=7, unused=0x899b418) at ha_innodb.cc:972 #1 0x0821e86c in ha_innobase::register_query_cache_table (this=0x899d120, thd=0x899cef8, table_key=0x899cef8 "", key_length=144297720, call_back=0x899cef8, engine_data=0x89800c8) at ha_innodb.h:191 #2 0x082442b5 in Query_cache::ask_handler_allowance (thd=0x89800c8, tables_used=0x899b2e8) at sql_cache.cc:2953 #3 0x08241b59 in Query_cache::store_query (this=0x84ae7e0, thd=0x89800c8, tables_used=0x899ba68) at sql_cache.cc:842 #4 0x08175d00 in mysql_execute_command (thd=0x89800c8) at sql_parse.cc:2493 #5 0x0817e610 in mysql_parse (thd=0x89800c8, inBuf=0x899aa20 "select count(*) from (select distinct t1.a, t2.b from t t1 t t2 where t1.a = t2.b) as k", length=144179460) at sql_parse.cc:5628 #6 0x08174483 in dispatch_command (command=144179400, thd=0x89800c8, packet=0x89929f1 "select count(*) from (select distinct t1.a, t2.b from t t , t t2 where t1.a = t2.b) as k", packet_length=144288288) at sql_parse.cc:1713 #7 0x08173fcd in do_command (thd=0x89800c8) at sql_parse.cc:1514 #8 0x081733a2 in handle_one_connection (arg=0x89800c8) at sql_parse.cc:1158 #9 0x40042b63 in start_thread () from /lib/tls/libpthread.so.0 #10 0x4024c18a in clone () from /lib/tls/libc.so.6 (gdb)
[30 Mar 2006 15:41]
Heikki Tuuri
This patch over 5.0.18 should fix the bug. It should be applied to 5.1, too. --- sql_cache.cc 2006-03-30 18:41:16.566271888 +0300 +++ sql_cache.cc-15758 2006-03-30 18:40:44.499146832 +0300 @@ -830,6 +830,7 @@ flags.sql_mode, flags.max_sort_length, flags.group_concat_max_len)); + ha_release_temporary_latches(thd); STRUCT_LOCK(&structure_guard_mutex); if (query_cache_size == 0)
[30 Mar 2006 16:30]
Heikki Tuuri
Elliot, I am assigning this to you. You should ask Sanja or some other person who knows the query cache to review this patch and apply it to 5.0 and 5.1. I ran make test on 5.0.18, and it passed (except a few tests that require charset ucs2). I also ran ibtests and: perl run-all-tests --small-test --create-options=engine=innodb no crash of mysqld. The patch simply makes InnoDB to release the adaptive hash index latch before MySQL acquires the query cache mutex. The patch should be quite safe, I do not see how it could break anything. Regards, Heikki
[31 Mar 2006 21:30]
David Paquette
I am currently having the same problem. I am in the process of deploying a system (deadline fast approaching). When will a fix be available for Windows? -David
[3 Apr 2006 23:32]
Charles Tran
Any eta on when this patch will be released in the community editions? Is there a temporary workaround? If I can know what type of queries caused this, I can try to work around the issue.
[4 Apr 2006 1:22]
David Paquette
I set the query cache size to 0 Mb and the problem seems to have stopped. There is an obvious performance hit with this work-around, but it might be a feasible temporary solution, depending on your requirements. -David
[7 Apr 2006 10:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/4622
[7 Apr 2006 15:09]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/4648
[10 Apr 2006 11:10]
Alexander Ivanov
Fixed in 5.0.21 and 5.1.10-beta
[11 Apr 2006 3:04]
Paul DuBois
Noted in 5.0.21, 5.1.10 changelogs. <literal>InnoDB</literal> failure to release an adaptive hash index latch could cause a server crash if the query cache was enabled. (Bug #15758)
[19 Apr 2006 21:31]
Gui Gros
I guess we need to reopen this one. I upgraded to 5.0.20 and the problem is still there. 060419 15:51:06060419 15:51:06 [ERROR] The calling thread is holding the adaptive search, latch though calling innobase_query_caching_of_table_permitted. TRANSACTION 0 5524100, ACTIVE 0 sec, OS thread id 20988 mysql tables in use 2, locked 0 , holds adaptive hash latch MySQL thread id 163, query id 1213689 localhost 127.0.0.1 adminuser1 Sending data select count(*) from ( select distinct users.id, userslogins.login,users.BusinessName, users.firstname,users.lastname,users.id as UserId,sites.name as Site,userslogins.id as LoginId from ( ( ( iislogs left join userslogins on iislogs.LoginId = userslogins.Id ) left join users on userslogins.UserId=users.id ) left join sites on sites.id=users.siteid ) where users.id>0 order by users.siteid,users.businessname ASC ) as t 060419 15:51:06InnoDB: Assertion failure in thread 20988 in file .\ha_innodb.cpp line 992 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html InnoDB: about forcing recovery. InnoDB: Thread 22420 stopped in file .\sync\sync0arr.c line 126 InnoDB: Thread 23288 stopped in file .\os\os0sync.c line 487 InnoDB: Thread 13052 stopped in file .\os\os0sync.c line 211
[19 Apr 2006 21:34]
Gui Gros
OK please ignore my last comment, I guess I'll wait 5.0.21 and see....
[19 Apr 2006 21:34]
Mark Leith
Hi, This was fixed within 5.0.21, not 5.0.20, unfortunately. It should be available within the next release. Best regards Mark