Bug #39508 | INSERT queries hang indefinitely on AMD64, again | ||
---|---|---|---|
Submitted: | 18 Sep 2008 2:29 | Modified: | 6 Apr 2010 12:08 |
Reporter: | Eric Jensen | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: General | Severity: | S1 (Critical) |
Version: | 5.0.67-x86-64 | OS: | Linux (Debian Linux 4.0 x86_64) |
Assigned to: | | CPU Architecture: | Any |
[18 Sep 2008 2:29]
Eric Jensen
[18 Sep 2008 17:02]
Sveta Smirnova
Thank you for the report. Which version of GLIBC do you use? Could you also please install, on one of the slaves, version 5.0.67 as built by the MySQL build team (available from http://dev.mysql.com/downloads/mysql/5.0.html#downloads)? I want to check if the problem is repeatable with MySQL's binaries as well as with Debian binaries.
[18 Sep 2008 17:07]
Eric Jensen
we use the latest debian etch security update with libc6 2.3.6.ds1-13etch7.
i will install your build on one box...mind if i use the intel compiler one?
[18 Sep 2008 17:18]
Sveta Smirnova
Thank you for the feedback.
> i will install your build on one box...mind if i use the intel compiler one?
No, I don't mind. The Intel compiler one should be fine.
[7 Oct 2008 15:39]
Eric Jensen
We had this happen again on two boxes. We gathered better stats from one of them: w9, which had delay_key_write=ALL, ASYNC_QUERY_CONCURRENCY=4 and the debian 5.0.51a-9-log build.
about four hours later, w7 hung on an insert too. it had delay_key_write=OFF, ASYNC_QUERY_CONCURRENCY=0 and the debian 5.0.67-0.dotdeb.1-log build.
interestingly, w11, which had the intel mysql 5.0.67 build, had no problem, and w12, which had the debian 5.0.51a-12~bpo40+1-log build AND skip-concurrent-insert, had no problem.
although it's not much to go on, i would tentatively say this is therefore a problem in the debian build when concurrent inserts are turned on (that is, without skip-concurrent-insert), which would be consistent with http://bugs.mysql.com/bug.php?id=8555
perhaps you guys will find something in the post mortem info i will post to corroborate this.
[21 Oct 2008 18:58]
Eric Jensen
We had this happen again on three of our slaves last night. All of them had concurrent inserts turned on; the ones with it off did not have the problem. Two of them had debian mysql builds. But, in an interesting twist, the third one had the tarball build directly from you guys: mysql-5.0.67-linux-x86_64-icc-glibc23.tar.gz
So apparently this is not a problem with debian builds. But it does indeed appear to be a problem with concurrent inserts...we have now transitioned all of our hosts, except the backup one which has no read traffic, to skip-concurrent-insert.
[21 Oct 2008 19:00]
Eric Jensen
oh, also interesting was that a select in a "killed" state was caught hung too this time, and we could do "show global status" on this intel build without that hanging everything like it did previously.
[27 Oct 2008 16:14]
Eric Jensen
We just had this happen again on w11, but with skip-concurrent-insert turned on this time! I guess that theory is out the window. This was with the debian build of 5.0.67.
[7 Nov 2008 22:49]
Eric Jensen
digging through the stack traces we provided, it appears this could have something to do with the query cache? we have disabled it and await the next disaster
[28 Nov 2008 6:56]
Eric Jensen
We have been running for a few weeks now with the query cache disabled and have not run into the problem again. Given this and that I see the query cache in the stack traces I provided, it seems like there is indeed some deadlock potential in the query cache somewhere. It is probably worth someone more familiar with it reading through those traces.
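For anyone repeating this workaround, a minimal Python sketch of a check that the query cache is effectively off on a given host. It assumes the output of `SHOW VARIABLES LIKE 'query_cache%'` has already been fetched by some client and turned into a dict; the function name `query_cache_disabled` is hypothetical and not part of any MySQL tool.

```python
# Sketch only: decide from SHOW VARIABLES output whether the query cache is off.
# Assumption: `variables` maps variable names to the string values that
# SHOW VARIABLES LIKE 'query_cache%' returned; fetching them is out of scope.


def query_cache_disabled(variables):
    """Return True if query_cache_type is 0/OFF or query_cache_size is 0."""
    qc_type = str(variables["query_cache_type"]).strip().upper()
    qc_size = int(variables["query_cache_size"])
    return qc_type in {"0", "OFF"} or qc_size == 0


# Made-up example values: the cache type is OFF, the size is still 384M.
example = {"query_cache_type": "OFF", "query_cache_size": "402653184"}
print(query_cache_disabled(example))
```

Either condition is treated as "disabled" here, since a zero-sized cache stores nothing even when the type is ON.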
[5 Jan 2009 12:28]
Davi Arnaut
MySQL 5.0.68 fixes a query cache deadlock with similar trace. It would be interesting to test whether the problem is present on 5.0.68.
[5 Jan 2009 17:20]
Eric Jensen
We haven't run 5.0.68. I leave it to you to determine whether this was the same problem, as we don't plan on re-enabling the query cache to test. Thanks!
[9 Jan 2009 22:57]
Davi Arnaut
Do you guys use MyISAM's merge tables?
[9 Jan 2009 23:44]
Eric Jensen
nope
[19 Jan 2009 16:39]
Stéphane Queraud
We have the same problem on freebsd 7, amd64. first we thought it came from the 5.1 version, so we downgraded to 5.0.67, and the problem still occurs. currently the query_cache is enabled, and we're waiting for the next hang/crash. next time we'll restart with query_cache OFF. do you know if this deadlock bug fix has been included in the 5.1.30 version?
[21 Jan 2009 14:17]
Stéphane Queraud
finally we managed to make it stable, for 24 hours now. query_cache is still enabled; what we changed is: concurrent_insert = 0, and reduced table_cache to 256.
[21 Jan 2009 15:47]
Davi Arnaut
Stéphane, it would be interesting to run those tests on 5.0.68, as it contains a fix for a query cache deadlock. If the server deadlocks, please try to get a core file or backtrace from it.
[21 Jan 2009 17:23]
Eric Jensen
Stéphane, be careful with that. We turned off concurrent inserts and the probability of the deadlock seemed to go down but we still ran into one...you can read through everything we tried in this bug's comments.
[8 Nov 2009 19:20]
Valeriy Kravchuk
I see the following in the uploaded my.cnf:
query_cache_size = 384M
Please check if the problem is repeatable with the query cache disabled.
[8 Nov 2009 20:42]
Eric Jensen
We have not encountered this again with query_cache_type = 0
[8 Nov 2009 20:53]
Valeriy Kravchuk
Then I am wondering if this can be related to http://bugs.mysql.com/bug.php?id=43758. Please check, for example, whether the SQL thread and/or other threads are in the 'freeing items' state when the hang happens.
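A rough Python sketch of the check suggested here: scan rows captured from "show processlist" for threads sitting in a suspect state for a long time. It assumes the rows were already fetched and are dicts keyed by the usual processlist column names; the helper `find_stuck_threads`, the 60-second threshold, and the sample rows are all made up for illustration.

```python
# Sketch only: flag threads that look stuck in captured SHOW PROCESSLIST rows.
# Assumption: each row is a dict keyed by processlist column names
# (Id, Command, Time, State, Info); how the rows are fetched is out of scope.

SUSPECT_STATES = {"freeing items"}  # the state named in the comment above
MIN_SECONDS = 60  # hypothetical threshold, tune for your workload


def find_stuck_threads(rows, suspect_states=SUSPECT_STATES, min_seconds=MIN_SECONDS):
    """Return rows whose State is suspect and whose Time is at or above the threshold."""
    stuck = []
    for row in rows:
        state = (row.get("State") or "").strip().lower()
        seconds = int(row.get("Time") or 0)
        if state in suspect_states and seconds >= min_seconds:
            stuck.append(row)
    return stuck


# Made-up rows, not taken from the post-mortem data attached to this bug.
sample_rows = [
    {"Id": 1, "Command": "Query", "Time": 5, "State": "freeing items", "Info": "INSERT ..."},
    {"Id": 2, "Command": "Query", "Time": 900, "State": "freeing items", "Info": "INSERT ..."},
    {"Id": 3, "Command": "Sleep", "Time": 1200, "State": "", "Info": None},
]

for row in find_stuck_threads(sample_rows):
    print(row["Id"], row["State"], row["Time"])
```

To look for other states, such as the "Connect" state mentioned later in this thread, pass a different `suspect_states` set.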
[8 Nov 2009 21:09]
Eric Jensen
After the permutations we went through, it does seem this was related to the query cache. However, I can't say whether it is a duplicate of that other bug or not. I did go through the post-mortem info we attached to this ticket and saw that one of them had the output of "show processlist". The hung replication insert thread is in the "Connect" state.
[6 Mar 2010 12:08]
Sveta Smirnova
Thank you for the feedback. If you think this is related to the query cache, it would be good if you tested it with version 5.0.90, where one of the bugs was fixed, or even with 5.1.44, where both bugs are fixed. Please consider this possibility and let us know about the results.
[6 Apr 2010 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".