Bug #19524 | MySQL deadlocks | ||
---|---|---|---|
Submitted: | 4 May 2006 1:30 | Modified: | 13 Jul 2006 15:25 |
Reporter: | Alan Kasindorf | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 5.0.21-max | OS: | Linux (Debian x86_64) |
Assigned to: | | CPU Architecture: | Any |
[4 May 2006 1:30]
Alan Kasindorf
[5 May 2006 21:31]
Mark Leith
Hi,

Unfortunately we do not currently have a Debian x86_64 machine to test against, and we also need a complete test case to run. Is it always just the *first* thread that causes the hang, or can it randomly be *any* thread?

We have seen certain cases where we seem to have a problem when using the NPTL threading library (rogue hung threads), and where switching to LinuxThreads has fixed the issue. You can check which threading library is in use with:

    getconf GNU_LIBPTHREAD_VERSION

You can try to force it to LinuxThreads with:

    export LD_ASSUME_KERNEL=2.4.0

You will need to restart MySQL for this to take effect (and should add it near the start of your startup script).

Also check the notes here: http://hashmysql.org/index.php?title=Opteron_HOWTO

There was also one possible glibc bug: https://launchpad.net/distros/ubuntu/+source/glibc/+bug/18012

So please also let us know your glibc version. If you could come up with a more definitive test case, that would be great.

Look forward to hearing from you.

Best regards
Mark
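The two pieces of information requested above (threading library and glibc version) can be collected in one go. The following is only a minimal sketch, assuming a glibc-based system whose `getconf` understands `GNU_LIBPTHREAD_VERSION` and `GNU_LIBC_VERSION`; the helper name `report_var` is made up for illustration and prints "unknown" when a value cannot be read.

```shell
#!/bin/sh
# Report the threading library and glibc version through getconf.
# report_var is a hypothetical helper, not part of any tool mentioned above.
report_var() {
    # $1 is a getconf variable name, e.g. GNU_LIBPTHREAD_VERSION.
    # Errors are silenced so a missing variable or missing getconf
    # yields an empty value instead of aborting the script.
    value=$(getconf "$1" 2>/dev/null) || value=""
    printf '%s\n' "${value:-unknown}"
}

threading=$(report_var GNU_LIBPTHREAD_VERSION)
libc=$(report_var GNU_LIBC_VERSION)

echo "threading library: $threading"
echo "glibc version:     $libc"
```

Running this in the same environment that starts mysqld (for example, from the startup script) shows what the server itself would see, which matters if `LD_ASSUME_KERNEL` is set there.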
[5 May 2006 21:54]
Alan Kasindorf
It is using NPTL 0.40, which is an incredibly old version. Next week I will install an upgraded version of glibc and try to reproduce the bug with both the old and the new glibc, then I will update the bug report with further information. Thanks!
[12 May 2006 11:33]
Valeriy Kravchuk
Please reopen this bug report when you have the results of your tests.
[22 May 2006 17:57]
Alan Kasindorf
glibc was the latest in Debian sarge, which ran NPTL 0.4.0 (libc6 2.3.2.ds1-22). I upgraded glibc to the latest from Debian etch, version 2.3.6-3. After doing this the MyISAM deadlock went away; I have not been able to reproduce a deadlock using MyISAM tables.

However, I can still deadlock a *busy* server by running:

    FLUSH TABLES WITH READ LOCK;
    SHOW MASTER STATUS;
    UNLOCK TABLES;

all within a couple of seconds. It looks like *any* queries which entered "Waiting for readlock" mode during that short time will *never* exit that mode. The server exhibits the same issues as before and has to be force restarted. New queries that come in after the unlocking appear to work, though. All queries which have been deadlocked are running against InnoDB tables. No errors appear in the error log. When I do the same on a slave system which is just running replication, it will not lock up.

Running:

    export LD_ASSUME_KERNEL=2.4.0

then:

    getconf GNU_LIBPTHREAD_VERSION

displays an error:

    getconf: error while loading shared libraries: libc.so.6: cannot open shared object file: No such file or directory

Think I'm being an idiot again...

If I can reproduce the crash on a slave I'll run gdb on it and provide a trace. Keep in mind that, for all intents and purposes, the parts of the system that MySQL should be touching are as new as, or newer than, Debian testing (custom vanilla kernel 2.6.16.6 and the latest etch glibc). I'm unsure what other old crappy part of my OS could be causing thread hangups like this.
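For anyone trying to reproduce this, the lock sequence above can be packaged as a small script. This is a rough sketch rather than a confirmed test case: the function name `repro_sql`, the `root` account, and the client invocations in the comments are assumptions, and the hang was only reported on a busy master. The three statements have to run on one connection, because the global read lock is held by the session that took it.

```shell
#!/bin/sh
# Emit the lock/unlock sequence from the report as a single SQL script.
# repro_sql is a hypothetical helper name used only for this sketch.
repro_sql() {
    cat <<'EOF'
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
UNLOCK TABLES;
EOF
}

# Assumed usage (not executed here): send all three statements through
# one mysql client session, then look at the process list for threads
# that are still waiting on the read lock.
#   repro_sql | mysql -u root -p
#   mysql -u root -p -e 'SHOW FULL PROCESSLIST'
```

Keeping the statements in one function makes it easy to run the sequence repeatedly under load while watching the process list from a second connection.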
[24 May 2006 14:34]
Mark Leith
Hi Alan, OK, it looks like you may have hit another bug that we have seen with the NPTL threading library, which I believe had a fix committed today. Namely Bug #20048, found and fixed by Monty very recently: http://bugs.mysql.com/bug.php?id=20048 Regards Mark
[13 Jun 2006 15:25]
Valeriy Kravchuk
As bug #20048 is fixed in 5.0.23, please either try to build from current sources or wait for 5.0.23 to be officially released, and reopen this report if the deadlocks described still occur.
[13 Jul 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".