Bug #50998 | Deadlock in MDL code during test rqg_mdl_stability | ||
---|---|---|---|
Submitted: | 8 Feb 2010 15:00 | Modified: | 7 Mar 2010 1:01 |
Reporter: | John Embretsen | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Locking | Severity: | S2 (Serious) |
Version: | mysql-next-4284 | OS: | Any |
Assigned to: | Dmitry Lenev | CPU Architecture: | Any |
Tags: | pushbuild, rqg_pb2, test failure |
[8 Feb 2010 15:00]
John Embretsen
[8 Feb 2010 15:07]
John Embretsen
GDB backtrace from all threads, linux x86
Attachment: bug50998_stacktrace-all-threads_linux.txt (text/plain), 32.46 KiB.
[10 Feb 2010 14:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/99830

3088 Dmitry Lenev 2010-02-10
Fix for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability".

When the start of a statement's wait on a metadata lock created more than one loop in the waiters graph, the server might have entered a deadlock condition.

The problem was that in the case described above the MDL deadlock detector had to perform several searches for deadlock but forgot to reset Deadlock_detection_context before performing a new search. Failure to do so broke the assumption, in the code responsible for choosing a victim, that if Deadlock_detection_context::victim is set we also hold a read lock on m_waiting_for_lock for this context. As a result, this lock could have been unlocked more times than it was acquired, which corrupted the rwlock's state and led to a server deadlock.

This fix ensures that such a reset is done before each attempt to find a deadlock.

@ mysql-test/r/mdl_sync.result
Added test for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability" as well as coverage for the case when the addition of a statement waiting for a metadata lock adds several loops to the waiters graph, so that several searches for deadlock have to be performed by the MDL deadlock detector.

@ mysql-test/t/mdl_sync.test
Added test for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability" as well as coverage for the case when the addition of a statement waiting for a metadata lock adds several loops to the waiters graph, so that several searches for deadlock have to be performed by the MDL deadlock detector.

@ sql/mdl.cc
Ensure that in cases when the MDL deadlock detector has to perform several searches for deadlock, because several loops in the waiters graph are possible, we reset Deadlock_detection_context before performing each search. Failure to do so broke the assumption, in the code responsible for choosing a victim, that if Deadlock_detection_context::victim is set we also hold a read lock on m_waiting_for_lock for this context. As a result, this lock could have been unlocked more times than it was acquired, which corrupted the rwlock's state (no one was able to acquire a write lock on it anymore).
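The failure mode described in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code from sql/mdl.cc: the names DetectionContext, find_loop_closer, find_deadlock and resolve_deadlocks are hypothetical, the waiters graph is a plain adjacency list, and the read lock on m_waiting_for_lock is modelled as an integer balance per context. The victim-selection rule (the context whose wait closes the loop) is also an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// graph[i] lists the contexts that context i is waiting for (hypothetical model).
using Graph = std::vector<std::vector<int>>;

// Hypothetical stand-in for the per-search state; -1 means "no victim chosen".
struct DetectionContext {
    int victim = -1;
    void reset() { victim = -1; }
};

// Depth-first search: returns the context whose wait leads back to `start`,
// or -1 if no loop through `start` is reachable from `node`.
static int find_loop_closer(const Graph& g, int node, int start,
                            std::vector<bool>& visited) {
    for (int next : g[node]) {
        if (next == start) return node;
        if (!visited[next]) {
            visited[next] = true;
            int closer = find_loop_closer(g, next, start, visited);
            if (closer != -1) return closer;
        }
    }
    return -1;
}

// One search. On success it records a victim and takes the modelled read lock.
static bool find_deadlock(const Graph& g, int start, DetectionContext& ctx,
                          std::vector<int>& read_locks) {
    std::vector<bool> visited(g.size(), false);
    visited[start] = true;
    int closer = find_loop_closer(g, start, start, visited);
    if (closer == -1) return false;
    ctx.victim = closer;
    ++read_locks[closer];
    return true;
}

// Repeats the search until no loop through `start` remains. Returns the lowest
// read-lock balance seen; a negative value means a lock was released more
// times than it was acquired.
static int resolve_deadlocks(Graph g, int start, bool reset_each_search) {
    assert(start >= 0 && static_cast<std::size_t>(start) < g.size());
    std::vector<int> read_locks(g.size(), 0);
    DetectionContext ctx;
    int min_balance = 0;
    for (;;) {
        if (reset_each_search) ctx.reset();  // the step the fix adds
        bool found = find_deadlock(g, start, ctx, read_locks);
        if (ctx.victim != -1) {
            // Assumes "victim set" implies "read lock held", as in the report.
            --read_locks[ctx.victim];
            min_balance = std::min(min_balance, read_locks[ctx.victim]);
        }
        if (!found) break;
        g[ctx.victim].clear();  // the victim abandons its wait
    }
    return min_balance;
}
```

With a graph in which context 0 takes part in two loops (0 waits for 1 and 2, and both wait for 0), calling resolve_deadlocks with reset_each_search set to false leaves a stale victim in place for the final, unsuccessful search, so the modelled lock is released once more than it was acquired. With the reset enabled, the balance never drops below zero.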
[10 Feb 2010 15:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
http://lists.mysql.com/commits/99842

3088 Dmitry Lenev 2010-02-10
Fix for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability".

When the start of a statement's wait on a metadata lock created more than one loop in the waiters graph, the server might have entered a deadlock condition.

The problem was that in the case described above the MDL deadlock detector had to perform several searches for deadlock but forgot to reset Deadlock_detection_context before performing a new search. Failure to do so broke the assumption, in the code responsible for choosing a victim, that if Deadlock_detection_context::victim is set we also hold a read lock on m_waiting_for_lock for this context. As a result, this lock could have been unlocked more times than it was acquired, which corrupted the rwlock's state and led to a server deadlock.

This fix ensures that such a reset is done before each attempt to find a deadlock.

@ mysql-test/r/mdl_sync.result
Added test for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability" as well as coverage for the case when the addition of a statement waiting for a metadata lock adds several loops to the waiters graph, so that several searches for deadlock have to be performed by the MDL deadlock detector.

@ mysql-test/t/mdl_sync.test
Added test for bug #50998 "Deadlock in MDL code during test rqg_mdl_stability" as well as coverage for the case when the addition of a statement waiting for a metadata lock adds several loops to the waiters graph, so that several searches for deadlock have to be performed by the MDL deadlock detector.

@ sql/mdl.cc
Ensure that in cases when the MDL deadlock detector has to perform several searches for deadlock, because several loops in the waiters graph are possible, we reset Deadlock_detection_context before performing each search. Failure to do so broke the assumption, in the code responsible for choosing a victim, that if Deadlock_detection_context::victim is set we also hold a read lock on m_waiting_for_lock for this context. As a result, this lock could have been unlocked more times than it was acquired, which corrupted the rwlock's state (no one was able to acquire a write lock on it anymore).
[10 Feb 2010 15:49]
Dmitry Lenev
The fix for this bug was pushed into the mysql-next-4284 tree. Since this issue was not repeatable outside of this tree, which is not publicly available, there is nothing to document. So I am simply closing this report.
[11 Feb 2010 14:11]
John Embretsen
The test no longer deadlocks on Linux, so it looks like the fix was good. However, the same test still deadlocks on Windows, so that may be a different issue. I intend to file a new bug for that once tests have been run for today's pushes. The test does not deadlock the same way on Solaris (SPARC), but instead crashes with what seems like some kind of infinite loop in MDL_Lock::find_deadlock(); see http://bugs.mysql.com/bug.php?id=51093 (reported today).
[11 Feb 2010 15:08]
John Embretsen
Windows deadlock now reported as http://bugs.mysql.com/bug.php?id=51105.
[16 Feb 2010 16:46]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:jon.hauglid@sun.com-20100211140522-unpky24gmq8fkhhj) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:55]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:dlenev@mysql.com-20100210154603-9hux05vnrgxonb9t) (pib:16)
[6 Mar 2010 11:00]
Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:01]
Paul DuBois
No changelog entry needed.