Bug #46654 | False deadlock on concurrent DML/DDL with partitions, inconsistent behavior | ||
---|---|---|---|
Submitted: | 11 Aug 2009 15:15 | Modified: | 7 Mar 2010 1:58 |
Reporter: | Elena Stepanova | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Locking | Severity: | S3 (Non-critical) |
Version: | 5.4.4-alpha | OS: | Any |
Assigned to: | Jon Olav Hauglid | CPU Architecture: | Any |
Tags: | regression |
[11 Aug 2009 15:15]
Elena Stepanova
[11 Aug 2009 15:25]
Elena Stepanova
For 5.1.37, the behaviour with and without partitions is the same, and looks identical to what happens without partitions on 5.4.4.
[12 Aug 2009 2:11]
Elena Stepanova
Looks similar to bug#46272, but not quite sure whether it's a duplicate due to the different behavior for partitioned tables vs non-partitioned, which the other bug does not mention.
[25 Aug 2009 16:52]
Konstantin Osipov
The fix is to remove tdc_remove_table in sql_partition from two places where they are unnecessary.
[27 Aug 2009 9:25]
Jon Olav Hauglid
Using the test case in the bug description, the only call to tdc_remove_table() comes from the initial "DROP TABLE IF EXISTS tbl_with_partitions"; The two deadlocks are reported from Open_table_context::request_backoff_action() with identical backtraces to the deadlock in Bug#46272.
[27 Aug 2009 11:19]
Jon Olav Hauglid
ALTER TABLE calls prep_alter_part_table() in sql_partition.cc which sets TABLE_SHARE's version = 0. This happens before ALTER TABLE starts waiting for the shared lock to disappear inside upgrade_shared_lock_to_exclusive(). The following INSERT will notice that share->version now is different from refresh_version and call Open_table_context::request_backoff_action() which triggers the DEADLOCK error.
[17 Sep 2009 7:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/83557 2822 Jon Olav Hauglid 2009-09-17 Bug #46654 False deadlock on concurrent DML/DDL with partitions, inconsistent behavior This is a preliminary version of the patch. The problem was that if one connection is running a multi-statement transaction which involves a single partitioned table, and another connection attempts to alter the table, the first connection gets ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE statement in another connection is timed out (which happens after innodb_lock_wait_timeout seconds). The reason for this was that the prepare phase for ALTER TABLE for partitioned tables removed the table from the table definition cache before it started waiting on the lock. The transaction running in the first connection would notice this and report ER_LOCK_DEADLOCK. The removal was unnecessary as it also is done at a later point when the lock has been taken. This patch changes the ALTER TABLE code so that tdc_remove_table() is only called when the prepare phase fails. Test case added in bug46654.test. The test will be merged into an existing file before the final commit/push.
[22 Sep 2009 8:31]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/84057 2830 Jon Olav Hauglid 2009-09-22 Bug #46654 False deadlock on concurrent DML/DDL with partitions, inconsistent behavior The problem was that if one connection is running a multi-statement transaction which involves a single partitioned table, and another connection attempts to alter the table, the first connection gets ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE statement in another connection has timed out or failed. The reason for this was that the prepare phase for ALTER TABLE for partitioned tables removed all instances of the table from the table definition cache before it started waiting on the lock. The transaction running in the first connection would notice this and report ER_LOCK_DEADLOCK. This patch changes the prep_alter_part_table() ALTER TABLE code so that tdc_remove_table() is no longer called. Instead, only the TABLE instance changed by prep_alter_part_table() is marked as needing reopen. The patch also removes an unnecessary call to tdc_remove_table() from mysql_unpack_partition() as the changed TABLE object is destroyed by the caller at a later point. Test case added in partition_sync.test.
[22 Sep 2009 11:45]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/84112 2832 Jon Olav Hauglid 2009-09-22 Bug #46654 False deadlock on concurrent DML/DDL with partitions, inconsistent behavior The problem was that if one connection is running a multi-statement transaction which involves a single partitioned table, and another connection attempts to alter the table, the first connection gets ER_LOCK_DEADLOCK and cannot proceed anymore, even when the ALTER TABLE statement in another connection has timed out or failed. The reason for this was that the prepare phase for ALTER TABLE for partitioned tables removed all instances of the table from the table definition cache before it started waiting on the lock. The transaction running in the first connection would notice this and report ER_LOCK_DEADLOCK. This patch changes the prep_alter_part_table() ALTER TABLE code so that tdc_remove_table() is no longer called. Instead, only the TABLE instance changed by prep_alter_part_table() is marked as needing reopen. The patch also removes an unnecessary call to tdc_remove_table() from mysql_unpack_partition() as the changed TABLE object is destroyed by the caller at a later point. Test case added in partition_sync.test.
[22 Sep 2009 11:46]
Jon Olav Hauglid
Pushed to mysql-6.0-codebase-bugfixing (version 6.0.14-alpha).
[23 Sep 2009 6:40]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/84242 2834 Jon Olav Hauglid 2009-09-23 Followup to Bug#46654 False deadlock on concurrent DML/DDL with partitions, inconsistent behavior Partition_sync.test uses features only available in debug builds. Disabling the test for non-debug builds.
[30 Sep 2009 8:17]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20090929093622-1mooerbh12e97zux) (version source revid:alik@sun.com-20090923103200-kyo2bakdo6tfb2fb) (merge vers: 6.0.14-alpha) (pib:11)
[1 Oct 2009 0:08]
Paul DuBois
Noted in 6.0.14 changelog. Deadlock occurred if one session was running a multiple-statement transaction that involved a single partitioned table and another session attempted to alter the table. Setting report to NDI pending push into 5.4.x.
[20 Nov 2009 12:54]
Bugs System
Pushed into 5.6.0-beta (revid:davi.arnaut@sun.com-20091119234808-xbjpkwaxjt5x5c0b) (version source revid:jon.hauglid@sun.com-20090928163426-2lg1gofzz44xzzxf) (merge vers: 6.0.14-alpha) (pib:13)
[20 Nov 2009 12:57]
Bugs System
Pushed into 6.0.14-alpha (revid:kostja@sun.com-20091120124947-yi6h2jbgw0kbciwm) (version source revid:jon.hauglid@sun.com-20090928163426-2lg1gofzz44xzzxf) (merge vers: 6.0.14-alpha) (pib:13)
[20 Nov 2009 23:41]
Paul DuBois
Noted in 5.6.0 changelog.
[10 Dec 2009 13:42]
Jon Olav Hauglid
Pushed to mysql-next-4284 (5.6.0-beta).
[16 Jan 2010 23:12]
Mattias Jonsson
This also affects 5.1, see bug#49881 (I closed that as a duplicate of this). Is it possible to backport this to 5.1?
[21 Jan 2010 10:39]
Jon Olav Hauglid
The patch depends on several other changesets that are not in 5.1, so a backport would not be straight-forward.
[21 Jan 2010 19:38]
Nicolas Brousse
What about a port to 5.5 ?
[22 Jan 2010 13:16]
Jon Olav Hauglid
Yes, this fix will be part of a 5.5 release.
[16 Feb 2010 16:44]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100216101445-2ofzkh48aq2e0e8o) (version source revid:kostja@sun.com-20091211154405-c9yhiewr9o5d20rq) (merge vers: 6.0.14-alpha) (pib:16)
[16 Feb 2010 16:54]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100216101208-33qkfwdr0tep3pf2) (version source revid:kostja@sun.com-20091211103945-198h3pt8w7ypk20u) (pib:16)
[17 Feb 2010 0:55]
Paul DuBois
Setting report to Need Merge pending push of Celosia into release tree.
[6 Mar 2010 10:54]
Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20100216221947-luyhph0txl2c5tc8) (merge vers: 5.5.99-m3) (pib:16)
[7 Mar 2010 1:58]
Paul DuBois
Moved 5.6.0 changelog entry to 5.5.3