Bug #47151 | Replication failure on highly concurrent DROP DATABASE and other DDL | ||
---|---|---|---|
Submitted: | 5 Sep 2009 16:21 | Modified: | 7 Dec 2009 13:00 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 6.0-codebase | OS: | Any |
Assigned to: | Daogang Qu | CPU Architecture: | Any |
[5 Sep 2009 16:21]
Philip Stoev
[5 Sep 2009 16:23]
Philip Stoev
Grammar for bug 47151
Attachment: bug47151.yy (application/octet-stream, text), 23.85 KiB.
[26 Nov 2009 16:36]
Philip Stoev
bug47151-large.yy
Attachment: bug47151-large.yy (application/octet-stream, text), 30.88 KiB.
[26 Nov 2009 16:38]
Philip Stoev
I just uploaded an MDL-targeted RQG grammar that is particularly productive when it comes to replication failures. To run:
perl runall.pl \
  --gendata=conf/WL5004_data.zz \
  --rpl_mode=row \
  --duration=60 \
  --queries=100K \
  --basedir=/build/bzr/6.0-codebase-bugfixing \
  --mysqld=--log-output=file \
  --grammar=conf/bug47151-large.yy \
  --mem
[26 Nov 2009 16:56]
Philip Stoev
Requesting a re-triage. Since this bug was filed, it was discovered that various replication issues happen at lower concurrencies. Also, the purpose of the MDL locking is to prevent such issues completely, regardless of the concurrency level and the realism of the scenario. Therefore, please give this bug a higher tag and let's use it to figure out the root issue.
[3 Dec 2009 12:45]
Philip Stoev
I meant schema DDL is not protected, so failures on DROP schema are to be expected.
[6 Dec 2009 12:01]
Daogang Qu
DROP DATABASE constructs the list of tables to drop by reading the filesystem directory. Obviously, this has a race: if the contents of the directory change between the scan and the actual DROP, DROP DATABASE will not drop everything, or will try to drop something that may no longer be there. Offending operations include: CREATE/DROP TRIGGER, ALTER TABLE db1.t1 RENAME db2.t2, and other operations that move directory files around. The only solution to the problem is to make sure that DDL operations take a "scoped" lock, which MySQL doesn't do. I.e., DROP TABLE or RENAME TABLE needs to take not only an exclusive lock on the table itself (which it currently does), but also an intention exclusive lock on the database name.
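The "scoped" locking idea above can be sketched as a toy lock manager. This is only an illustration, not MySQL's MDL implementation: the class `LockManager`, the helpers `lock_for_table_ddl` and `lock_for_drop_database`, and the two-mode compatibility table are all hypothetical names made up for this sketch, and locks are granted or refused immediately instead of waiting.

```python
# Hypothetical sketch of "scoped" (hierarchical) locking; NOT MySQL's MDL code.
from collections import defaultdict

# Lock modes: IX = intention exclusive, X = exclusive.
# Two IX locks on the same name are compatible; any pair involving X is not.
COMPATIBLE = {("IX", "IX")}


class LockManager:
    def __init__(self):
        # name -> list of (owner, mode) pairs currently granted
        self.granted = defaultdict(list)

    def try_lock(self, owner, name, mode):
        """Grant the lock only if it is compatible with every other owner's lock."""
        for other, held in self.granted[name]:
            if other != owner and (held, mode) not in COMPATIBLE:
                return False
        self.granted[name].append((owner, mode))
        return True

    def unlock_all(self, owner):
        """Release every lock held by this owner (end of statement)."""
        for name in list(self.granted):
            self.granted[name] = [g for g in self.granted[name] if g[0] != owner]


def lock_for_table_ddl(mgr, owner, db, table):
    """DROP/RENAME TABLE: IX on the database name, then X on the table."""
    if not mgr.try_lock(owner, ("db", db), "IX"):
        return False
    if not mgr.try_lock(owner, ("table", db, table), "X"):
        mgr.unlock_all(owner)  # back out the IX lock taken above
        return False
    return True


def lock_for_drop_database(mgr, owner, db):
    """DROP DATABASE: X on the database name, which conflicts with any IX."""
    return mgr.try_lock(owner, ("db", db), "X")
```

With this scheme, two connections can run table DDL on different tables in `db1` at the same time, because their IX locks on the database name are compatible. A concurrent DROP DATABASE `db1` is refused until both have released their locks, and once it holds the X lock, no table DDL in that database can start, so the directory cannot change between the scan and the drop.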
[7 Dec 2009 5:02]
Daogang Qu
Hi Philip, According to the above root cause, bug#47151 should be closed. Do you agree?
[7 Dec 2009 5:39]
MySQL Verification Team
Daogang, I can't comment on the MDL stuff, but are you saying it's acceptable to have binlog corruption noted in the bug description too?
[7 Dec 2009 7:35]
Daogang Qu
The problem in the bug description will disappear after the above root cause is resolved.
[7 Dec 2009 13:00]
Philip Stoev
I am closing this bug. Will open a new one if a replication failure is observed in MDL-controlled operations.