Bug #38186 | falcon_bug_31295 fails on pushbuild | ||
---|---|---|---|
Submitted: | 16 Jul 2008 22:04 | Modified: | 9 Oct 2008 13:02 |
Reporter: | Vladislav Vaintroub | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Falcon storage engine | Severity: | S3 (Non-critical) |
Version: | 6.0 | OS: | Any |
Assigned to: | Vladislav Vaintroub | CPU Architecture: | Any |
Tags: | pushbuild, test failure |
[16 Jul 2008 22:04]
Vladislav Vaintroub
[17 Jul 2008 7:21]
John Embretsen
This has happended before in Pushbuild: 2008-06-30 17:10:00 bzr_mysql-6.0-falcon 36 (Kevin_Lewis__klewis-20080630171000-1), 'vm-win2003-32-a' Win32 VS2003 -max-nt. I have also seen this happen intermittently on a non-Pushbuild Linux host, when running the full falcon suite.
[17 Jul 2008 20:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/49982 2752 Vladislav Vaintroub 2008-07-17 Bug#38186 : CREATE TABLESPACE fails sometimes if called directly after DROP TABLESPACE on the same file The problem is that tablespace files are not deleted directly on client COMMIT. Deletion is done later, by background(Gopher) thread. Thus the file may still exist when "create tablespace" is called. The solution is to to modify CREATE TABLESPACE and if the file is there and there are pending DROP operation, wait for some seconds until the file is deleted. An attempt to solve this problem was done previously, but waiting for deletion was put on the wrong place. Client thread was waiting for gopher in TablespaceManager::dropTablespace, which can caused a "soft-deadlock" : - Client thread was holding syncSysDDL lock and was waiting for Gopher to delete file - Gopher did not run, was stuck in thread->sleep and waited for Scheduler to wakeup - Scheduler was doing scavenge and was stuck waiting for syncSysDDL lock in Database:updateCardinalities() This soft-deadlock was resolved after 10 seconds, client gave up and returned with "tablespace file already exists". The refinement to this solution is given here . Client waits for file to be deleted in StorageHandler::createTablespace while no Falcon locks are held. Thus the soft-deadlock described is not possible with this solution
[17 Jul 2008 20:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/49984 2752 Vladislav Vaintroub 2008-07-17 Bug#38186 : CREATE TABLESPACE fails sometimes if called directly after DROP TABLESPACE on the same file The problem is that tablespace files are not deleted directly on client COMMIT. Deletion is done later, by background(Gopher) thread. Thus the file may still exist when "create tablespace" is called. The solution is to to modify CREATE TABLESPACE and if the file is there and there are pending DROP operation, wait for some seconds until the file is deleted. An attempt to solve this problem was done previously, but waiting for deletion was put on the wrong place. Client thread was waiting for gopher in TablespaceManager::dropTablespace, which can caused a "soft-deadlock" : - Client thread was holding syncSysDDL lock and was waiting for Gopher to delete file - Gopher did not run, was stuck in thread->sleep and waited for Scheduler to wakeup - Scheduler was doing scavenge and was stuck waiting for syncSysDDL lock in Database:updateCardinalities() This soft-deadlock was resolved after 10 seconds, client gave up and returned with "tablespace file already exists". The refinement to this solution is given here . Client waits for file to be deleted in StorageHandler::createTablespace while no Falcon locks are held. Thus the soft-deadlock described is not possible with this solution
[7 Aug 2008 18:14]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51131 2769 Vladislav Vaintroub 2008-08-07 Bug#38186 Problem: Waiting for tablespace to be dropped has no effect. Cause: We are checking the the existence of wrong file (tableSpaceName) Solution: Change first parameter in waitForPendingDrops to be filename, not tablespace name.
[7 Aug 2008 18:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51132 2769 Vladislav Vaintroub 2008-08-07 Bug#38186 Problem: Waiting for tablespace to be dropped has no effect. Cause: We are checking the the existence of wrong file (tableSpaceName) Solution: Change first parameter in waitForPendingDrops to be filename, not tablespace name.
[7 Aug 2008 18:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/51133 2769 Vladislav Vaintroub 2008-08-07 Bug#38186 Problem: Waiting for tablespace to be dropped has no effect. Cause: We are checking the the existence of wrong file (tableSpaceName) Solution: Change first parameter in waitForPendingDrops to be filename, not tablespace name.
[8 Aug 2008 11:33]
John Embretsen
Thanks, Vlad, the latest patch seems to "fix" the issue way better than the previous attempts! I tried out the fix on a Linux host where this test has been known to fail almost consistently every second to third time it is run. In the falcon-team branch, I saw no failures even when running falcon_bug_31295 100 times in a row, nor when running the falcon suite. Without this fix, this particular test failed in almost every run of the falcon suite. On a side note, it is interesting how every third run of the test is much slower than the other runs, e.g.: falcon.falcon_bug_31295 [ pass ] 1302 falcon.falcon_bug_31295 [ pass ] 305 falcon.falcon_bug_31295 [ pass ] 304 falcon.falcon_bug_31295 [ pass ] 1296 falcon.falcon_bug_31295 [ pass ] 298 falcon.falcon_bug_31295 [ pass ] 346 falcon.falcon_bug_31295 [ pass ] 1288 falcon.falcon_bug_31295 [ pass ] 286 falcon.falcon_bug_31295 [ pass ] 310 falcon.falcon_bug_31295 [ pass ] 1291 I also applied the patch to the mysql-6.0-falcon branch, where I've seen similar failures with a new test, falcon_tablespace_priv, on this particular host. There, I did not see any of these tests failing even when running them 100 times in a row. When running the falcon suite, falcon_bug_31295 test failed once (after the 10 second timeout), but this is not easily reproducible (it passes most of the time). The latter case is probably just a matter of adjusting the timeout value to accomodate really slow or busy machines.
[22 Aug 2008 21:31]
Kevin Lewis
This fix is in version 6.0.7
[28 Aug 2008 20:17]
Bugs System
Pushed into 6.0.7-alpha (revid:cbell@mysql.com-20080822132131-uveo6wiuecy6m2b8) (version source revid:cbell@mysql.com-20080822132131-uveo6wiuecy6m2b8) (pib:3)
[14 Sep 2008 2:09]
Bugs System
Pushed into 6.0.6-alpha (revid:vvaintroub@mysql.com-20080717203845-q5bo9rdpbigm1nat) (version source revid:hakan@mysql.com-20080716160034-sdexuyp3qow7zlc6) (pib:3)
[9 Oct 2008 13:02]
Jon Stephens
Documented in the 6.0.7 changelog as follows: CREATE TABLESPACE failed when invoked immediately following a DROP TABLESPACE statement that used the same tablespace name.
[21 Oct 2008 20:06]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/56749 2875 Vladislav Vaintroub 2008-10-21 Bug#39789: Falcon recovery failure after several CREATE + DROP TABLESPACE Solved in this patch: -Drop tablespace in REDO phase removed datafiles without taking into account that some of them should stay after the recovery. In this case, even if metadata for tablespace would be present after recovery, the data is lost together with the file. Typical scenario for such failed recovery would be something like create tablespace A add datafile 'f' engine falcon; drop tablespace A; create tablespace B add datafile 'f' engine falcon; create table T tablespace B; Recovery for "drop tablespace A" would delete file used by tablespace B and invalidate table T. Solution for this problem is to calculate whether datafiles should stay after recovery or can be deleted. Calculation done in the pass1 of recovery, deletion in the REDO phase. While fixing this, other smaller problems were fixed - failure to recover, when the database crashed after datafile had been created but before header was written. - redoCreateTableSpace did not attempt to create tablespaces, but only opened existing ones. - fixed a long-standing problem that tablespace file was present even after "drop tablespace" successfully completed. This was previously worked around in Bug#38186
[30 Oct 2008 13:51]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/57456 2889 Vladislav Vaintroub 2008-10-30 Fixed a long-standing problem that tablespace file was present even after "drop tablespace" successfully completed. This problem was previously worked around in Bug#38186.
[30 Oct 2008 14:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/57464 2889 Vladislav Vaintroub 2008-10-30 Fixed a long-standing problem that tablespace file was present even after "drop tablespace" successfully completed. This problem was previously worked around in Bug#38186.
[31 Oct 2008 0:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/57515 2892 Vladislav Vaintroub 2008-10-31 Bug #38186: CREATE TABLESPACE failed when invoked immediately following a DROP TABLESPACE statement that used the same tablespace name. Fixed a long-standing problem that tablespace file was present even after "drop tablespace" successfully completed. The problem was worked around previously, this patch contains refined solution. Previously, file was closed and deleted when the gopher thread processed dropTableSpace record. Now the file is removed earlier, in the client thread . Gopher thread only closes the file descriptor later. This technique ensures that - datafile does not exist after "drop tablespace" is completed - file descriptor remains valid as long as it can be used by commited , write-incomplete transactions. On Windows, removing an open file requires FILE_SHARE_DELETE flag for CreateFile. Also file has to be renamed to a unique name before deletion. This functionality is implemented in posix-alike winOpen() and winUnlink() functions This patch also contains missing test case for Bug#38186.
[20 Nov 2008 12:56]
Bugs System
Pushed into 6.0.8-alpha (revid:vvaintroub@mysql.com-20081031002913-hg55ea1xjrimzx25) (version source revid:vvaintroub@mysql.com-20081031002913-hg55ea1xjrimzx25) (pib:5)