Bug #39789 Falcon recovery failure after several CREATE + DROP TABLESPACE
Submitted: 1 Oct 2008 19:57 Modified: 13 Dec 2008 9:57
Reporter: Philip Stoev
Status: Closed
Category: MySQL Server: Falcon storage engine Severity: S1 (Critical)
Version: 6.0-falcon-team OS: Any
Assigned to: Vladislav Vaintroub CPU Architecture: Any
Tags: F_RECOVERY, pb2, test failure

[1 Oct 2008 19:57] Philip Stoev
Description:
After executing a workload involving CREATE and DROP TABLESPACE, Falcon recovery fails with one of the errors listed in the follow-up comment.

How to repeat:
Use the test case from Bug#39138 (http://bugs.mysql.com/bug.php?id=39138).

Random Query Generator grammar file: http://bugs.mysql.com/file.php?id=10122

To reproduce this bug, please clone the latest revision of mysql-test-extra-6.0 and execute:

$ cd mysql-test/gentest
$ perl runall.pl \
  --basedir=/path/to/6.0-falcon-team \
  --grammar=conf/bug39138.yy \
  --reporters=Recovery \
  --threads=1 \
  --queries=1000

Suggested fix:
It appears to me that Falcon recovery cannot handle the situation where a tablespace is both created and dropped within the same serial log. Getting this right will require extra tracking of which tablespaces exist at each point in time.
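
As a rough illustration of the kind of tracking suggested above, the sketch below replays a simplified list of log records and works out, for each tablespace id, the range of record positions over which it exists. The `(op, tablespace_id)` tuples and the `live_ranges` helper are hypothetical and do not reflect Falcon's actual serial log format.

```python
def live_ranges(records):
    """Return {tablespace_id: [(created_at, dropped_at_or_None), ...]}.

    `records` is a list of (op, tablespace_id) tuples in log order, where
    op is "create" or "drop". This is a hypothetical, simplified log model.
    """
    ranges = {}
    open_at = {}  # tablespace_id -> position of its pending create record
    for pos, (op, ts_id) in enumerate(records):
        if op == "create":
            open_at[ts_id] = pos
        elif op == "drop" and ts_id in open_at:
            ranges.setdefault(ts_id, []).append((open_at.pop(ts_id), pos))
    # Tablespaces that were never dropped still exist at the end of the log.
    for ts_id, pos in open_at.items():
        ranges.setdefault(ts_id, []).append((pos, None))
    return ranges


log = [("create", 1), ("drop", 1), ("create", 2)]
print(live_ranges(log))
```

With ranges like these, a record at a given position can be checked against the tablespace's lifetime before it is applied.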
[1 Oct 2008 20:52] Philip Stoev
The recovery error is either:

# 23:44:36 Exception: can't find table space 1
# 23:44:36 Exception: Recovery failed: can't find table space 1

or:

# 23:47:53 Exception: can't open file "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/f18": No such file or directory (2)
# 23:47:53 Couldn't open table space file "f18" for tablespace "a": can't open file "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/f18": No such file or directory (2)
# 23:47:53 Exception: can't open file "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/f18": No such file or directory (2)
# 23:47:53 Exception: Recovery failed: can't open file "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/f18": No such file or directory (2)
# 23:47:54 081001 23:47:54 [ERROR] Falcon: Recovery failed: can't open file "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/f18": No such file or directory (2)

If the test is modified to not use DROP TABLESPACE, the first error is more likely.
[3 Oct 2008 21:06] Omer Barnir
triage: setting tag to SR60BETA
[30 Oct 2008 12:11] John Embretsen
This bug causes the test falcon_ddl to fail in Pushbuild 2, so I am tagging this as a test failure.

The symptom in the recovery part of falcon_ddl is:

# 12:24:22 Recovering database /export/home/pb2/test/sb_1-108002-1225364755.58/mysql-6.0.8-alpha-linux-i686-test/vardirs/master-data_recovery/falcon_master.fts ...
# 12:24:22 Serial Log possible gap: 90721 - 80818
# 12:24:22 first recovery block is 83041
# 12:24:22 last recovery block is 90720
# 12:24:22 recovery read block is 87477
# 12:24:22 Exception: can't find table space 625
# 12:24:22 Exception: Recovery failed: can't find table space 625
# 12:24:24 081030 12:24:24 [ERROR] Falcon: Recovery failed: can't find table space 625
# 12:24:24 081030 12:24:24 [ERROR] Plugin 'Falcon' init function returned error.
# 12:24:24 081030 12:24:24 [ERROR] Plugin 'Falcon' registration as a STORAGE ENGINE failed.

I have also logged Server Bug#40397 for the cases when the server crashes after Falcon fails to initialize during recovery due to this bug.
[6 Nov 2008 0:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57954

2905 Vladislav Vaintroub	2008-11-06
       Bug#39789 : Falcon recovery failure after several CREATE + DROP TABLESPACE
      
      The problem in this specific test was the removal of datafiles in
      SRLDropTableSpace::redo(), even if the same datafile path was in use
      by another tablespace instance.
      
      The fix reimplements create/dropTablespace handling in recovery in 
      a manner consistent with the rest of Falcon recovery.
      
      - pass1 (check) of recovery collects information about which
        tablespaces are dropped at the end
      - during pass1, recovery is prepared to handle references to
        non-existing tablespaces
      - pass2 and redo skip any reference to a dropped tablespace
      - tablespaces that are not marked as deleted are created during
        pass2 (creating "physical" objects)
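
The datafile hazard named in the commit message above can be pictured with a toy model. This is only a sketch: the `redo_drop` function, its `guard` flag and the set/dict state are hypothetical and are not Falcon code. The `guard=True` branch is merely one conceivable safeguard; the patch itself takes a different route, skipping records for dropped tablespaces.

```python
def redo_drop(files, live, dropped_path, guard):
    """Apply the redo of a DROP TABLESPACE record to a toy on-disk state.

    files: set of datafile paths currently on disk
    live: dict mapping live tablespace id -> datafile path
    guard: when True, keep the file if a live tablespace still uses it
    Returns the resulting set of datafile paths.
    """
    if guard and dropped_path in live.values():
        return set(files)
    return set(files) - {dropped_path}


files = {"f18"}
live = {2: "f18"}  # tablespace 2 reuses the path of the dropped tablespace 1

# Unguarded redo deletes the file from under the live tablespace.
print(redo_drop(files, live, "f18", guard=False))
# Guarded redo leaves the shared datafile in place.
print(redo_drop(files, live, "f18", guard=True))
```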
[14 Nov 2008 2:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/58716

2912 Vladislav Vaintroub	2008-11-14
      Bug#39789 : Falcon recovery failure after several CREATE + DROP TABLESPACE
            
      The problem in this specific situation was the removal of datafiles in
      SRLDropTableSpace::redo(), even if the same datafile path was in use
      by another tablespace instance.
            
      The fix reimplements create/dropTablespace handling in recovery in 
      a manner consistent with the rest of Falcon recovery.
            
      -  pass1 (check) of recovery collects information about which tablespaces
         are dropped at the end
      
      -  pass2 and redo skip any reference to a dropped tablespace
         (except for the SRLDropTableSpace record)
      
      -  tablespaces not marked as dropped are created during
         pass2 of recovery
      
      -  tablespaces marked as dropped are "redropped" in phase2 of recovery,
         i.e. removed from memory structures (hashtables and lists).
         A datafile possibly left over from the previous session is deleted
         as well.
      
      This patch moves tableSpaceId from classes derived from SerialLogRecord
      to the parent class, for technical reasons: this makes it much simpler
      to check whether a log record should be skipped in the different
      recovery phases.
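
The two-pass scheme described in this commit message can be sketched roughly as follows. This is a minimal model, not the patch itself: the class names, `pass1` and `pass2` are hypothetical stand-ins, and it assumes tablespace ids are never reused within one log. Keeping `tablespace_id` on the base record is what lets the skip check be written once.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Base record: every record carries the tablespace id it refers to."""
    tablespace_id: int


class CreateTableSpace(LogRecord):
    pass


class DropTableSpace(LogRecord):
    pass


class DataUpdate(LogRecord):
    pass


def pass1(log):
    """Collect the ids of tablespaces that are dropped by the end of the log."""
    return {rec.tablespace_id for rec in log if isinstance(rec, DropTableSpace)}


def pass2(log, dropped):
    """Return the records to apply, skipping references to dropped tablespaces.

    The drop record itself is kept so the tablespace can be "redropped".
    """
    applied = []
    for rec in log:
        if rec.tablespace_id in dropped and not isinstance(rec, DropTableSpace):
            continue  # reference to a tablespace that no longer exists
        applied.append(rec)
    return applied


log = [
    CreateTableSpace(1),
    DataUpdate(1),
    DropTableSpace(1),
    CreateTableSpace(2),
    DataUpdate(2),
]
dropped = pass1(log)
for rec in pass2(log, dropped):
    print(rec)
```

In this model the create and update records for tablespace 1 are never applied, so recovery does not try to open a datafile that was removed before the crash.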
[11 Dec 2008 14:19] Bugs System
Pushed into 6.0.9-alpha  (revid:vvaintroub@mysql.com-20081114023011-7gn8m0y6dledg2ab) (version source revid:hky@sun.com-20081127084516-nbu7693932vcz2st) (pib:5)
[13 Dec 2008 9:57] MC Brown
A note was added to the 6.0.9 changelog: 

Recovery of a tablespace for FALCON tables could fail if the tablespace was already in use.