Bug #42344 Falcon recovery Exception: read error on page Invalid argument (22)
Submitted: 26 Jan 2009 10:12 Modified: 15 May 2009 17:02
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Vladislav Vaintroub CPU Architecture:Any
Tags: F_RECOVERY
Triage: Triaged: D2 (Serious)

[26 Jan 2009 10:12] Philip Stoev
Description:
When recovering after a kill -9, Falcon reported the following:

# 23:28:37 New table space FALCON_USER, id 1, type 0, filename falcon_user.fts
# 23:28:37 New table space FALCON_TEMPORARY, id 2, type 0, filename falcon_temporary.fts
# 23:28:37 Recovering database /dev/shm/var/master-data_recovery/falcon_master.fts ...
# 23:28:37 /nFirst recovery block is 400
# 23:28:37 Last recovery block is 446
# 23:28:37 Recovery read block is 436
# 23:28:37 Recovery phase 1...
# 23:28:37 Processed:    13248
# 23:28:37
# 23:28:37 Recovery phase 2...
# 23:28:37 Processed:    13248
# 23:28:37
# 23:28:37 Recovery phase 3...
# 23:28:38 === Scavenge Cycle 1 - FALCON_MASTER - 0 seconds
# 23:28:38 0: Activity on /dev/shm/var/master-data_recovery/falcon_master.fts: 10 fetches, 5 reads, 0 writes, 1 flushWrites
# 23:28:38 1: Activity on /build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_temporary.fts: 0 fetches, 1 reads, 0 writes, 0 flushWrites
# 23:28:38 1: Activity on /build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts: 164079 fetches, 59 reads, 0 writes, 0 flushWrites
# 23:28:39 Exception: read error on page 1461 of "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts": Invalid argument (22)
# 23:28:39 Exception: Recovery failed: read error on page 1461 of "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts": Invalid argument (22)
# 23:28:40 Exception: can't continue after fatal error
# 23:28:40 090125 23:28:40 [ERROR] Falcon: can't continue after fatal error
# 23:28:40 090125 23:28:40 [ERROR] Plugin 'Falcon' init function returned error.
# 23:28:40 090125 23:28:40 [ERROR] Plugin 'Falcon' registration as a STORAGE ENGINE failed.

How to repeat:
The tablespace will be uploaded shortly.
[26 Jan 2009 10:16] Philip Stoev
The tablespace is available at:

http://69.61.72.18/bug42344.zip

Looks like the "invalid argument" error is printed only when restoring on a memory filesystem.

When restoring on ext3, no error is printed but recovery still fails silently. Strace shows no failing pread calls.
[26 Jan 2009 15:51] Philip Stoev
This tablespace was produced by the following command. Note that --mem is being used and that after applying the mask to the SQL grammar, only transactions and UPDATEs remain. Unfortunately, running the test again did not result in a recovery failure .

$ perl runall.pl \
  --mysqld=--falcon-page-size=4K \
  --mem --rows=10000 --threads=16 \
  --mask=2435 \
  --queries=1000000 \
  --duration=300 \
  --basedir=/build/bzr/6.0-falcon-team \
  --engine=Falcon \
  --grammar=conf/combinations.yy \
  --gendata=conf/combinations.zz \
  --reporter=Deadlock,ErrorLog,Backtrace,Recovery \
  --mysqld=--loose-falcon-lock-wait-timeout=1 \
  --mysqld=--log-output=none
[24 Feb 2009 11:28] Olav Sandstå
With the improved error message added for reporting this kind of failure the 
error now is reported like this in a newer crash:

# 18:02:01 Recovering from lost page inventory page 3
# 18:02:09 Exception: pread on file /export/home/pb2/test/sb_1-346730-1235408294.71/mysql-6.0.11-alpha-linux-i686-test/vardirs/master-data_recovery/falcon_user.fts from  page 143188 (offset 586498048) returned 0 bytes instead of 4096 (possible read behind EOF)
# 18:02:09 Exception: Recovery failed: pread on file /export/home/pb2/test/sb_1-346730-1235408294.71/mysql-6.0.11-alpha-linux-i686-test/vardirs/master-data_recovery/falcon_user.fts from  page 143188 (offset 586498048) returned 0 bytes instead of 4096 (possible read behind EOF)
# 18:02:09 Exception: can't continue after fatal error
[2 Mar 2009 15:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68001

3043 Vladislav Vaintroub	2009-03-02
      Bug #42344 Falcon recovery Exception: read error on page 
      Invalid argument (22)
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
     @ storage/falcon/Dbb.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/Debug.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/Index2Page.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/IndexPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/IndexPage.h
        rename IndexPage::priorPage and IndexPage::parentPage to 
        IndexPage::unused
     @ storage/falcon/IndexRootPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
        Check for root page instead of page->parentPage == 0 when splitting
        index.
     @ storage/falcon/IndexRootPage.h
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/PageInventoryPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLIndexPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLIndexPage.h
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLVersion.h
        Changed SRLIndexPage logging
[2 Mar 2009 16:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68011

3043 Vladislav Vaintroub	2009-03-02
      Bug#42344  Falcon recovery Exception: read error on page 
      Bug#42903  Falcon: Inconsistent tablespace after recovery
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
[2 Mar 2009 18:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68025

3043 Vladislav Vaintroub	2009-03-02
      Bug#42344  Falcon recovery Exception: read error on page 
      Bug#42903  Falcon: Inconsistent tablespace after recovery
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
[2 Apr 2009 17:39] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version source revid:christopher.powers@sun.com-20090304040340-b4zoglfws0iswqm1) (merge vers: 6.0.11-alpha) (pib:6)
[15 May 2009 17:02] MC Brown
A note has been added to the 6.0.11 changelog: 

Recovery of Falcon tables with indexes can fail because the index page information has not been recorded properly