Bug #42344 Falcon recovery Exception: read error on page Invalid argument (22)
Submitted: 26 Jan 11:12 Modified: 15 May 19:02
Reporter: Philip Stoev
Status: Closed
Category:Server: Falcon Severity:S1 (Critical)
Version:6.0-falcon-team OS:Any
Assigned to: Vladislav Vaintroub Target Version:6.0-beta
Tags: F_RECOVERY
Triage: Triaged: D2 (Serious)

[26 Jan 11:12] Philip Stoev
Description:
When recovering after a kill -9, Falcon reported the following:

# 23:28:37 New table space FALCON_USER, id 1, type 0, filename falcon_user.fts
# 23:28:37 New table space FALCON_TEMPORARY, id 2, type 0, filename falcon_temporary.fts
# 23:28:37 Recovering database /dev/shm/var/master-data_recovery/falcon_master.fts ...
# 23:28:37 /nFirst recovery block is 400
# 23:28:37 Last recovery block is 446
# 23:28:37 Recovery read block is 436
# 23:28:37 Recovery phase 1...
# 23:28:37 Processed:    13248
# 23:28:37
# 23:28:37 Recovery phase 2...
# 23:28:37 Processed:    13248
# 23:28:37
# 23:28:37 Recovery phase 3...
# 23:28:38 === Scavenge Cycle 1 - FALCON_MASTER - 0 seconds
# 23:28:38 0: Activity on /dev/shm/var/master-data_recovery/falcon_master.fts: 10
fetches, 5 reads, 0 writes, 1 flushWrites
# 23:28:38 1: Activity on
/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_temporary.fts: 0
fetches, 1 reads, 0 writes, 0 flushWrites
# 23:28:38 1: Activity on
/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts: 164079
fetches, 59 reads, 0 writes, 0 flushWrites
# 23:28:39 Exception: read error on page 1461 of
"/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts": Invalid
argument (22)
# 23:28:39 Exception: Recovery failed: read error on page 1461 of
"/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_user.fts": Invalid
argument (22)
# 23:28:40 Exception: can't continue after fatal error
# 23:28:40 090125 23:28:40 [ERROR] Falcon: can't continue after fatal error
# 23:28:40 090125 23:28:40 [ERROR] Plugin 'Falcon' init function returned error.
# 23:28:40 090125 23:28:40 [ERROR] Plugin 'Falcon' registration as a STORAGE ENGINE
failed.

How to repeat:
The tablespace will be uploaded shortly.
[26 Jan 11:16] Philip Stoev
The tablespace is available at:

bug42344.zip">http://69.61.72.18/bug42344.zip

Looks like the "invalid argument" error is printed only when restoring on a memory
filesystem.

When restoring on ext3, no error is printed but recovery still fails silently. Strace
shows no failing pread calls.
[26 Jan 16:51] Philip Stoev
This tablespace was produced by the following command. Note that --mem is being used and
that after applying the mask to the SQL grammar, only transactions and UPDATEs remain.
Unfortunately, running the test again did not result in a recovery failure .

$ perl runall.pl \
  --mysqld=--falcon-page-size=4K \
  --mem --rows=10000 --threads=16 \
  --mask=2435 \
  --queries=1000000 \
  --duration=300 \
  --basedir=/build/bzr/6.0-falcon-team \
  --engine=Falcon \
  --grammar=conf/combinations.yy \
  --gendata=conf/combinations.zz \
  --reporter=Deadlock,ErrorLog,Backtrace,Recovery \
  --mysqld=--loose-falcon-lock-wait-timeout=1 \
  --mysqld=--log-output=none
[24 Feb 12:28] Olav Sandstaa
With the improved error message added for reporting this kind of failure the 
error now is reported like this in a newer crash:

# 18:02:01 Recovering from lost page inventory page 3
# 18:02:09 Exception: pread on file
/export/home/pb2/test/sb_1-346730-1235408294.71/mysql-6.0.11-alpha-linux-i686-test/vardirs/master-data_recovery/falcon_user.fts
from  page 143188 (offset 586498048) returned 0 bytes instead of 4096 (possible read
behind EOF)
# 18:02:09 Exception: Recovery failed: pread on file
/export/home/pb2/test/sb_1-346730-1235408294.71/mysql-6.0.11-alpha-linux-i686-test/vardirs/master-data_recovery/falcon_user.fts
from  page 143188 (offset 586498048) returned 0 bytes instead of 4096 (possible read
behind EOF)
# 18:02:09 Exception: can't continue after fatal error
[2 Mar 16:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68001

3043 Vladislav Vaintroub	2009-03-02
      Bug #42344 Falcon recovery Exception: read error on page 
      Invalid argument (22)
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
     @ storage/falcon/Dbb.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/Debug.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/Index2Page.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/IndexPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage
     @ storage/falcon/IndexPage.h
        rename IndexPage::priorPage and IndexPage::parentPage to 
        IndexPage::unused
     @ storage/falcon/IndexRootPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
        Check for root page instead of page->parentPage == 0 when splitting
        index.
     @ storage/falcon/IndexRootPage.h
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/PageInventoryPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLIndexPage.cpp
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLIndexPage.h
        remove IndexPage::priorPage and IndexPage::parentPage usage.
     @ storage/falcon/SRLVersion.h
        Changed SRLIndexPage logging
[2 Mar 17:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68011

3043 Vladislav Vaintroub	2009-03-02
      Bug#42344  Falcon recovery Exception: read error on page 
      Bug#42903  Falcon: Inconsistent tablespace after recovery
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
[2 Mar 19:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/68025

3043 Vladislav Vaintroub	2009-03-02
      Bug#42344  Falcon recovery Exception: read error on page 
      Bug#42903  Falcon: Inconsistent tablespace after recovery
      
      The exception is caused during recovery of an index page, when 
      trying to  insert a first record to the parent page of the current 
      page. Parent page was not existing - this caused reads behind the 
      EOF of falcon_user.fts 
      
      The problem is that parent page itself was not logged.
      Another (worse) problem that can result into corrupted index is that 
      that parent pointer is not reliable and often can be wrong for 
      non-leaf pages. When non-leaf page is split, half or its nodes are 
      transferred to the newly created page. For  consistency, parentPage
      of corresponding child pages had to be changed and change had to be 
      logged and this was not done.
      
      The fix is to remove all occurenced parentPage pointer from the 
      Falcon code. Having a link to parent page was an idea that is not 
      implemented consistently. Also, this patch removed all occurenced 
      to priorPage from the Falcon code.This was another inconsistently 
      implemented idea.
      
      IndexPage::priorPage and IndexPage::parentPage are renamed to  
      int32 IndexPage::unused[2], for future use and to prevent the need
      to upgrade for existing databases.
      
      Recovery is changed to perform careful writes in the right order 
      during the index split (newly created page, then page that overflowed, 
      then parent page or the newly creted page).
      
      Doing recovery like this could result in a slightly detuned index, if 
      crash or kill occures before parent page is logged. Detuned means
      there is a page without a parent. Index traversal will still find it
      when traversing the page level via IndexPage::next links. 
      
      "Detuned" problem can be fixed  by logging  all pages involved in index 
      split at once. This is something to consider in the future.
[2 Apr 19:39] Bugs System
Pushed into 6.0.11-alpha (revid:hky@sun.com-20090402144811-yc5kp8g0rjnhz7vy) (version
source revid:christopher.powers@sun.com-20090304040340-b4zoglfws0iswqm1) (merge vers:
6.0.11-alpha) (pib:6)
[15 May 19:02] MC Brown
A note has been added to the 6.0.11 changelog: 

Recovery of Falcon tables with indexes can fail because the index page information has
not been recorded properly