Bug #39212 Falcon exception on recovery when one of the log files is empty
Submitted: 3 Sep 2008 12:38 Modified: 9 Jan 2009 14:06
Reporter: Philip Stoev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0-falcon-team OS:Linux
Assigned to: Vladislav Vaintroub
Tags: F_RECOVERY
Triage: Triaged: D2 (Serious)

[3 Sep 2008 12:38] Philip Stoev
Description:
If Falcon has to do a recovery before the second serial log has been utilized, the following exception is written on STDERR:

Exception: serial read error on "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_master.fl2": Invalid argument (22)

As per Vlad's instructions, every Exception during the recovery phase should be treated as a bug.

How to repeat:
Get the latest mysql-test-extra-6.0 tree and execute:

$ cd mysql-test-extra-6.0/mysql-test/gentest
$ perl runall.pl \
  --reporters=Recovery \
  --grammar=conf/falcon_recovery.yy \
  --basedir=/path/to/6.0-falcon-team \
  --duration=60 \
  --engine=falcon

This will start a very short stress test. Before the second serial log has come into rotation, the framework will kill the server and initiate recovery. The Exception will be visible during the recovery phase.

Suggested fix:
In addition to fixing the bug itself, exceptions should case immediate failure in the recovery and the server should not start at all
[3 Sep 2008 15:42] Philip Stoev
The problem also exists in the form of "serial log write error", as seen below:

New table space FALCON_USER, id 1, type 0, filename falcon_user.fts
Recovering database /build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_master.fts ...
Exception: serial read error on "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_master.fl2": Invalid argument (22)
first recovery block is 1
last recovery block is 1590
recovery read block is 47
Recovery complete
1: Commit transaction 2
1: Commit transaction 3
1: Commit transaction 4
1: Commit transaction 5
1: Commit transaction 6
1: Commit transaction 7
1: Commit transaction 8
1: Commit transaction 9
1: Switching log files (0 used)
1: Commit transaction 10
Exception: serial write error on "/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_master.fl2": Invalid argument (22)
[Falcon] Error: can't continue after fatal error
Bugcheck: can't continue after fatal error
080903 18:34:06 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
# 2008-09-03 18:34:06 [11195]
key_buffer_size=8388572
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 337751 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
# 2008-09-03 18:34:06 [11195]
thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x30000
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(my_print_stacktrace+0x26)[0x87ccd9b]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(handle_segfault+0x2cb)[0x82b0f27]
[0x110400]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Error::debugBreak()+0x12)[0x858adfc]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Error::error(char const*, ...)+0x7b)[0x858ae79]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(IO::writePages(int, int, unsigned char const*, int)+0x23)[0x85a3d89]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Cache::ioThread()+0x499)[0x855e0cd]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Cache::ioThread(void*)+0x11)[0x855e71d]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Thread::thread()+0x56)[0x85494f2]
/build/bzr/6.0-falcon-team/mysql-test/../sql/mysqld(Thread::thread(void*)+0x17)[0x8549717]
/lib/libpthread.so.0[0x57d32f]
/lib/libc.so.6(clone+0x5e)[0x49a27e]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
[3 Sep 2008 15:43] Philip Stoev
Tablespace files that cause both errors.

Attachment: bug39212.zip (application/x-zip-compressed, text), 430.88 KiB.

[8 Oct 2008 12:17] Vladislav Vaintroub
Can reproduce with the test above.
Falcon recovers ok. There seems to be a bug in the either in the test  framework or in innodb that prevent complete server recovery (on Windows, did not test anywhere else).

...

InnoDB: Error: log file .\ib_logfile0 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 56623104 bytes!
081008 14:12:41 [ERROR] Plugin 'InnoDB' init function returned error.
081008 14:12:41 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

New table space FALCON_USER, id 1, type 0, filename falcon_user.fts
New table space FALCON_TEMPORARY, id 2, type 0, filename falcon_temporary.fts
New table space y, id 3, type 0, filename f19
New table space m, id 4, type 0, filename f8
New table space t, id 5, type 0, filename f2
New table space u, id 6, type 0, filename f18
New table space s, id 7, type 0, filename f20
New table space n, id 8, type 0, filename f10
New table space g, id 9, type 0, filename f6
New table space c, id 10, type 0, filename f3
New table space p, id 11, type 0, filename f16
New table space v, id 12, type 0, filename f4
New table space i, id 13, type 0, filename f15
New table space b, id 14, type 0, filename f5
New table space f, id 15, type 0, filename f1
New table space a, id 16, type 0, filename f11
New table space d, id 17, type 0, filename f7
New table space o, id 18, type 0, filename f12
New table space l, id 19, type 0, filename f17
New table space k, id 20, type 0, filename f9
New table space e, id 21, type 0, filename f14
New table space w, id 22, type 0, filename f13
Recovering database G:\bzr\mysql-6.0-falcon-team\mysql-test\var\master-data\falc
on_master.fts ...
first recovery block is 1
last recovery block is 2062
recovery read block is 2061
Recovery complete
0: Cache flush: 22 pages, 21 writes in 0 seconds (22 pps)
081008 14:12:41 [ERROR] Unknown/unsupported table type: INNODB
081008 14:12:41 [ERROR] Aborting
[8 Oct 2008 13:02] Philip Stoev
Vlad, please try the following on Linux:

$ perl runall.pl \
  --reporters=Recovery \
  --grammar=conf/example.yy \
  --basedir=/build/bzr/6.0-falcon-team \
  --duration=60

Note that MyISAM tables will be created, but it will be Falcon recovery that has an error.
[8 Oct 2008 13:22] Philip Stoev
Tablespace and logs before recovery.

Attachment: bug39212.zip (application/x-zip-compressed, text), 296.32 KiB.

[8 Oct 2008 13:23] Philip Stoev
Strace output for this file handle:

open("/build/bzr/6.0-falcon-team/mysql-test/var/master-data_recovery/falcon_master.fl2", O_RDWR|O_LARGEFILE) = 6
fcntl64(6, F_GETFL)         = 0x8002 (flags O_RDWR|O_LARGEFILE)
fcntl64(6, F_SETFL, O_RDWR|O_DIRECT|O_LARGEFILE) = 0
fstat64(6, {st_mode=S_IFREG|0660, st_size=512, ...}) = 0
pread64(6, 0xb4341200, 512, 0) = -1 EINVAL (Invalid argument)
[9 Oct 2008 11:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55914

2860 Vladislav Vaintroub	2008-10-09
      Bug#39212 Falcon exception on recovery when one of the log files 
      is empty
      
      Reason is a bug with reiserfs file system - read of a file of 
      size 1 to 4095 returns error (EINVAL), if file is opened with 
      O_DIRECT. ext3 behaves correct and returns number of bytes read.
      
      Solved with workaround: initial size of serial log files is now  
      at least 4096 bytes.
[9 Oct 2008 11:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55915

2860 Vladislav Vaintroub	2008-10-09
      Bug#39212 Falcon exception on recovery when one of the log files 
      is empty
            
      Reason is a bug in reiserfs file system - read of a file of 
      size 1 to 4095 returns error (EINVAL), if file is opened with 
      O_DIRECT. ext3 behaves correct and returns number of bytes read.
            
      Solved with workaround: initial size of serial log files is now  
      at least 4096 bytes.
[9 Oct 2008 14:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55938

2860 Vladislav Vaintroub	2008-10-09
      Bug#39212 Falcon exception on recovery when one of the log files 
      is empty
                  
      Reason is a bug in reiserfs file system.
      The serial log file is initially 512 bytes.  On reiserfs,
      for files opened with O_DIRECT, read() will fail if file size
      is between 1 to 4095 returns error (EINVAL). The bug supposedly 
      occurs because file size is smaller than page size.
                  
      Solved with a workaround: initial size of serial log files is 
      now 8K (we assume largest page size we need to deal with is 
      8KB).
[28 Oct 2008 8:10] Bugs System
Pushed into 6.0.8-alpha  (revid:vvaintroub@mysql.com-20081009140709-dcecamn7fko6yb0h) (version source revid:stewart@flamingspork.com-20081009143136-tbayn5x8rgvvu9mf) (pib:5)
[9 Jan 2009 14:06] MC Brown
A note has been added to the 6.0.8 changelog: 

When using Falcon on ReiserFS file systems, the initial size of the serial log could cause problems during recovery if the size of the log file was less than 4KB. The minimum size of the serial log file has now been increased to 8KB to address the problem.