Bug #40402 Falcon assertion "write error on page 0 (4096/4096/4)", followed by core dump
Submitted: 29 Oct 2008 17:54 Modified: 26 May 2010 17:50
Reporter: Sven Sandberg Email Updates:
Status: Unsupported Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S2 (Serious)
Version:6.0-rpl OS:Linux
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: core, F_PAGE IO, falcon, mtr, sporadic, test failure

[29 Oct 2008 17:54] Sven Sandberg
Description:
I got a crash when running rpl.rpl_row_basic_11bugs on my local machine. The full error message doesn't fit the size of the description field, so I'll attach it separately. The stack trace for the failing thread is:

#0  0xb7ee8410 in __kernel_vsyscall ()
#0  0xb7ee8410 in __kernel_vsyscall ()
#1  0xb7ea1ae7 in pthread_kill () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x08995a91 in my_write_core (sig=6) at stacktrace.c:307
#3  0x083341c4 in handle_segfault (sig=6) at mysqld.cc:2674
#4  <signal handler called>
#5  0xb7ee8410 in __kernel_vsyscall ()
#6  0xb7ea4d30 in raise () from /lib/tls/i686/cmov/libpthread.so.0
#7  0x08683c50 in Error::debugBreak () at Error.cpp:94
#8  0x08683ce9 in Error::error (
    string=0x8b5a2c0 "write error on page %d (%d/%d/%d) of \"%s\": %s (%d)")
    at Error.cpp:71
#9  0x0869decf in IO::writePages (this=0xb6ef47d4, pageNumber=0, length=4096, 
    data=0xb6ef2000 "\001", type=0) at IO.cpp:362
#10 0x0869dfac in IO::writePage (this=0xb6ef47d4, bdb=0xb714c084, type=0)
    at IO.cpp:326
#11 0x0865810a in Cache::writePage (this=0xb6ef4c08, bdb=0xb714c084, type=0)
    at Cache.cpp:546
#12 0x08658467 in Cache::flush (this=0xb6ef4c08, dbb=0xb6ef47d0)
    at Cache.cpp:684
#13 0x08678115 in Dbb::flush (this=0xb6ef47d0) at Dbb.cpp:471
#14 0x0867ab6f in Dbb::open (this=0xb6ef47d0, 
    fileName=0xbfb2d9c0 "/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts", cacheSize=4194304, transId=0)
    at Dbb.cpp:552
#15 0x08672dac in Database::openDatabase (this=0xb70f5640, 
    filename=0xbfb2d9c0 "/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts") at Database.cpp:695
#16 0x08664d35 in Connection::getDatabase (this=0xb6ef4590, 
    dbName=0xb6ef4234 "FALCON_MASTER", 
    dbFileName=0xbfb2d9c0 "/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts", threads=0xb6ef4290)
    at Connection.cpp:1651
#17 0x086662b2 in Connection::openDatabase (this=0xb6ef4590, 
    dbName=0xb6ef4234 "FALCON_MASTER", 
    filename=0xb6ef4264 "falcon_master.fts", account=0x8b4c82c "mysql", 
    password=0x8b4c82c "mysql", address=0x0, parent=0xb6ef4290)
    at Connection.cpp:934
#18 0x08623de2 in StorageDatabase::getOpenConnection (this=0xb6ef4130)
    at StorageDatabase.cpp:136
#19 0x08626549 in StorageHandler::initialize (this=0xb70f5028)
    at StorageHandler.cpp:993
#20 0x0861e221 in StorageInterface::falcon_init (p=0x96e4fe8)
    at ha_falcon.cpp:237
#21 0x08478a29 in ha_initialize_handlerton (plugin=0x96df840) at handler.cc:442
#22 0x0853479c in plugin_initialize (plugin=0x96df840) at sql_plugin.cc:1006
#23 0x0853860a in plugin_init (argc=0x8d57134, argv=0x96a00d0, flags=0)
    at sql_plugin.cc:1213
#24 0x08337d25 in init_server_components () at mysqld.cc:4097
#25 0x0833889e in main (argc=9, argv=0xbfb2e184) at mysqld.cc:4579

And there's the following message:

[Falcon] Error: write error on page 0 (4096/4096/4) of "/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts": Input/output error (5)

How to repeat:
Run the test suite a few times in 6.0-rpl.
Some system info:

$ uname -a
Linux riska 2.6.24-21-generic #1 SMP Mon Aug 25 17:32:09 UTC 2008 i686 GNU/Linux

This happens on an ext3 partition with 480 MB free space. It seems to happen in replication tests on the master.
[29 Oct 2008 17:56] Sven Sandberg
Full error message (2 failures)

Attachment: _tmp_ (application/octet-stream, text), 22.51 KiB.

[22 Dec 2008 20:13] Hakan Küçükyılmaz
I get a very similar stacktrace running DBT2 with 100 warehouses and 499 concurrent users. There is plenty of disk space on caneland ~100GB.

hakank@caneland:~> more /data0/mysql/caneland.err
081221 15:36:12 [Note] Plugin 'InnoDB' disabled by command line option
[Falcon] Error: write error on page 0 (8192/8192/3) of "/data0/mysql/falcon_master.fts": Input/output error (5)
081221 15:36:14 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=0
max_threads=1601
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4943512 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x40000
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(my_print_stacktrace+0x2e) [0xaa434e]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(handle_segfault+0x336) [0x655506]
/lib64/libpthread.so.0 [0x2b8e7fc60c00]
/lib64/libpthread.so.0(raise+0x2d) [0x2b8e7fc60aad]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Error::error(char const*, ...)+0xf9) [0x855ad9]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(IO::writePages(int, int, unsigned char const*, int)+0x1ac) [0x8ac
c3c]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Cache::writePage(Bdb*, int)+0xc2) [0x919742]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Cache::flush(Dbb*)+0xab) [0x919a8b]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Dbb::open(char const*, long long, unsigned int)+0x37f) [0x89e8df]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Database::openDatabase(char const*)+0x2e) [0x897d7e]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Connection::getDatabase(char const*, char const*, Threads*)+0xd6)
 [0x88d846]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Connection::openDatabase(char const*, char const*, char const*, c
har const*, char const*, Threads*)+0xa4) [0x8902c4]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageDatabase::getOpenConnection()+0x5a) [0x85ce6a]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageHandler::initialize()+0x96) [0x860066]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageInterface::falcon_init(void*)+0x187) [0x851db7]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(ha_initialize_handlerton(st_plugin_int*)+0x3f) [0x75a30f]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld [0x7dfff2]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(plugin_init(int*, char**, int)+0x8b9) [0x7e2989]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld [0x65a378]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(main+0x677) [0x65b287]
/lib64/libc.so.6(__libc_start_main+0xf4) [0x2b8e802ef184]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(__gxx_personality_v0+0xe1) [0x583439]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[22 Dec 2008 20:15] Hakan Küçükyılmaz
One difference to the stack trace before is that I am running Falcon with page size of 8k.
[25 Jan 2009 20:25] Hakan Küçükyılmaz
Reproduced with running DBT2, too:

prior open database failure was: write error on page 0 (8192/8192/3) of "/data0/mysql/falcon_master.fts": Input
/output error (5)
[3 Apr 2009 13:39] Olav Sandstå
Note that the code that earlier wrote this error message and then "crashed" the process has been re-written. The error message is now replaced by throwing an exception (see patch attached to Bug#38970).

As a consequence of this change, this failure will no longer occur as a crash in IO::writePages but more likely be returned to Falcon as an initialization error.
[8 Sep 2009 11:37] Olav Sandstå
This bug might be caused by the same issue as Bug#43490.