Bug #40402 Falcon assertion "write error on page 0 (4096/4096/4)", followed by core dump
Submitted: 29 Oct 2008 18:54 Modified: 8 Sep 13:37
Reporter: Sven Sandberg
Status: Verified
Category:Server: Falcon Severity:S2 (Serious)
Version:6.0-rpl OS:Linux
Assigned to: Olav Sandstaa Target Version:6.0-beta
Tags: falcon, core, mtr, test failure, sporadic, F_PAGE IO
Triage: Triaged: D1 (Critical)

[29 Oct 2008 18:54] Sven Sandberg
Description:
I got a crash when running rpl.rpl_row_basic_11bugs on my local machine. The full error
message doesn't fit the size of the description field, so I'll attach it separately. The
stack trace for the failing thread is:

#0  0xb7ee8410 in __kernel_vsyscall ()
#0  0xb7ee8410 in __kernel_vsyscall ()
#1  0xb7ea1ae7 in pthread_kill () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x08995a91 in my_write_core (sig=6) at stacktrace.c:307
#3  0x083341c4 in handle_segfault (sig=6) at mysqld.cc:2674
#4  <signal handler called>
#5  0xb7ee8410 in __kernel_vsyscall ()
#6  0xb7ea4d30 in raise () from /lib/tls/i686/cmov/libpthread.so.0
#7  0x08683c50 in Error::debugBreak () at Error.cpp:94
#8  0x08683ce9 in Error::error (
    string=0x8b5a2c0 "write error on page %d (%d/%d/%d) of \"%s\": %s (%d)")
    at Error.cpp:71
#9  0x0869decf in IO::writePages (this=0xb6ef47d4, pageNumber=0, length=4096, 
    data=0xb6ef2000 "\001", type=0) at IO.cpp:362
#10 0x0869dfac in IO::writePage (this=0xb6ef47d4, bdb=0xb714c084, type=0)
    at IO.cpp:326
#11 0x0865810a in Cache::writePage (this=0xb6ef4c08, bdb=0xb714c084, type=0)
    at Cache.cpp:546
#12 0x08658467 in Cache::flush (this=0xb6ef4c08, dbb=0xb6ef47d0)
    at Cache.cpp:684
#13 0x08678115 in Dbb::flush (this=0xb6ef47d0) at Dbb.cpp:471
#14 0x0867ab6f in Dbb::open (this=0xb6ef47d0, 
    fileName=0xbfb2d9c0
"/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts",
cacheSize=4194304, transId=0)
    at Dbb.cpp:552
#15 0x08672dac in Database::openDatabase (this=0xb70f5640, 
    filename=0xbfb2d9c0
"/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts")
at Database.cpp:695
#16 0x08664d35 in Connection::getDatabase (this=0xb6ef4590, 
    dbName=0xb6ef4234 "FALCON_MASTER", 
    dbFileName=0xbfb2d9c0
"/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts",
threads=0xb6ef4290)
    at Connection.cpp:1651
#17 0x086662b2 in Connection::openDatabase (this=0xb6ef4590, 
    dbName=0xb6ef4234 "FALCON_MASTER", 
    filename=0xb6ef4264 "falcon_master.fts", account=0x8b4c82c "mysql", 
    password=0x8b4c82c "mysql", address=0x0, parent=0xb6ef4290)
    at Connection.cpp:934
#18 0x08623de2 in StorageDatabase::getOpenConnection (this=0xb6ef4130)
    at StorageDatabase.cpp:136
#19 0x08626549 in StorageHandler::initialize (this=0xb70f5028)
    at StorageHandler.cpp:993
#20 0x0861e221 in StorageInterface::falcon_init (p=0x96e4fe8)
    at ha_falcon.cpp:237
#21 0x08478a29 in ha_initialize_handlerton (plugin=0x96df840) at handler.cc:442
#22 0x0853479c in plugin_initialize (plugin=0x96df840) at sql_plugin.cc:1006
#23 0x0853860a in plugin_init (argc=0x8d57134, argv=0x96a00d0, flags=0)
    at sql_plugin.cc:1213
#24 0x08337d25 in init_server_components () at mysqld.cc:4097
#25 0x0833889e in main (argc=9, argv=0xbfb2e184) at mysqld.cc:4579

And there's the following message:

[Falcon] Error: write error on page 0 (4096/4096/4) of
"/home/sven/bzr/b40257-test_innodb_flag/6.0-rpl/mysql-test/var/2/mysqld.1/data/falcon_master.fts":
Input/output error (5)

How to repeat:
Run the test suite a few times in 6.0-rpl.
Some system info:

$ uname -a
Linux riska 2.6.24-21-generic #1 SMP Mon Aug 25 17:32:09 UTC 2008 i686 GNU/Linux

This happens on an ext3 partition with 480 MB free space. It seems to happen in
replication tests on the master.
[29 Oct 2008 18:56] Sven Sandberg
Full error message (2 failures)

Attachment: _tmp_ (application/octet-stream, text), 22.51 KiB.

[22 Dec 2008 21:13] Hakan Kuecuekyilmaz
I get a very similar stacktrace running DBT2 with 100 warehouses and 499 concurrent users.
There is plenty of disk space on caneland ~100GB.

hakank@caneland:~> more /data0/mysql/caneland.err
081221 15:36:12 [Note] Plugin 'InnoDB' disabled by command line option
[Falcon] Error: write error on page 0 (8192/8192/3) of "/data0/mysql/falcon_master.fts":
Input/output error (5)
081221 15:36:14 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388608
read_buffer_size=1048576
max_used_connections=0
max_threads=1601
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4943512 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x40000
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(my_print_stacktrace+0x2e) [0xaa434e]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(handle_segfault+0x336) [0x655506]
/lib64/libpthread.so.0 [0x2b8e7fc60c00]
/lib64/libpthread.so.0(raise+0x2d) [0x2b8e7fc60aad]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Error::error(char const*, ...)+0xf9)
[0x855ad9]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(IO::writePages(int, int, unsigned char
const*, int)+0x1ac) [0x8ac
c3c]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Cache::writePage(Bdb*, int)+0xc2)
[0x919742]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Cache::flush(Dbb*)+0xab) [0x919a8b]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Dbb::open(char const*, long long, unsigned
int)+0x37f) [0x89e8df]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Database::openDatabase(char const*)+0x2e)
[0x897d7e]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Connection::getDatabase(char const*, char
const*, Threads*)+0xd6)
 [0x88d846]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(Connection::openDatabase(char const*, char
const*, char const*, c
har const*, char const*, Threads*)+0xa4) [0x8902c4]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageDatabase::getOpenConnection()+0x5a)
[0x85ce6a]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageHandler::initialize()+0x96)
[0x860066]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(StorageInterface::falcon_init(void*)+0x187)
[0x851db7]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(ha_initialize_handlerton(st_plugin_int*)+0x3f)
[0x75a30f]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld [0x7dfff2]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(plugin_init(int*, char**, int)+0x8b9)
[0x7e2989]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld [0x65a378]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(main+0x677) [0x65b287]
/lib64/libc.so.6(__libc_start_main+0xf4) [0x2b8e802ef184]
/data0/work/mysql/mysql-6.0-falcon/sql/mysqld(__gxx_personality_v0+0xe1) [0x583439]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[22 Dec 2008 21:15] Hakan Kuecuekyilmaz
One difference to the stack trace before is that I am running Falcon with page size of 8k.
[25 Jan 21:25] Hakan Kuecuekyilmaz
Reproduced with running DBT2, too:

prior open database failure was: write error on page 0 (8192/8192/3) of
"/data0/mysql/falcon_master.fts": Input
/output error (5)
[3 Apr 15:39] Olav Sandstaa
Note that the code that earlier wrote this error message and then "crashed" the process
has been re-written. The error message is now replaced by throwing an exception (see
patch attached to Bug#38970).

As a consequence of this change, this failure will no longer occur as a crash in
IO::writePages but more likely be returned to Falcon as an initialization error.
[8 Sep 13:37] Olav Sandstaa
This bug might be caused by the same issue as Bug#43490.