Bug #67291 Production MySQL master server crashed and could not get recovered
Submitted: 18 Oct 2012 18:19 Modified: 18 Oct 2012 21:43
Reporter: steven tang Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.5.11 OS:Linux (CentOS release 4.7 (Final))
Assigned to: CPU Architecture:Any

[18 Oct 2012 18:19] steven tang
Description:
We had a production MySQL master server crash and it could not get recovered and we had to re-create the master from a slave which caused a major outage!!!

The following is the master in trouble and then crashed before we had to re-created it.

-------------------------------------------------------------
121017 13:22:25 [Warning] Aborted connection 44656588 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got timeout reading communication packets)
121017 13:22:25 [Warning] Aborted connection 44655081 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got timeout reading communication packets)
121017 13:22:46 [Warning] Aborted connection 44657664 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got timeout reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44658031 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got an error reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44657992 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got an error reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44658030 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got an error reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44658022 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got an error reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44658010 to db: 'quadrant' user: 'writer' host: '172.16.78.48' (Got an error reading communication packets)
121017 13:22:51 [Warning] Aborted connection 44657998 to db: 'quadrant' 121017 13:25:20  InnoDB: Assertion failure in thread 1566763360 in file /export/home/pb2/build/sb_0-3159149-1301580870.56/mysql-5.5.11/storage/innobase/ibuf/ibuf0ibuf.c line 4130
InnoDB: Failing assertion: page_get_n_recs(page) > 1
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
121017 13:25:20 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1073741824
read_buffer_size=4194304
max_used_connections=643
max_threads=500
thread_count=304
connection_count=304
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5155958 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = (nil) thread_stack 0x40000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x39)[0x795f29]
/usr/local/mysql/bin/mysqld(handle_segfault+0x380)[0x4fce00]
/lib64/tls/libpthread.so.0[0x32df10c5b0]
/lib64/tls/libc.so.6(gsignal+0x3d)[0x32de62e26d]
/lib64/tls/libc.so.6(abort+0xfe)[0x32de62fa6e]
/usr/local/mysql/bin/mysqld[0x86cd2d]
/usr/local/mysql/bin/mysqld[0x8224c1]
/usr/local/mysql/bin/mysqld[0x83163e]
/usr/local/mysql/bin/mysqld[0x831b37]
/usr/local/mysql/bin/mysqld[0x823540]
/usr/local/mysql/bin/mysqld[0x80cd3c]
/usr/local/mysql/bin/mysqld[0x80f4e4]
/usr/local/mysql/bin/mysqld[0x7ca5c7]
/usr/local/mysql/bin/mysqld[0x8b00e6]
/usr/local/mysql/bin/mysqld[0x8b058e]
/usr/local/mysql/bin/mysqld[0x8b0dbd]
/usr/local/mysql/bin/mysqld[0x8a6b4d]
/usr/local/mysql/bin/mysqld[0x7e8e35]
/usr/local/mysql/bin/mysqld[0x7def9c]
/lib64/tls/libpthread.so.0[0x32df106137]
/lib64/tls/libc.so.6(__clone+0x73)[0x32de6c9883]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
mysqld: /export/home/pb2/build/sb_0-3159149-1301580870.56/mysql-5.5.11/mysys/my_new.cc:51: int __cxa_pure_virtual(): Assertion `! "Aborted: pure virtual method called."' failed.
mysqld: /export/home/pb2/build/sb_0-3159149-1301580870.56/mysql-5.5.11/mysys/my_new.cc:51: int __cxa_pure_virtual(): Assertion `! "Aborted: pure virtual method called."' failed.
121017 13:25:25 mysqld_safe Number of processes running now: 0
121017 13:25:25 mysqld_safe mysqld restarted
121017 13:25:26 [Note] Plugin 'FEDERATED' is disabled.
121017 13:25:26 InnoDB: The InnoDB memory heap is disabled
121017 13:25:26 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121017 13:25:26 InnoDB: Compressed tables use zlib 1.2.3
121017 13:25:26 InnoDB: Using Linux native AIO
121017 13:25:26 InnoDB: Initializing buffer pool, size = 13.0G
121017 13:25:27 InnoDB: Completed initialization of buffer pool
121017 13:25:27 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 16191973909557
121017 13:25:27  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 16191979152384
...

/usr/local/mysql/bin/mysqld[0x7dfdc0]
/lib64/tls/libpthread.so.0[0x32df106137]
/lib64/tls/libc.so.6(__clone+0x73)[0x32de6c9883]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
121017 13:42:41 mysqld_safe mysqld from pid file /opt/mysql/data1/q3-db1.webtech.local.pid ended
-------------------------------------------------------------

How to repeat:
If it repeats again, we are close to out of business!!

detailed log will be attached soon.

Suggested fix:
It seems to be that MySQL server encountered a bug and corrupted the data and could not recover from there!
[18 Oct 2012 18:20] steven tang
crashed Master the could not get recovered error log

Attachment: mysqld1-q3-db1.err (text/plain), 79.12 KiB.

[18 Oct 2012 21:25] MySQL Verification Team
Looks like a duplicate of a known bug:
http://bugs.mysql.com/bug.php?id=61104
http://bugs.mysql.com/bug.php?id=66819

Workaround, upgrade and/or put this into my.cnf:
innodb_change_buffering=inserts
[18 Oct 2012 21:43] steven tang
We are going to upgrade to MySQL server 5.5.25a soon.

Would this upgrade be sufficient to avoid this crash in the future?

Thanks,
Steven