Bug #77812 Got signal 11 when start slave, after optimize huge tables
Submitted: 23 Jul 2015 10:17
Modified: 29 Jul 2015 10:52
Reporter: Lee Johnson
Status: Closed
Category: MySQL Server: Replication
Severity: S1 (Critical)
Version: 5.6.21
OS: CentOS
Assigned to:
CPU Architecture: Any
Tags: optimize table, Signal 11, slave start

[23 Jul 2015 10:17] Lee Johnson
Description:
MySQL version: 5.6.21-log.
OS: CentOS release 6.5 (Final), kernel 2.6.32-431.23.3.el6.x86_64.
The setup is a master-slave replication pair.

Several huge InnoDB tables are each larger than 200GB. I want to optimize them to reclaim disk space. The operation is performed on the slave server.
First, I stop the slave, run optimize table tab1; optimize table tab2;, and then start the slave. After a while the slave server crashes with the following error. I re-created the replication setup, but that did not help. I checked the binlogs and they are intact.

2015-07-23 10:11:06 27703 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2015-07-23 10:11:06 27703 [Note] Slave I/O thread: connected to master 'slave@10.0.0.1:3306',replication started in log 'mysql-bin.002619' at position 292706911
2015-07-23 10:11:06 27703 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.002619' at position 292706538, relay log '/var/lib/mysql/log/relay_log/mysql-relay-bin.000032' position: 292706701
03:18:06 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=6
max_threads=8000
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 16755581 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fc680000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fc8e8386e20 thread_stack 0x40000
/var/lib/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x8e5665]
/var/lib/mysql/bin/mysqld(handle_fatal_signal+0x41b)[0x65296b]
/lib64/libpthread.so.0[0x3ee540f710]
/var/lib/mysql/bin/mysqld(_Z10unpack_rowPK14Relay_log_infoP5TABLEjPKhPK9st_bitmapPS5_PmS5_+0x189)[0x8a3f09]
/var/lib/mysql/bin/mysqld(_ZN14Rows_log_event24do_index_scan_and_updateEPK14Relay_log_info+0x150)[0x878d80]
/var/lib/mysql/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x944)[0x87bd04]
/var/lib/mysql/bin/mysqld(_ZN9Log_event11apply_eventEP14Relay_log_info+0x68)[0x889d58]
/var/lib/mysql/bin/mysqld(_Z26apply_event_and_update_posPP9Log_eventP3THDP14Relay_log_info+0x239)[0x8b3179]
/var/lib/mysql/bin/mysqld[0x8bc533]
/var/lib/mysql/bin/mysqld(handle_slave_sql+0x9bf)[0x8bdd2f]
/var/lib/mysql/bin/mysqld(pfs_spawn_thread+0x127)[0xab0767]
/lib64/libpthread.so.0[0x3ee54079d1]
/lib64/libc.so.6(clone+0x6d)[0x3ee50e886d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 583403
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
150723 11:18:10 mysqld_safe Number of processes running now: 0
150723 11:18:10 mysqld_safe mysqld restarted 
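
The memory estimate in the crash log is plain arithmetic over the variables printed just before it. A minimal sketch of that formula follows, using the values from the log. The function name is hypothetical, and sort_buffer_size does not appear in the log, so it is left as a parameter; the sketch does not attempt to reproduce the 16755581 K total.

```python
def estimated_peak_kib(key_buffer_size, read_buffer_size, sort_buffer_size, max_threads):
    """Evaluate the formula printed in the crash log, in K (1024-byte units).

    key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    All inputs are in bytes except max_threads, which is a count.
    """
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    return total_bytes // 1024


if __name__ == "__main__":
    # Values copied from the log above; sort_buffer_size is NOT in the log,
    # so 0 is used here to show only the part that the log lets us compute.
    known_part = estimated_peak_kib(
        key_buffer_size=268435456,
        read_buffer_size=1048576,
        sort_buffer_size=0,
        max_threads=8000,
    )
    print(known_part, "K from key_buffer_size and read_buffer_size alone")
```

With max_threads=8000, every per-thread buffer is multiplied by 8000, which is why the estimate is so large even though only three threads were running at the time of the crash.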

How to repeat:
1. Set up master-slave replication.
2. Create some huge tables, each larger than 180GB.
3. Apply DML statements on the master (for example with tpcc or sysbench) to simulate normal database modification.
4. On the slave, with replication stopped as described above, run optimize table on two or more of the huge tables, each larger than 180GB.
5. Start the slave.
6. The server restarts with signal 11.
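
The slave-side part of the steps above can be sketched as an ordered list of statements. This is only an illustration of the sequence described in the report: the helper name is hypothetical, the table names tab1 and tab2 come from the description, and the sketch only builds the statement text; it does not connect to a server.

```python
def slave_optimize_statements(tables):
    """Return the statements for the slave-side sequence in the report:
    stop replication, optimize each table, then start replication again."""
    def quote(name):
        # Quote the identifier with backticks; a literal backtick is doubled.
        return "`" + name.replace("`", "``") + "`"

    statements = ["STOP SLAVE;"]
    statements += ["OPTIMIZE TABLE " + quote(name) + ";" for name in tables]
    statements.append("START SLAVE;")
    return statements


if __name__ == "__main__":
    for statement in slave_optimize_statements(["tab1", "tab2"]):
        print(statement)
```

In the report, the crash appears some time after the final START SLAVE, once the SQL thread starts applying row events from the relay log.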

Suggested fix:
Bug http://bugs.mysql.com/bug.php?id=7658 may be a similar issue.
[23 Jul 2015 10:56] MySQL Verification Team
This crash occurs while applying RBR (row-based replication) events. At least one such crash is fixed in the current 5.6.25 release. Hint: an out-of-sync slave combined with a master running binlog_row_image = minimal can trigger this.

Therefore, please consider upgrading to 5.6.25 and report back here if that doesn't help.
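
The row-event context is visible in the backtrace of the original report: the crashing frames are unpack_row and Rows_log_event methods running under handle_slave_sql. A minimal sketch of pulling readable function names out of such frames is shown below. The helper names are hypothetical, and the name extraction is a deliberately simplified reading of the mangled symbols (leading length-prefixed components only), not a full demangler; a tool such as c++filt gives the complete signatures.

```python
import re

# Matches frames such as: /path/mysqld(symbol+0x189)[0x8a3f09]
# Frames without a symbol, e.g. /path/mysqld[0x8bc533], are skipped.
FRAME_RE = re.compile(
    r"\((?P<symbol>[^()+]+)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]"
)


def qualified_name(symbol):
    """Extract the function name from a mangled C++ symbol.

    Simplification: only the leading <length><name> components are read;
    argument types after the name are ignored.
    """
    if not symbol.startswith("_Z"):
        return symbol  # plain C symbol such as handle_slave_sql
    pos = 2
    nested = symbol.startswith("N", pos)  # _ZN... marks a Class::method name
    if nested:
        pos += 1
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
        if not nested:
            break  # a non-nested name has a single component
    return "::".join(parts) if parts else symbol


def frame_names(log_text):
    """Return the function name of every symbolized frame in log_text."""
    return [qualified_name(m.group("symbol")) for m in FRAME_RE.finditer(log_text)]


if __name__ == "__main__":
    # Two frames copied from the backtrace in the report.
    sample = (
        "/var/lib/mysql/bin/mysqld(_ZN14Rows_log_event14do_apply_event"
        "EPK14Relay_log_info+0x944)[0x87bd04]\n"
        "/var/lib/mysql/bin/mysqld(handle_slave_sql+0x9bf)[0x8bdd2f]\n"
    )
    for name in frame_names(sample):
        print(name)
```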
[24 Jul 2015 4:22] Lee Johnson
Thanks Shane.

I will upgrade and try again. I will post any progress here.
[29 Jul 2015 10:19] Lee Johnson
5.6.25 fixed the issue indeed.

I upgraded and reran the test; it ran correctly.

Thank you, Shane.