MySQL Bugs: #66497: crash after restarting a parallel slave

Bug #66497	crash after restarting a parallel slave
Submitted:	22 Aug 2012 13:01	Modified:	10 Jan 2013 14:00
Reporter:	Giuseppe Maxia (OCA)	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S1 (Critical)
Version:	5.6.6	OS:	Any (Linux, MacOS X)
Assigned to:		CPU Architecture:	Any
Tags:	crash, parallel, replication, repository

Description:
After restarting a slave that was using parallel threads, you get this in the error log

12:56:15 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=1
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68181 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
0   mysqld                              0x000000010ff2c90a my_print_stacktrace + 44^@
1   mysqld                              0x000000010fd9132a handle_fatal_signal + 705^@
2   libsystem_c.dylib                   0x00007fff83d06cfa _sigtramp + 26^@
3   ???                                 0x0000000000000005 0x0 + 5^@
4   mysqld                              0x000000010ff112b6 _ZN14Rpl_info_table11do_get_infoEiPhmPKh + 88^@
5   mysqld                              0x000000010ff1693e _ZN16Rpl_info_handler8get_infoIPhEEbT_mS2_ + 48^@
6   mysqld                              0x000000010ff14019 _ZN12Slave_worker9read_infoEP16Rpl_info_handler + 419^@
7   mysqld                              0x000000010ff12b5e _ZN8Rpl_info9copy_infoEP16Rpl_info_handlerS1_ + 22^@
8   mysqld                              0x000000010ff121b7 _ZN16Rpl_info_factory17decide_repositoryEP8Rpl_infojPP16Rpl_info_handlerS4_PPKc + 267^@
9   mysqld                              0x000000010ff128dc _ZN16Rpl_info_factory13create_workerEjjP14Relay_log_infob + 404^@
10  mysqld                              0x000000010fefac59 _Z19mts_recovery_groupsP14Relay_log_info + 328^@
11  mysqld                              0x000000010fefa99e _Z19start_slave_threadsbbP11Master_infoi + 370^@
12  mysqld                              0x000000010fefa5ad _Z10init_slavev + 395^@
13  mysqld                              0x000000010fec6bcc _Z11mysqld_mainiPPc + 9289^@
14  mysqld                              0x000000010fca0174 start + 52^@
15  ???                                 0x0000000000000009 0x0 + 9^@

120822 14:56:15 mysqld_safe mysqld restarted
120822 14:56:15 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
120822 14:56:15 [Warning] Setting lower_case_table_names=2 because file system for /Users/gmax/sandboxes/rsandbox_5_6_6/node1/data/ is case insensitive
120822 14:56:15 [Note] Plugin 'FEDERATED' is disabled.
120822 14:56:15 InnoDB: The InnoDB memory heap is disabled
120822 14:56:15 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120822 14:56:15 InnoDB: Compressed tables use zlib 1.2.3
120822 14:56:15 InnoDB: CPU supports crc32 instructions
120822 14:56:15 InnoDB: Initializing buffer pool, size = 128.0M
120822 14:56:15 InnoDB: Completed initialization of buffer pool
120822 14:56:15 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 1664116
120822 14:56:15  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 1666146
120822 14:56:15  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
120822 14:56:16 InnoDB: 128 rollback segment(s) are active.
120822 14:56:16 InnoDB: Waiting for the background threads to start
120822 14:56:16 InnoDB: 1.2.6 started; log sequence number 1666146
120822 14:56:16 [Note] Recovering after a crash using mysql-bin
120822 14:56:16 [Note] Starting crash recovery...
120822 14:56:16 [Note] Crash recovery finished.

How to repeat:
1) enable parallel replication
stop slave;
set global master_info_repository='table';
set global relay_log_info_repository='table';
set global slave_parallel_workers=5;
start slave;
select count(*) from mysql.slave_worker_info;

2) check that replication is working fine
3) shut down the slave server
4) check the error log for graceful shutdown
5) start the slave server 
6) check the error log again

Thank you for the bug report. Please provide both configuration files (master/slave) private if you wish. Thanks.

This test was done using MySQL Sandbox.

Master config:

[mysql]
prompt='master [\h] {\u} (\d) > '
#

[client]
user            = msandbox
password        = msandbox
port            = 12831
socket          = /tmp/mysql_sandbox12831.sock

[mysqld]
user                            = gmax
port                            = 12831
socket                          = /tmp/mysql_sandbox12831.sock
basedir                         = /Users/gmax/opt/mysql/5.6.6
datadir                         = /Users/gmax/sandboxes/rsandbox_5_6_6/master/data
tmpdir                          = /Users/gmax/sandboxes/rsandbox_5_6_6/master/tmp
pid-file                        = /Users/gmax/sandboxes/rsandbox_5_6_6/master/data/mysql_sandbox12831.pid
#log-slow-queries               = /Users/gmax/sandboxes/rsandbox_5_6_6/master/data/msandbox-slow.log
#log                            = /Users/gmax/sandboxes/rsandbox_5_6_6/master/data/msandbox.log
#
# additional options passed through 'my_clause' 
#
log-bin=mysql-bin
server-id=1
log-error=msandbox.err

Slave config:

[mysql]
prompt='slave1 [\h] {\u} (\d) > '
#

[client]
user            = msandbox
password        = msandbox
port            = 12832
socket          = /tmp/mysql_sandbox12832.sock

[mysqld]
user                            = gmax
port                            = 12832
socket                          = /tmp/mysql_sandbox12832.sock
basedir                         = /Users/gmax/opt/mysql/5.6.6
datadir                         = /Users/gmax/sandboxes/rsandbox_5_6_6/node1/data
tmpdir                          = /Users/gmax/sandboxes/rsandbox_5_6_6/node1/tmp
pid-file                        = /Users/gmax/sandboxes/rsandbox_5_6_6/node1/data/mysql_sandbox12832.pid
#log-slow-queries               = /Users/gmax/sandboxes/rsandbox_5_6_6/node1/data/msandbox-slow.log
#log                            = /Users/gmax/sandboxes/rsandbox_5_6_6/node1/data/msandbox.log
#
# additional options passed through 'my_clause' 
#
server-id=101
report-host=SBslave1
report-port=12832
log-bin=mysql-bin

Thank you for the report.

Verified as described.

Fixed in 5.6.7 as a duplicate of an internally filed bug.