Bug #74444 mysqld crashes as soon as replication starts
Submitted: 20 Oct 2014 7:17 Modified: 3 Jan 2015 12:16
Reporter: Eugene Zheganin Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.6.21 OS:Solaris
Assigned to: CPU Architecture:Any
Tags: replication

[20 Oct 2014 7:17] Eugene Zheganin
Description:
mysqld crashes as soon as the replication starts.
this is 100% reproducible. mysqld keeps crashing after this, until slave is stopped.

log with backtrace:

===Cut===
07:08:02 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=16777216
read_buffer_size=16777216
max_used_connections=5
max_threads=1024
thread_count=3
connection_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 17044936 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xb686a848
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
/usr/local/mysql/bin/mysqld'my_print_stacktrace+0x2a [0xcd931c]
/usr/local/mysql/bin/mysqld'handle_fatal_signal+0x3b0 [0x9fbf98]
/lib/amd64/libc.so.1'__sighndlr+0x6 [0xffff80ffbf445556]
/lib/amd64/libc.so.1'call_user_handler+0x2ce [0xffff80ffbf4389da]
/usr/local/mysql/bin/mysqld'_Z10free_blobsP5TABLE+0xf [0xb31d77] [Signal 11 (SEGV)]
/usr/local/mysql/bin/mysqld'_ZN14Relay_log_info20clear_tables_to_lockEv+0x92 [0xcb4112]
/usr/local/mysql/bin/mysqld'_ZN14Relay_log_info25slave_close_thread_tablesEP3THD+0xec [0xcb425a]
/usr/local/mysql/bin/mysqld'_ZN14Relay_log_info15cleanup_contextEP3THDb+0xc9 [0xcb3ffd]
/usr/local/mysql/bin/mysqld'_ZL23rows_event_stmt_cleanupPK14Relay_log_infoP3THD+0x71 [0xc71ffe]
/usr/local/mysql/bin/mysqld'_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xd92 [0xc71e6a]
/usr/local/mysql/bin/mysqld'_ZN9Log_event11apply_eventEP14Relay_log_info+0x23a [0xc62100]
/usr/local/mysql/bin/mysqld'_Z26apply_event_and_update_posPP9Log_eventP3THDP14Relay_log_info+0x1b9 [0xca3dda]
/usr/local/mysql/bin/mysqld'_ZL20exec_relay_log_eventP3THDP14Relay_log_info+0x2c0 [0xca4aaf]
/usr/local/mysql/bin/mysqld'handle_slave_sql+0x73d [0xca870c]
/usr/local/mysql/bin/mysqld'pfs_spawn_thread+0x156 [0xd1f579]
/lib/amd64/libc.so.1'_thrp_setup+0xa5 [0xffff80ffbf4451f1]
/lib/amd64/libc.so.1'_lwp_start+0x0 [0xffff80ffbf445490]
Please read http://dev.mysql.com/doc/refman/5.1/en/resolve-stack-dump.html
and follow instructions on how to resolve the stack trace.
Resolved stack trace is much more helpful in diagnosing the
problem, so please do resolve it

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 10570
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
141020 11:08:05 mysqld_safe mysqld restarted
===Cut===

I've tried the proposed procedure to resolve stack dump, but it seems like this script is Linux-obly, because mangled symbols remained.

How to repeat:
get a mysql 5.6.21, try to set up a replication on it.
[20 Oct 2014 7:19] Eugene Zheganin
5.6.10 and 5.6.16 work just fine.
[20 Oct 2014 7:59] Eugene Zheganin
Downgrading to 5.6.16 helped, without any actions with data.
[20 Oct 2014 8:33] MySQL Verification Team
Probably duplicate of:
Bug 19724470 - CRASH WHEN RECEIVE OLD-STYLE ROW EVENTS FROM TIMESTAMP COLUMNS 
Bug 19704825 - TEMPORARY SLAVE TYPE CONVERSION TABLES RELEASED TO EARLY
[20 Oct 2014 11:51] Daniël van Eeden
You could try to create a core file:
Add 'core-file' to my.cnf and set core size: ulimit -c unlimited

Then you can create a stacktrace with mdb or dbx.

In mdb:
$G (Enable C++ symbol demangling)
$C (show stack)

In dbx: where

I assume the 5.1 url in your message is actually coming from 5.6?
[20 Oct 2014 11:52] Daniël van Eeden
dbx is part of Oracle Solaris Studio.

Are you running Solaris 10 or 11? Which exact release? (/etc/release)
[20 Oct 2014 12:11] Eugene Zheganin
1) yes, a reference to 5.1 comes in/from 5.6.x installation and I find it weird indeed.
2) I'm running Solaris x86 11.2 version. 
3) Mysql was built with gcc:

# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/gcc/4.5/lib/gcc/i386-pc-solaris2.11/4.5.2/lto-wrapper
Target: i386-pc-solaris2.11
Configured with: /builds/hudson/workspace/nightly-update/build/i386/components/gcc45/gcc-4.5.2/configure CC=/ws/on11update-tools/SUNWspro/sunstudio12.1/bin/cc CXX=/ws/on11update-tools/SUNWspro/sunstudio12.1/bin/CC --prefix=/usr/gcc/4.5 --mandir=/usr/gcc/4.5/share/man --bindir=/usr/gcc/4.5/bin --libdir=/usr/gcc/4.5/lib --sbindir=/usr/gcc/4.5/sbin --infodir=/usr/gcc/4.5/share/info --libexecdir=/usr/gcc/4.5/lib --enable-languages=c,c++,fortran,objc --enable-shared --with-gmp-include=/usr/include/gmp --with-mpfr-include=/usr/include/mpfr --without-gnu-ld --with-ld=/usr/bin/ld --with-gnu-as --with-as=/usr/gnu/bin/as CFLAGS='-g -O2 '
Thread model: posix
gcc version 4.5.2 (GCC)

Thanks for providing info on how to take a normal backtrace from core.
But the thing is, this server now is running a production installation. I will need some days to recreate the situation on a panicbox. Then I will get the core file and a bt. Thanks.
[20 Oct 2014 19:51] Daniël van Eeden
The version number in the message is fixed with this commit:

commit ca9785e963a5d2881f46c3528548243b40ab9729
Author: Jon Olav Hauglid <jon.hauglid@oracle.com>
Date:   Mon Oct 14 15:06:30 2013 +0200

    Bug#17465503: STACKTRACE CODE REFERENCES OLD MANUAL LINKS
    
    This patch changes the strack trace error log message
    on Solaris so that it doesn't statically refer to a 5.1
    web page, but instead uses MYSQL_VERSION_MAJOR and
    MYSQL_VERSION_MINOR to refer to the web page matching
    the current release.

This is part of 5.7.3-m13, but is not in the 5.6 branch.

https://github.com/mysql/mysql-server/blob/5.6/mysys/stacktrace.c#L171
[16 Nov 2014 9:49] mahmoud tuweiq
i think this related to binary log format 
master version: 5.6.21 binary log format is MIXED
slave version: 5.6.21 binary log format is STATEMENT

when i start the replication is toped because of binary log format
 see the result of " show slave status :

"
mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: souq-read6.ciht5im6msyg.eu-west-1.rds.amazonaws.com
                  Master_User: slaveUSR
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin-changelog.000812
          Read_Master_Log_Pos: 26722634
               Relay_Log_File: masterdb-relay-bin.000002
                Relay_Log_Pos: 971
        Relay_Master_Log_File: mysql-bin-changelog.000533
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1666
                   Last_Error: Error executing row event: 'Cannot execute statement: impossible to write to binary log since statement is in row format and BINLOG_FORMAT = STATEMENT.'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 6203802
              Relay_Log_Space: 5470197180
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1666
               Last_SQL_Error: Error executing row event: 'Cannot execute statement: impossible to write to binary log since statement is in row format and BINLOG_FORMAT = STATEMENT.'
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 129519433
                  Master_UUID: 614b6b24-662c-11e4-a26b-22000a4c06d0
             Master_Info_File: /data/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 141116 09:14:05
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)
"

so , to change binary format i did the following 

then when i start the slave  mysql is gone way 

 show slave status \G
show global variables like 'bin%';

ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
ERROR: 
Can't connect to the server

and  work again but binary log  switch back to statement 
this following is the result  of binary log format 

mysql>  show global variables like 'binlog_format';
+---------------+-----------+
| Variable_name | Value     |
+---------------+-----------+
| binlog_format | STATEMENT |
+---------------+-----------+
1 row in set (0.00 sec)
[16 Nov 2014 9:54] mahmoud tuweiq
this how i change the binary log format on slave 
stop slave ;
FLUSH TABLES WITH READ LOCK;
FLUSH LOGS;
SET GLOBAL binlog_format = 'MIXED';
FLUSH LOGS;
UNLOCK TABLES;           
 

mysql>  show global variables like 'binlog_format' ;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | MIXED |
+---------------+-------+
1 row in set (0.00 sec)

mysql> 
mysql> 
mysql> start slave ; 
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like 'binlog_format' ;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
ERROR: 
Can't connect to the server

mysql> show global variables like 'binlog_format' ;
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)
ERROR: 
Can't connect to the server

mysql> show global variables like 'binlog_format' ;
No connection. Trying to reconnect...
Connection id:    4
Current database: *** NONE ***

+---------------+-----------+
| Variable_name | Value     |
+---------------+-----------+
| binlog_format | STATEMENT |
+---------------+-----------+
1 row in set (0.00 sec)

 

 show global variables like 'binlog_format' ;
[16 Nov 2014 16:17] Daniël van Eeden
Are both master and slave running 5.6.21?

Could you post your config? At least master_info_repository, log_slave_updates (will be ON), slave_parallel_workers and master_verify_checksum.
[16 Nov 2014 16:29] Daniël van Eeden
Could you try with a different version? (5.6.10, 5.7.5-m15, etc)
Did you install from binaries or from a tarball?
[17 Nov 2014 11:56] mahmoud tuweiq
both of master and slave version is 5.6.21 
what i did is downgrade the slave to 5.6.20 
then rebuild the replication and that
make it working fine
[17 Nov 2014 14:34] MySQL Verification Team
It's a known problem.   The problem is the servers have different physical formats for tables with temporal columns.  Maybe one server was created via mysqldump and another was upgraded from 5.5...
[3 Dec 2014 10:09] MySQL Verification Team
Please try 5.6.22 and let us know if that works.  It has a bugfix I think would solve this.
[3 Dec 2014 12:12] Eugene Zheganin
About servers running different versions - yes, that was exactly my situation. I was upgrading my installation and I was restoring mysql dump from 5.5 on a 5.6 server, and then enabling replication from 5.5 master. But this is still a bug, right ?
[3 Dec 2014 12:16] MySQL Verification Team
On http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-22.html search for Bug #18770469, Bug #19704825 that is the one I believe you encountered.
[4 Jan 2015 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".