Bug #35003 Upgrading Slave from CGE6.1.23 to CGE6.2.12 crashes mysqld reading relay logs
Submitted: 3 Mar 2008 14:39 Modified: 8 Mar 2008 10:41
Reporter: Geert Vanderkelen
Status: Duplicate
Category: MySQL Cluster: Replication Severity: S2 (Serious)
Version: 5.1.23_6.2.12 OS: Linux
Assigned to: Assigned Account CPU Architecture: Any

[3 Mar 2008 14:39] Geert Vanderkelen
Description:
After upgrading the Slave cluster to CGE 6.2.12 and restarting mysqld on the new version, starting the slave makes mysqld segfault.

How to repeat:
Reproducible with the privately provided relay logs.

* Master and Slave clusters running CGE 6.1.23
* Upgrade the Slave cluster to CGE 6.2.12
* Put the relay logs in place and make sure mysqld is reading them
* Starting replication with START SLAVE gives the following:

080303 13:22:03 [Note] NDB: NodeID is 5, management server 'ndbsup-priv-2:1406'
080303 13:22:04 [Note] NDB[0]: no storage nodes connected (timed out)
080303 13:22:04 [Note] Starting MySQL Cluster Binlog Thread
080303 13:22:04 [Note] Event Scheduler: Loaded 0 events
080303 13:22:04 [Note] /data1/mysql/5.1.23_6.2.12/bin/mysqld: ready for connections.
Version: '5.1.23-ndb-6.2.12-telco' socket: '/tmp/mysql_geert.sock' port: 3377 Source distribution
080303 13:22:07 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_schema
080303 13:22:07 [Note] NDB Binlog: logging ./mysql/ndb_schema (UPDATED,USE_WRITE)
080303 13:22:07 [Note] NDB Binlog: DISCOVER TABLE Event: REPL$mysql/ndb_apply_status
080303 13:22:07 [Note] NDB Binlog: logging ./mysql/ndb_apply_status (UPDATED,USE_WRITE)
2008-03-03 13:22:07 [NdbApi] INFO -- Flushing incomplete GCI:s < 207290/12
080303 13:22:07 [Note] NDB Binlog: starting log at epoch 207290/12
080303 13:22:07 [Note] NDB Binlog: ndb tables writable

mysqld> START SLAVE;

080303 13:28:21 [Note] Slave SQL thread initialized, starting replication in log 'Dist-0_111_binlog.000001' at position 74359526, relay log './Dist-0_211_relaylog.000002' position: 255
080303 13:28:21 [Note] Slave I/O thread: connected to master 'repl@ndbsup-1.mysql.com:3377',replication started in log 'master-log.000001' at position 222
080303 13:28:21 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=1
max_threads=151
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338206 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xd85e20
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x40bdb0e8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
(nil)
New value of fp=0xd85e20 failed sanity check, terminating stack trace!

A backtrace from a core file (a different test case from the one above):

#0 0x0000002a95787737 in pthread_kill () from /lib64/tls/libpthread.so.0
(gdb) bt
#0 0x0000002a95787737 in pthread_kill () from /lib64/tls/libpthread.so.0
#1 0x000000000076d976 in write_core (sig=11) at stacktrace.c:240
#2 0x000000000060ea2c in handle_segfault (sig=11) at mysqld.cc:2314
#3 <signal handler called>
#4 unpack_row_old (rli=0xed24a0, table=0xed0840, colcnt=49, record=0xe93210 "n", row=0xef7330 "\005ÿo", cols=0xf2d678, row_end=0x40c1be20, master_reclength=0xf2d6c0, rw_set=0xed0948, event_type=PRE_GA_WRITE_ROWS_EVENT) at field.h:290
#5 0x00000000006e6fcc in Write_rows_log_event_old::do_prepare_row (this=Variable "this" is not available.
) at log_event_old.cc:976
#6 0x00000000006e53b5 in Old_rows_log_event::do_apply_event (this=0xf2d720, ev=0xf2d620, rli=0xed24a0) at log_event_old.cc:230
#7 0x000000000075be4d in exec_relay_log_event (thd=0xed6ba0, rli=0xed24a0) at log_event.h:950
#8 0x000000000075c9f1 in handle_slave_sql (arg=Variable "arg" is not available.
) at slave.cc:2557
#9 0x0000002a9578410a in start_thread () from /lib64/tls/libpthread.so.0
#10 0x0000002a95ede8c3 in clone () from /lib64/tls/libc.so.6
#11 0x0000000000000000 in ?? ()
(gdb) bt full
#0 0x0000002a95787737 in pthread_kill () from /lib64/tls/libpthread.so.0
No symbol table info available.
#1 0x000000000076d976 in write_core (sig=11) at stacktrace.c:240
No locals.
#2 0x000000000060ea2c in handle_segfault (sig=11) at mysqld.cc:2314
curr_time = 1204182073
tm = {tm_sec = 13, tm_min = 1, tm_hour = 9, tm_mday = 28, tm_mon = 1, tm_year = 108, tm_wday = 4, tm_yday = 58, tm_isdst = 0, tm_gmtoff = 7200, tm_zone = 0xde9150 "EET"}
thd = (class THD *) 0xed6ba0
#3 <signal handler called>
No symbol table info available.
#4 unpack_row_old (rli=0xed24a0, table=0xed0840, colcnt=49, record=0xe93210 "n", row=0xef7330 "\005ÿo", cols=0xf2d678, row_end=0x40c1be20, master_reclength=0xf2d6c0, rw_set=0xed0948, event_type=PRE_GA_WRITE_ROWS_EVENT) at field.h:290
fptr = (class Field **) 0xe935d0
__PRETTY_FUNCTION__ = "int unpack_row_old(Relay_log_info*, TABLE*, uint, uchar*, const uchar*, const MY_BITMAP*, const uchar**, ulong*, MY_BITMAP*, Log_event_type)"
#5 0x00000000006e6fcc in Write_rows_log_event_old::do_prepare_row (this=Variable "this" is not available.
) at log_event_old.cc:976
error = Variable "error" is not available.
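
Side note on reading the trace: the curr_time value in frame #2 is a Unix timestamp, and it can be cross-checked against the tm struct printed next to it. The following is a minimal sketch in Python, assuming the 7200-second offset shown in tm_gmtoff; the variable names are illustrative and not taken from the server source.

```python
from datetime import datetime, timedelta, timezone

# Values copied from frame #2 (handle_segfault) of the "bt full" output.
curr_time = 1204182073   # seconds since the Unix epoch
tm_gmtoff = 7200         # offset from UTC in seconds, labelled "EET" in the trace

# Build a fixed-offset zone from tm_gmtoff and convert the timestamp.
eet = timezone(timedelta(seconds=tm_gmtoff), "EET")
crash_time = datetime.fromtimestamp(curr_time, tz=eet)

print(crash_time.isoformat())
```

The result falls on 28 February 2008, several days before the 3 March log excerpt, which fits the note that the core comes from a different test case.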
[8 Mar 2008 10:41] Andrei Elkin
This appears to be a duplicate of Bug #31581.