Bug #11366 mysqld process cores if cluster dies during SQL thread catch-up to master phase
Submitted: 15 Jun 2005 22:57
Modified: 14 Sep 2005 18:52
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.1.0-wl2325-wl1354-new
OS: Linux (Linux)
Assigned to: Tomas Ulin
CPU Architecture: Any

[15 Jun 2005 22:57] Jonathan Miller
Description:
Using a dump from the master, I restored the slave cluster, started the slave, and let it start processing from the master's binary log. The slave was 4800 seconds behind the master. While it was catching up, I crashed the cluster by killing both data nodes in one node group (the cluster had 4 data nodes in two node groups). When the cluster decided to shut down, the mysqld process dumped core.

Backtrace:
#0  0x4012cda1 in kill () from /lib/libc.so.6
#1  0x4005cf4a in pthread_kill () from /lib/libpthread.so.0
#2  0x0825c1d7 in write_core ()
#3  0x0816e8b7 in handle_segfault ()
#4  0x400605cd in __pthread_sighandler () from /lib/libpthread.so.0
#5  <signal handler called>
#6  0x08386cb0 in NdbTransaction::getNdbOperation(NdbTableImpl const*, NdbOperation*) ()
#7  0x08386dae in NdbTransaction::getNdbOperation(NdbDictionary::Table const*) ()
#8  0x0821c570 in ha_ndbcluster::write_row(char*) ()
#9  0x0820b53d in handler::ha_write_row(char*) ()
#10 0x081e5753 in Rows_log_event::exec_event(st_relay_log_info*) ()
#11 0x08252602 in exec_relay_log_event(THD*, st_relay_log_info*) ()
#12 0x082506ff in handle_slave_sql ()
#13 0x4005a6de in pthread_start_thread () from /lib/libpthread.so.0
(gdb) frame 7
#7  0x08386dae in NdbTransaction::getNdbOperation(NdbDictionary::Table const*) ()
(gdb) list
1       in ../sysdeps/i386/elf/start.S
(gdb) frame 8
#8  0x0821c570 in ha_ndbcluster::write_row(char*) ()
(gdb) list
1       in ../sysdeps/i386/elf/start.S
(gdb)

How to repeat:
1) set up master cluster
2) start bank test
3) dump cluster
4) restore dump on slave cluster
5) start the slave cluster
6) crash cluster.
[14 Sep 2005 18:52] Jonathan Miller
Tomas, I tried several different ways to reproduce this with the latest wl2325 clone, and could not. Closing it for now.
[jbm]