Bug #40993 NDBD crashes during a tablespace addition
Submitted: 24 Nov 2008 19:43
Modified: 19 Feb 2009 10:45
Reporter: Matthew Schlegel
Status: Duplicate
Category: MySQL Cluster: Disk Data
Severity: S1 (Critical)
Version: mysql-5.1.27 ndb-6.3.17-RC
OS: Linux (Linux zabbix2-hc.gxt.com 2.6.18-53.el5 #1 SMP Mon Nov 12 02:14:55 EST 2007 x86_64 x86_64 x86_64 GNU/)
Assigned to:
CPU Architecture: Any
Tags: ndbd

[24 Nov 2008 19:43] Matthew Schlegel
Description:
While adding data files to a tablespace, one of the ndbd nodes in the cluster crashed. Restarting the crashed ndbd did not bring the node back online, and after multiple restart attempts the remaining ndbd also crashed and will not restart.

How to repeat:
alter TABLESPACE zabbix_tablespace_1 add datafile 'zabbix_tblspace_1-pt3.dat' initial_size 2G engine ndb;
[24 Nov 2008 19:45] Matthew Schlegel
trace log from node

Attachment: ndb_6_trace.zip (application/x-zip-compressed, text), 46.93 KiB.

[24 Nov 2008 19:47] Matthew Schlegel
cluster configuration file

Attachment: config.ini (, text), 996 bytes.

[24 Nov 2008 19:48] Matthew Schlegel
cluster log from one startup attempt on one of the nodes

Attachment: cluster-log.txt (text/plain), 2.27 KiB.

[25 Nov 2008 20:29] Michael Senizaiz
ndb_6_*.log

Attachment: ndb_6_logs.tgz (application/octet-stream, text), 2.52 KiB.

[25 Nov 2008 20:30] Michael Senizaiz
MySQL Version:

mysql  Ver 14.14 Distrib 5.1.27-ndb-6.3.17, for redhat-linux-gnu (x86_64) using readline 5.1
[25 Nov 2008 20:32] Michael Senizaiz
The log shows it going back and forth between block 245 (DBTC) and block 246 (DBDIH) until the log ends.

--------------- Signal ----------------
r.bn: 246 "DBDIH", r.proc: 6, r.sigId: 221776 gsn: 327 "NDB_STTOR" prio: 1
s.bn: 251 "NDBCNTR", s.proc: 6, s.sigId: 221775 length: 22 trace: 0 #sec: 0 fragInf: 0
 H'00fb0006 H'00000006 H'00000002 H'00000001 H'00000006 H'00000002 H'88776655
 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655
 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655 H'88776655
 H'88776655
--------------- Signal ----------------
r.bn: 251 "NDBCNTR", r.proc: 6, r.sigId: 221775 gsn: 328 "NDB_STTORRY" prio: 1
s.bn: 245 "DBTC", s.proc: 6, s.sigId: 221774 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00f50006
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 6, r.sigId: 221774 gsn: 236 "DISEIZECONF" prio: 1
s.bn: 246 "DBDIH", s.proc: 6, s.sigId: 221773 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'0001100f H'00008807
--------------- Signal ----------------
r.bn: 246 "DBDIH", r.proc: 6, r.sigId: 221773 gsn: 238 "DISEIZEREQ" prio: 1
s.bn: 245 "DBTC", s.proc: 6, s.sigId: 221772 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'0001100f H'00f50006
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 6, r.sigId: 221772 gsn: 236 "DISEIZECONF" prio: 1
s.bn: 246 "DBDIH", s.proc: 6, s.sigId: 221771 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'0001100e H'00008806
--------------- Signal ----------------
r.bn: 246 "DBDIH", r.proc: 6, r.sigId: 221771 gsn: 238 "DISEIZEREQ" prio: 1
s.bn: 245 "DBTC", s.proc: 6, s.sigId: 221770 length: 2 trace: 0 #sec: 0 fragInf: 0
 H'0001100e H'00f50006
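
One way to confirm the back-and-forth pattern over a whole trace file, rather than by eye, is to tally sender/receiver block pairs. The following is a minimal sketch, assuming the trace keeps the layout shown above (an "r.bn:" line followed by an "s.bn:" line for each signal). The names blockName and countSignalHops are hypothetical helpers, not part of any NDB tool.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Extract the first quoted name after a "r.bn:" or "s.bn:" tag,
// e.g. `r.bn: 246 "DBDIH", ...` yields "DBDIH". Returns "" if not found.
static std::string blockName(const std::string& line, const std::string& tag) {
    assert(!tag.empty());  // precondition: a tag to search for
    std::size_t pos = line.find(tag);
    if (pos == std::string::npos) return "";
    std::size_t open = line.find('"', pos);
    if (open == std::string::npos) return "";
    std::size_t close = line.find('"', open + 1);
    if (close == std::string::npos) return "";
    return line.substr(open + 1, close - open - 1);
}

// Count (sender, receiver) block pairs in a trace dump.
// Assumes each signal is printed as an "r.bn:" line, then an "s.bn:" line.
std::map<std::pair<std::string, std::string>, int>
countSignalHops(const std::string& trace) {
    std::map<std::pair<std::string, std::string>, int> hops;
    std::istringstream in(trace);
    std::string line;
    std::string receiver;
    while (std::getline(in, line)) {
        if (line.rfind("r.bn:", 0) == 0) {
            // Remember the receiver until the matching sender line arrives.
            receiver = blockName(line, "r.bn:");
        } else if (line.rfind("s.bn:", 0) == 0 && !receiver.empty()) {
            ++hops[std::make_pair(blockName(line, "s.bn:"), receiver)];
            receiver.clear();
        }
    }
    return hops;
}
```

Feeding the file contents to countSignalHops and printing the map would show how many signals went from DBTC to DBDIH and back, which makes it easier to tell a tight two-block exchange from ordinary startup traffic.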
[28 Jan 2009 19:50] Hartmut Holzgraefe
Fails in Dbdict::restartDropObj, at the ndbrequire on the line marked with '>' (3608):

  3563 /**
  3564  * Drop object during NR/SR
  3565  */
  3566 void
  3567 Dbdict::restartDropObj(Signal* signal,
  3568                        Uint32 tableId,
  3569                        const SchemaFile::TableEntry * entry)
  3570 {
  ....
  3593   ndbout_c("Dropping %d %d", tableId, entry->m_tableType);
  3594   switch(entry->m_tableType){
  3595   case DictTabInfo::Tablespace:
  3596   case DictTabInfo::LogfileGroup:{
  3597     jam();
  3598     Ptr<Filegroup> fg_ptr;
  3599     ndbrequire(c_filegroup_hash.find(fg_ptr, tableId));
  3600     dropObjPtr.p->m_obj_ptr_i = fg_ptr.i;
  3601     dropObjPtr.p->m_vt_index = 3;
  3602     break;
  3603   }
  3604   case DictTabInfo::Datafile:{
  3605     jam();
  3606     Ptr<File> file_ptr;
  3607     dropObjPtr.p->m_vt_index = 2;
> 3608     ndbrequire(c_file_hash.find(file_ptr, tableId));
  3609     dropObjPtr.p->m_obj_ptr_i = file_ptr.i;
  3610     break;
  3611   }
  ....
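
To make the failure condition concrete: the marked ndbrequire fires when the schema file lists a Datafile object whose id is not present in the in-memory file hash. The following is a simplified stand-alone model of that lookup, not NDB kernel code and not a proposed fix. FileRecord, ObjHash, DropPlan, planRestartDrop and requireFound are hypothetical names; only the vt_index values 3 and 2 come from the excerpt above.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for the kernel records and the
// c_filegroup_hash / c_file_hash lookups in the excerpt.
struct FileRecord { std::uint32_t ptrI; };
using ObjHash = std::unordered_map<std::uint32_t, FileRecord>;

enum class ObjType { Tablespace, LogfileGroup, Datafile };

struct DropPlan {
    bool found;             // false is the case where ndbrequire would fire
    std::uint32_t objPtrI;  // models m_obj_ptr_i
    std::uint32_t vtIndex;  // models m_vt_index
};

// Mirror the switch in restartDropObj, but report a miss instead of aborting.
DropPlan planRestartDrop(const ObjHash& filegroups, const ObjHash& files,
                         std::uint32_t objId, ObjType type) {
    const bool isGroup = (type != ObjType::Datafile);
    const ObjHash& hash = isGroup ? filegroups : files;
    const std::uint32_t vtIndex = isGroup ? 3u : 2u;
    auto it = hash.find(objId);
    if (it == hash.end()) {
        // Schema entry exists, but no matching in-memory object.
        return DropPlan{false, 0u, vtIndex};
    }
    return DropPlan{true, it->second.ptrI, vtIndex};
}

// Models the hard stop: a failed requirement terminates the process.
void requireFound(const DropPlan& plan) {
    assert(plan.found && "object id missing from in-memory hash");
}
```

In this model, a restart that replays a drop for a data file that was never registered in the hash yields found == false, which is the state in which the real code stops the node.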
[19 Feb 2009 10:45] Jonas Oreland
This is a duplicate of Bug #36702 (or the crash is in the equivalent ndbrequire in the same function).

Closing this as a duplicate.