Bug #18622 4R: Data node fails in SimulatedBlock.cpp:211 during stress testing
Submitted: 29 Mar 2006 17:09
Modified: 24 Apr 2006 16:21
Reporter: Jonathan Miller
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 5.1.9
OS: Linux (Linux 32 Bit OS)
Assigned to: Jonas Oreland
CPU Architecture: Any

[29 Mar 2006 17:09] Jonathan Miller
Description:
I had 3 different crashes at the same time during stress testing with the cid_ndb_dd.pl script: different mysqld crashes and a data node failure. I am opening 3 separate bug reports, but they may all be related. Please see:
http://bugs.mysql.com/bug.php?id=18621

Time: Wednesday 29 Mars 2006 - 18:31:39
Status: Temporary error, restart node
Message: Send signal error (Internal error, programming error or missing error message, please report a bug)
Error: 2339
Error data: Signal (GSN: 316, Length: 26, Rec Block No: 247)
Error object: SimulatedBlock.cpp:211
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 14587
Trace: /space/run/ndb_4_trace.log.1
Version: Version 5.1.9 (beta)
***EOM***
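
The error report above is a flat list of "Key: value" lines terminated by ***EOM***, so it can be turned into a dictionary for triage or for comparing reports across nodes. The following is a minimal sketch, not an official tool: the function names parse_error_report and parse_error_data are hypothetical, and the field layout is assumed from this one report only.

```python
import re

# Error report copied verbatim from this bug.
REPORT = """\
Time: Wednesday 29 Mars 2006 - 18:31:39
Status: Temporary error, restart node
Message: Send signal error (Internal error, programming error or missing error message, please report a bug)
Error: 2339
Error data: Signal (GSN: 316, Length: 26, Rec Block No: 247)
Error object: SimulatedBlock.cpp:211
Program: /home/ndbdev/jmiller/builds/libexec/ndbd
Pid: 14587
Trace: /space/run/ndb_4_trace.log.1
Version: Version 5.1.9 (beta)
***EOM***
"""

def parse_error_report(text):
    """Split each 'Key: value' line at the first ': ' (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("***EOM***"):
            break  # end-of-message marker, nothing after it belongs to the report
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def parse_error_data(value):
    """Extract the numeric signal details from the 'Error data' field."""
    m = re.search(r"GSN: (\d+), Length: (\d+), Rec Block No: (\d+)", value)
    if m is None:
        return None
    gsn, length, block = (int(g) for g in m.groups())
    return {"gsn": gsn, "length": length, "rec_block_no": block}

fields = parse_error_report(REPORT)
signal = parse_error_data(fields["Error data"])
print(fields["Error object"], signal)
```

Here the failing signal has GSN 316 and receiving block 247, which the trace below labels "LQHKEYREQ" and "DBLQH" respectively.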

--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342839 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00007d3a H'00000002 H'00600600 H'4954494e H'535f4c41 H'20455a49 H'30303031
 H'20200a4d H'20202020 H'20202020 H'20202020 H'20202020 H'20202020 H'45202020
 H'4e49474e H'444e3d45 H'00000042 H'00000000 H'00000000 H'00000000 H'00000000
 H'00000000 H'00000000 H'00000000 H'00000000
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342838 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00007d3a H'00000002 H'00600600 H'7461642f H'6c696661 H'61642e65 H'200a2774
 H'20202020 H'20202020 H'20202020 H'20202020 H'20202020 H'20202020 H'20455355
 H'46474f4c H'20454c49 H'554f5247 H'676c2050 H'20200a31 H'20202020 H'20202020
 H'20202020 H'20202020 H'20202020 H'49202020
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342837 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00007d3a H'00000002 H'00600600 H'000000dd H'00000000 H'41455243 H'54204554
 H'454c4241 H'43415053 H'45542045 H'52455453 H'53545f32 H'2020200a H'20202020
 H'20202020 H'20202020 H'20202020 H'20202020 H'44412020 H'41442044 H'49464154
 H'2720454c H'65742f2e H'72657473 H'73745f32
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342836 gsn: 4 "ATTRINFO" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 25 trace: 1 #sec: 0 fragInf: 0
 H'00007d3a H'00000002 H'00600600 H'0053545f H'00020020 H'00000780 H'00000000
 H'00000000 H'00000000 H'00000000 H'00000000 H'00000000 H'00000000 H'00040004
 H'00000006 H'00050008 H'00000000 H'00000000 H'00060004 H'00000000 H'00070004
 H'00000000 H'00080004 H'00000008 H'00030108
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342835 gsn: 316 "LQHKEYREQ" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 23 trace: 1 #sec: 0 fragInf: 0
 ClientPtr = H'00007d3a hashValue = H'fe7141dd tcBlockRef = H'00f50003
 transId1 = H'00000002 transId2 = H'00600600 savePointId = H'00000000
 Op: 4 Lock: 0 Flags: CommitAckMarker NoDisk ScanInfo/noFiredTriggers: H'0
 AttrLen: 93 (5 in this) KeyLen: 4 TableId: 2 SchemaVer: 1
 FragId: 2 ReplicaNo: 0 LastReplica: 3 NextNodeId: 5
 ApiRef: H'80060006 ApiOpRef: H'00000024 NextNodeId2: 2 NextNodeId3: 3
 KeyInfo: H'00000000 H'5345540a H'32524554 H'0053545f
 AttrInfo: H'00000001 H'00000000 H'0001000b H'5345540a H'32524554
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 4, r.sigId: 1342834 gsn: 316 "LQHKEYREQ" prio: 1
s.bn: 245 "DBTC", s.proc: 3, s.sigId: -1 length: 18 trace: 1 #sec: 0 fragInf: 0
 ClientPtr = H'00007d38 hashValue = H'fe7141dd tcBlockRef = H'00f50003
 transId1 = H'00000002 transId2 = H'00600600 savePointId = H'00000000
 Op: 0 Lock: 0 Flags: NoDisk ScanInfo/noFiredTriggers: H'0
 AttrLen: 1 (1 in this) KeyLen: 4 TableId: 2 SchemaVer: 1
 FragId: 2 ReplicaNo: 0 LastReplica: 0 NextNodeId: 5
 ApiRef: H'80060006 ApiOpRef: H'0000002c
 KeyInfo: H'00000000 H'5345540a H'32524554 H'0053545f
 AttrInfo: H'00030000
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 4, r.sigId: 1342833 gsn: 409 "TIME_SIGNAL" prio: 1
s.bn: 252 "QMGR", s.proc: 4, s.sigId: 1342832 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 252 "QMGR", r.proc: 4, r.sigId: 1342832 gsn: 164 "CONTINUEB" prio: 0
s.bn: 252 "QMGR", s.proc: 4, s.sigId: 1342830 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 4, r.sigId: 1342831 gsn: 164 "CONTINUEB" prio: 0
s.bn: 253 "NDBFS", s.proc: 4, s.sigId: 1342829 length: 1 trace: 0 #sec: 0 fragInf: 0
 Scanning the memory channel every 10ms
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 4, r.sigId: 1342828 gsn: 409 "TIME_SIGNAL" prio: 1
s.bn: 252 "QMGR", s.proc: 4, s.sigId: 1342824 length: 1 trace: 0 #sec: 0 fragInf: 0
 H'00000004
--------------- Signal ----------------
r.bn: 245 "DBTC", r.proc: 4, r.sigId: 1342827 gsn: 409 "TIME_SIGNAL" prio: 1
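
The H'xxxxxxxx words in the ATTRINFO signals above are mostly ASCII text, and decoding them shows which statement was in flight when the node failed. The sketch below assumes each word is a 32-bit value dumped on a little-endian host, so the bytes must be reversed within each word. It also assumes that a block reference such as tcBlockRef packs the block number into the high 16 bits and the node id into the low 16 bits; that layout is inferred from this trace, where H'00f50003 lines up with s.bn 245 and s.proc 3. The helper names decode_words and split_block_ref are hypothetical.

```python
import struct

def decode_words(dump):
    """Decode "H'xxxxxxxx" tokens as 32-bit little-endian words (assumption)."""
    words = [int(tok[2:], 16) for tok in dump.split() if tok.startswith("H'")]
    raw = b"".join(struct.pack("<I", w) for w in words)
    # Show non-printable bytes as '.' so binary header words stay visible.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

def split_block_ref(ref):
    """Split a block reference into (block number, node id); layout assumed."""
    return ref >> 16, ref & 0xFFFF

# Seven consecutive words from the ATTRINFO signal with sigId 1342837.
sample = ("H'41455243 H'54204554 H'454c4241 H'43415053 "
          "H'45542045 H'52455453 H'53545f32")

print(decode_words(sample))         # CREATE TABLESPACE TESTER2_TS
print(split_block_ref(0x00F50003))  # (245, 3)
```

The first three words of each ATTRINFO signal (H'00007d3a H'00000002 H'00600600) repeat the ClientPtr, transId1 and transId2 values of the LQHKEYREQ with sigId 1342835, so they are best skipped before decoding the remaining words as text.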

How to repeat:
Run cid_ndb_dd.pl in a 4-host, 4-data-node, 4-replica configuration.
[31 Mar 2006 20:55] Serge Kozlov
I got the same issue.
The cluster configuration has 4 ndbd/4 replicas, 1 mgmd, 1 mysqld, and 1 api. All nodes are running on one box. I ran ./load_tpcb.pl ndb16 3306 root BLANK ndb and got that error: the first (master) node stopped, but load_tpcb still works ...
[13 Apr 2006 6:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4899
[21 Apr 2006 14:42] Tomas Ulin
Pushed to 5.1.10.
[23 Apr 2006 5:49] Jonas Oreland
Node crash when doing an insert/update on a table > 128 bytes using 4 replicas.
[24 Apr 2006 16:21] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 5.1.10 changelog. Closed.