Bug #26476 DUMP 2502 (TcDumpAllScanRec) dumps the last record twice
Submitted: 19 Feb 2007 8:13
Reporter: Hartmut Holzgraefe
Status: Verified
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: mysql-5.1
OS: Linux (linux)
Assigned to:
CPU Architecture: Any
Tags: 5.1bk

[19 Feb 2007 8:13] Hartmut Holzgraefe
Description:
When doing a DUMP 2502, the last record is logged twice in the debug output. The same is true for the other TcDumpAll... codes.

  Node 2: TC: Dump all ScanRecord - size: 256
  Node 2: Dbtc::ScanRecord[1]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=2
  Node 2: Dbtc::ScanRecord[2]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=3
  [...]
  Node 2: Dbtc::ScanRecord[254]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=255
  Node 2: Dbtc::ScanRecord[255]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=-256
  Node 2: Dbtc::ScanRecord[255]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=-256

How to repeat:
Do a DUMP 2502 on a running cluster and check the output written to the management node's log.

Suggested fix:
This happens because the DUMP code in the signal is altered during processing, and the various DUMP codes are checked in a cascade of independent if() statements. Because processing continues after the last record has been dumped, the if() block for the altered "dump one record" code is hit and the last record is dumped again.

Quick fix: add a "return" in the "we're done" code path

Once-and-for-all fix: change the dump handlers to use switch/case, or at least if/else if/else, for both maintainability and performance reasons.
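
The fall-through described above can be reproduced in a small stand-alone model. This is only a sketch, not the NDB source: the Signal and Block types, the run() helper, and the numeric value of kDumpOneScanRec are hypothetical, and only the code 2502 comes from this report. "Sending the signal to self" is modelled by a pending flag and a driver loop.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// 2502 is taken from the report; the "dump one" value is arbitrary here.
constexpr uint32_t kDumpAllScanRec = 2502;
constexpr uint32_t kDumpOneScanRec = 1;  // hypothetical stand-in

struct Signal {
  uint32_t args[2];
  uint32_t length;
};

struct Block {
  uint32_t size = 0;          // number of records to dump
  bool fixed = false;         // apply the suggested early return?
  bool pending = false;       // models "signal sent to self"
  std::vector<uint32_t> log;  // record numbers dumped, in order

  void handle(Signal& s) {
    if (s.args[0] == kDumpAllScanRec) {
      uint32_t rec = (s.length == 2) ? s.args[1] : 0;
      s.args[0] = kDumpOneScanRec;  // dump code is altered in place
      s.args[1] = rec;
      s.length = 2;
      handle(s);                    // dumps exactly one record
      if (rec + 1 < size) {
        s.args[0] = kDumpAllScanRec;  // restore code, continue next slice
        s.args[1] = rec + 1;
        pending = true;
      } else if (fixed) {
        return;  // quick fix: stop in the "we're done" path
      }
      // Without the return, args[0] is still kDumpOneScanRec after the
      // last record, so the next if() block matches as well.
    }
    if (s.args[0] == kDumpOneScanRec) {
      log.push_back(s.args[1]);
    }
  }
};

// Drives the handler until no further self-signal is pending.
std::vector<uint32_t> run(uint32_t size, bool fixed) {
  assert(size > 0);
  Block b;
  b.size = size;
  b.fixed = fixed;
  Signal s{{kDumpAllScanRec, 0}, 1};
  do {
    b.pending = false;
    b.handle(s);
  } while (b.pending);
  return b.log;
}
```

In this model, run(3, false) records the last index twice, while run(3, true) records each index once. Turning the second if() into an else if would have the same effect for this pair of codes.
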

[20 Feb 2007 14:08] Jonas Oreland
Please just add the return.

I also read your page; the reason for not using a simple loop is the following:
1) ndbd is single threaded, so a single loop blocks _all_ transactions.
   Therefore we manually timeslice, by sending a signal to self.

2) A single loop would fill the send buffer quite rapidly and
   would most likely cause node failure.

   I think the current implementation that you're looking at might have this
     behavior, i.e. that it's not carefully enough timesliced...

3) FYI, per customer request I added a set of carefully written dump commands
   that enable dumping of transactions/operations for debugging
   lock problems.
   However, this has not hit mainline...

/Jonas
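
The manual timeslicing from point 1 can be illustrated with a toy single-threaded signal queue. This is a simplified sketch under stated assumptions, not ndbd code: the Signal type, the "dump" and "txn" kinds, and run_loop() are all hypothetical names. Each "dump" signal handles one record and then re-queues itself, so signals that were already waiting get processed in between.

```cpp
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

struct Signal {
  std::string kind;  // "dump" or "txn" (hypothetical signal kinds)
  uint32_t arg;      // record number or transaction id
};

// Runs a single-threaded loop and returns the order in which work was done.
std::vector<std::string> run_loop(uint32_t records, uint32_t transactions) {
  std::deque<Signal> queue;
  std::vector<std::string> trace;

  // A dump request arrives first, followed by some transaction signals.
  queue.push_back({"dump", 0});
  for (uint32_t t = 0; t < transactions; ++t) {
    queue.push_back({"txn", t});
  }

  while (!queue.empty()) {
    Signal s = queue.front();
    queue.pop_front();
    if (s.kind == "dump") {
      // Handle one record only, then yield by queueing a signal to self.
      trace.push_back("dump " + std::to_string(s.arg));
      if (s.arg + 1 < records) {
        queue.push_back({"dump", s.arg + 1});
      }
    } else {
      trace.push_back("txn " + std::to_string(s.arg));
    }
  }
  return trace;
}
```

With run_loop(3, 2), the waiting transactions are handled right after the first record is dumped instead of after the whole dump. A plain for loop over all records inside the "dump" branch would delay them until the dump had finished.
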