MySQL Bugs: #26476: DUMP 2502 (TcDumpAllScanRec) dumps the last record twice

Bug #26476	DUMP 2502 (TcDumpAllScanRec) dumps the last record twice
Submitted:	19 Feb 2007 8:13
Reporter:	Hartmut Holzgraefe
Status:	Verified
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1	OS:	Linux (linux)
Assigned to:		CPU Architecture:	Any
Tags:	5.1bk

Description:
When doing a DUMP 2502, the last record is logged twice in the debug output (the same is true for the other TcDumpAll... codes):

  Node 2: TC: Dump all ScanRecord - size: 256
  Node 2: Dbtc::ScanRecord[1]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=2
  Node 2: Dbtc::ScanRecord[2]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=3
  [...]
  Node 2: Dbtc::ScanRecord[254]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=255
  Node 2: Dbtc::ScanRecord[255]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=-256
  Node 2: Dbtc::ScanRecord[255]: state=0nextfrag=0, nofrag=0
  Node 2:  ailen=0, para=0, receivedop=0, noOprePperFrag=0
  Node 2:  schv=0, tab=0, sproc=0
  Node 2:  apiRec=-256, next=-256

How to repeat:
Do a DUMP 2502 on a running cluster and check the output written to the management node's log.

Suggested fix:
This happens because the DUMP code in the signal is altered during processing, and the various DUMP codes are checked in a cascade of independent if() statements. Processing continues after the last record has been dumped, so the if() block for the altered "dump one record" code is hit and the last record is dumped again.

Quick fix: add a "return" in the "we're done" code path

Once and for all fix: change dump handlers to use switch/case or at least if/else if/else for both maintainability and performance reasons
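
The mechanism described above can be reproduced in a small standalone model. This is only a sketch, not the Dbtc code: the names Block, Signal, kDumpAll, kDumpOne and the returnWhenDone switch are hypothetical, the code values are illustrative, and "send signal to self" is modelled by a plain queue. It assumes the "dump all" handler rewrites the signal to "dump one", calls itself, and only restores the "dump all" code when there is a next record to schedule.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Illustrative codes only; these are not the values used by NDB.
enum DumpCode : unsigned { kDumpAll = 1, kDumpOne = 2 };

struct Signal { unsigned args[2]; };  // args[0] = dump code, args[1] = record no

struct Block {
  unsigned size;                 // number of records to dump
  bool returnWhenDone;           // true = apply the quick fix
  std::deque<Signal> queue;      // stands in for "send signal to self"
  std::vector<unsigned> dumped;  // record numbers in the order they were logged

  void handle(Signal* s) {
    if (s->args[0] == kDumpAll) {
      unsigned rec = s->args[1];
      s->args[0] = kDumpOne;     // the signal is altered in place
      handle(s);                 // dump this one record
      if (rec + 1 < size) {
        s->args[0] = kDumpAll;   // restored only when there is a next record
        s->args[1] = rec + 1;
        queue.push_back(*s);
      } else if (returnWhenDone) {
        return;                  // quick fix: stop in the "we're done" path
      }
    }
    // Independent if(), not else-if: after the last record args[0] is
    // still kDumpOne, so without the return this block runs a second time.
    if (s->args[0] == kDumpOne) {
      dumped.push_back(s->args[1]);
    }
  }

  std::vector<unsigned> run() {
    assert(size > 0);
    dumped.clear();
    queue.push_back(Signal{{kDumpAll, 0}});
    while (!queue.empty()) {
      Signal s = queue.front();
      queue.pop_front();
      handle(&s);
    }
    return dumped;
  }
};
```

With three records and returnWhenDone set to false, the model logs record 2 twice; with it set to true, each record is logged once. Turning the second if() into an else-if would have the same effect in this model.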

Please just add the return.

I also read your page. The reason for not using a simple loop is the following:
1) ndbd is single threaded, so a single loop blocks _all_ transactions;
   therefore we manually timeslice by sending a signal to self

2) a single loop would fill the send buffer quite rapidly and
   would most likely cause node failure.

   I think the current impl. that you're looking at might have this
     behavior, i.e. that it's not carefully enough timesliced...

3) fyi, at a customer's request I added a carefully written set of dump
   commands that enable dumping of transactions/operations for debugging
   lock problems;
   however, this has not hit mainline...

/Jonas
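
The manual timeslicing from point 1 can be sketched roughly as follows. This is a toy model under stated assumptions, not ndbd code: Scheduler, post, dumpSlice and demo are hypothetical names, and the single-threaded signal loop is approximated by a FIFO job queue. Each slice logs at most `batch` records and then re-posts itself, which is the stand-in for sending a signal to self.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded job loop; jobs run strictly one at a time.
struct Scheduler {
  std::deque<std::function<void()>> jobs;
  std::vector<std::string> trace;  // what ran, in order

  void post(std::function<void()> job) { jobs.push_back(std::move(job)); }

  void run() {
    while (!jobs.empty()) {
      std::function<void()> job = std::move(jobs.front());
      jobs.pop_front();
      job();
    }
  }
};

// Dump at most `batch` records, then queue a continuation instead of looping
// over everything. Work already waiting in the queue runs before the next slice.
void dumpSlice(Scheduler& s, unsigned next, unsigned total, unsigned batch) {
  unsigned end = (next + batch < total) ? next + batch : total;
  for (unsigned i = next; i < end; ++i)
    s.trace.push_back("dump " + std::to_string(i));
  if (end < total)
    s.post([&s, end, total, batch] { dumpSlice(s, end, total, batch); });
}

// Four records in slices of two, with one "transaction" job queued behind
// the first slice.
std::vector<std::string> demo() {
  Scheduler s;
  s.post([&s] { dumpSlice(s, 0, 4, 2); });
  s.post([&s] { s.trace.push_back("transaction"); });
  s.run();
  return s.trace;
}
```

In this model the transaction job runs between the two dump slices instead of waiting for the whole dump. Bounding the batch size also bounds how much output one slice produces, which is the concern behind point 2.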