Bug #73667 Recursive Dblqh::finishScanrec() triggers assert in LocalDLFifoList constructor
Submitted: 21 Aug 2014 9:20
Modified: 23 Dec 2014 10:54
Reporter: Jan Wedvik
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 7.4.0
OS: Any
Assigned to:
CPU Architecture: Any

[21 Aug 2014 9:20] Jan Wedvik
Description:
Recursion in Dblqh::finishScanrec() triggers an assert in LocalDLFifoList<ScanRecord>() because the recursive call attempts to create a second list iterator on the same head. This is a regression caused by WL#7532 (Optimise NDB scans), which introduced the recursion.
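
The failure mode described above can be sketched in a few lines. This is a simplified illustration, not the NDB source: Head, LocalList and finish_scan are hypothetical stand-ins, the in_use flag is an assumed model of what the real assert checks, and the sketch throws an exception instead of asserting so that the failure can be observed without aborting the process.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, simplified stand-in for an intrusive list head.
struct Head {
  int first = -1;
  int last = -1;
  bool in_use = false;  // assumed guard flag; not taken from the NDB source
};

// Hypothetical stand-in for a "local" list object: it takes a working copy
// of the head on construction and writes it back on destruction.
class LocalList {
public:
  explicit LocalList(Head& src) : m_src(src) {
    // The real code asserts here; this sketch throws instead.
    if (src.in_use) throw std::logic_error("list head already in use");
    src.in_use = true;
    m_copy = src;
  }
  ~LocalList() {
    m_copy.in_use = false;
    m_src = m_copy;  // write the working copy back to the shared head
  }
private:
  Head& m_src;
  Head m_copy;
};

// Hypothetical recursive routine: every level attaches a LocalList to the
// same head while the caller's LocalList is still alive.
int finish_scan(Head& queued, int depth) {
  LocalList scans(queued);
  if (depth > 0) {
    return finish_scan(queued, depth - 1);  // second LocalList, same head
  }
  return depth;
}

// Returns true if one level of recursion trips the guard.
bool recursion_trips_guard() {
  Head queued;
  try {
    finish_scan(queued, 1);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

In this sketch recursion_trips_guard() returns true: the nested call tries to attach a second LocalList while the outer one still holds the head. A non-recursive call such as finish_scan(head, 0) completes and leaves the head unguarded again.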

The trace below shows that the same list head is used in frames 13 and 41.

(gdb) frame 13
#13 0x0000000000732a21 in Dblqh::finishScanrec (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12376
(gdb) p &fragptr.p->m_queuedScans
$3 = (IntrusiveList<Dblqh::ScanRecord, ArrayPool<Dblqh::ScanRecord>, ListHead<FirstLink, LastLink, NoCount>, DefaultDoubleLinkMethods<Dblqh::ScanRecord, Dblqh::ScanRecord> >::Head *) 0x2b12fb1a53b0
(gdb) frame 41
#41 0x0000000000733b96 in Dblqh::finishScanrec (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12490
(gdb) p &fragptr.p->m_queuedScans
$4 = (IntrusiveList<Dblqh::ScanRecord, ArrayPool<Dblqh::ScanRecord>, ListHead<FirstLink, LastLink, NoCount>, DefaultDoubleLinkMethods<Dblqh::ScanRecord, Dblqh::ScanRecord> >::Head *) 0x2b12fb1a53b0
(gdb) where
#0  0x00002b12092b7425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00002b12092bab8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x0000000000435a09 in childAbort (error_code=6000, exit_code=-1, currentStartPhase=255) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/ndbd.cpp:425
#3  0x0000000000436835 in NdbShutdown (error_code=6000, type=NST_ErrorHandlerSignal, restartType=NRT_Default) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/ndbd.cpp:904
#4  0x0000000000af4da9 in ErrorReporter::handleError (messageID=6000, problemData=0x7fff572ad7a0 "Signal 6 received; Aborted", objRef=0xbdb3b0 "/export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/ndbd.cpp", nst=NST_ErrorHandlerSignal) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/error/ErrorReporter.cpp:257
#5  0x0000000000435bc0 in handler_error (signum=6) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/ndbd.cpp:473
#6  <signal handler called>
#7  0x00002b12092b7425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x00002b12092bab8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x00002b12092b00ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x00002b12092b0192 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x000000000079103f in IntrusiveList<Dblqh::ScanRecord, ArrayPool<Dblqh::ScanRecord>, ListHead<FirstLink, LastLink, NoCount>, DefaultDoubleLinkMethods<Dblqh::ScanRecord, Dblqh::ScanRecord> >::Local::Local (this=0x7fff572adea0, pool=..., head=...) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/vm/IntrusiveList.hpp:282
#12 0x000000000078f401 in LocalDLFifoList<Dblqh::ScanRecord, Dblqh::ScanRecord, DefaultDoubleLinkMethods<Dblqh::ScanRecord, Dblqh::ScanRecord> >::LocalDLFifoList (this=0x7fff572adea0, pool=..., head=...) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/vm/IntrusiveList.hpp:339
#13 0x0000000000732a21 in Dblqh::finishScanrec (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12376
#14 0x00000000007315b0 in Dblqh::tupScanCloseConfLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12093
#15 0x00000000007310c3 in Dblqh::accScanCloseConfLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12064
#16 0x00000000007253ca in Dblqh::execNEXT_SCANCONF (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:9955
#17 0x000000000048c23c in SimulatedBlock::executeFunction (this=0x14aa5a0, gsn=330, signal=0x124dc00, f=(void (SimulatedBlock::*)(SimulatedBlock * const, Signal *)) 0x724e0e <Dblqh::execNEXT_SCANCONF(Signal*)>) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1122
#18 0x000000000048c9aa in SimulatedBlock::EXECUTE_DIRECT (this=0x1488000, block=247, gsn=330, signal=0x124dc00, len=3) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1429
#19 0x00000000004d4494 in Dbacc::releaseScanLab (this=0x1488000, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dbacc/DbaccMain.cpp:6492
#20 0x00000000004d26fb in Dbacc::execNEXT_SCANREQ (this=0x1488000, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dbacc/DbaccMain.cpp:6214
#21 0x000000000078bf43 in SimulatedBlock::EXECUTE_DIRECT (this=0x1488000, f=(void (SimulatedBlock::*)(SimulatedBlock * const, Signal *)) 0x4d1de6 <Dbacc::execNEXT_SCANREQ(Signal*)>, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1317
#22 0x0000000000730db0 in Dblqh::closeScanLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:12023
#23 0x00000000007272e4 in Dblqh::scanLockReleasedLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:10209
#24 0x0000000000727bd1 in Dblqh::scanReleaseLocksLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:10340
#25 0x000000000072fba7 in Dblqh::scanTupkeyConfLab (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:11864
#26 0x00000000006f887b in Dblqh::execTUPKEYCONF (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3657
#27 0x000000000072e227 in Dblqh::next_scanconf_tupkeyreq (this=0x14aa5a0, signal=0x124dc00, scanPtr=0x171d2b0, regTcPtr=0x2b12fb40aa10, fragPtrP=0x2b12fb1a5310, disk_page=4294967040) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:11661
#28 0x000000000072d69d in Dblqh::nextScanConfScanLab (this=0x14aa5a0, signal=0x124dc00, scanPtr=0x171d2b0, regTcPtr=0x2b12fb40aa10, fragId=0, accOpPtr=36096) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:11535
#29 0x000000000072550e in Dblqh::execNEXT_SCANCONF (this=0x14aa5a0, signal=0x124dc00) at /export/home/tmp/jw159207/mysql/repo/mysql-5.6-cluster-7.4/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:9966

How to repeat:
Create the database as shown below (the full SQL script will be attached to the bug), then run bencher as follows:
bencher -d ydb -e "select count(*) from t1 join t1 as t2 on t1.d<t2.c+2 join t1 as t3 on t2.d<t3.c+2 join t1 as t4 on t3.d<t4.c+2" -l 5 -t 32

drop database if exists ydb;
create database ydb;
use ydb;
CREATE TABLE t1 (
  a int NOT NULL,
  b int NOT NULL,
  c int NOT NULL,
  d int NOT NULL,
  PRIMARY KEY (`b`,`a`)
) ENGINE=ndbcluster;
insert into t1 values (0, 1, 1, 1);
insert into t1 values (1, 1, 1, 2);
insert into t1 values (2, 1, 1, 3);
insert into t1 values (3, 1, 1, 4);
.
.
insert into t1 values (998, 1, 1, 999);
insert into t1 values (999, 1, 1, 0);
analyze table t1;

Suggested fix:
Add a release() method to LocalDLFifoList and related list types, so that a LocalDLFifoList can be disconnected from its list head without running the destructor.
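
A minimal sketch of what such a release() method might look like. All names here (Head, LocalList, set_first, finish_scan) are hypothetical and the in_use guard is an assumed model of the real assert; the actual NDB list classes may be structured differently.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an intrusive list head.
struct Head {
  int first = -1;
  int last = -1;
  bool in_use = false;  // assumed guard flag; not taken from the NDB source
};

// Hypothetical local list with an explicit release().
class LocalList {
public:
  explicit LocalList(Head& src) : m_src(&src), m_copy(src) {
    assert(!src.in_use);  // only one local list per head at a time
    src.in_use = true;
  }

  // Write the working copy back and detach from the head early.
  // Calling it again, or destroying the object afterwards, is a no-op.
  void release() {
    if (m_src == nullptr) return;
    m_copy.in_use = false;
    *m_src = m_copy;
    m_src = nullptr;
  }

  ~LocalList() { release(); }

  // Stand-in for real list manipulation; only valid while attached.
  void set_first(int i) {
    assert(m_src != nullptr);
    m_copy.first = i;
  }

private:
  Head* m_src;
  Head m_copy;
};

// Hypothetical recursive routine: the list is released before recursing,
// so the nested call can attach its own LocalList to the same head.
int finish_scan(Head& queued, int depth) {
  LocalList scans(queued);
  scans.set_first(depth);
  scans.release();  // detach before the recursive call
  if (depth > 0) {
    return 1 + finish_scan(queued, depth - 1);
  }
  return 0;
}
```

With release() called before the recursive call, finish_scan(head, 3) runs to completion without tripping the guard, and each destructor finds its list already detached.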
[23 Dec 2014 10:54] Jon Stephens
Thank you for your bug report. This issue has already been fixed in the latest released version of that product, which you can download at

  http://www.mysql.com/downloads/

Documented fix in the NDB 7.4.3 changelog as follows:

    Recursion in the internal method Dblqh::finishScanrec() led to
    an attempt to create two list iterators with the same head. This
    regression was introduced during work done to optimize scans for
    version 7.4 of the NDB storage engine.
  
Closed.