Description:
SIGFPE, floating point exception is generated during high provisioning load. Data Node crashes and, the following error is printed by the mysql:
ERROR 1297 (HY000) at line 103945: Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER
Logs of NDB data node:
2018-05-25 15:50:00 [ndbd] INFO -- Received signal 8. Running error handler.
2018-05-25 15:50:00 [ndbd] INFO -- Signal 8 received; Floating point exception
2018-05-25 15:50:00 [ndbd] INFO -- /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp
2018-05-25 15:50:00 [ndbd] INFO -- Error handler signal restarting system
2018-05-25 15:50:00 [ndbd] INFO -- Error handler shutdown completed - exiting
2018-05-25 15:50:00 [ndbd] ALERT -- Node 3: Forced node shutdown completed, restarting. Initiated by signal 8. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2018-05-25 15:50:00 [ndbd] INFO -- Ndb has terminated (pid 32121) restarting
2018-05-25 15:50:00 [ndbd] INFO -- Angel reconnected to '172.26.221.19:1186'
2018-05-25 15:50:03 [ndbd] INFO -- Angel reallocated nodeid: 3
2018-05-25 15:50:03 [ndbd] INFO -- Angel pid: 31729 started child: 959
2018-05-25 15:50:03 [ndbd] INFO -- Normal start of data node using checkpoint and log info if existing
2018-05-25 15:50:03 [ndbd] INFO -- Configuration fetched from '172.26.221.19:1186', generation: 1
2018-05-25 15:50:03 [ndbd] INFO -- Changing directory to '/var/lib/mysql/cluster_data'
NDBMT: non-mt
2018-05-25 15:50:03 [ndbd] INFO -- NDB Cluster -- DB node 3
2018-05-25 15:50:03 [ndbd] INFO -- mysql-5.6.34 ndb-7.4.13 --
2018-05-25 15:50:03 [ndbd] INFO -- WatchDog timer is set to 6000 ms
2018-05-25 15:50:03 [ndbd] INFO -- Memory Allocation for global memory pools Starting
2018-05-25 15:50:03 [ndbd] INFO -- numa_set_interleave_mask(numa_all_nodes) : OK
2018-05-25 15:50:03 [ndbd] INFO -- Ndbd_mem_manager::init(1) min: 10780Mb initial: 10908Mb
2018-05-25 15:50:03 [ndbd] INFO -- Touch Memory Starting, 8320 pages, page size = 32768
2018-05-25 15:50:03 [ndbd] INFO -- Touch Memory Completed
2018-05-25 15:50:03 [ndbd] INFO -- Lock Memory Starting, 8320 pages, page size = 32768
Where problem occurs, investigation:
./storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
Dblqh::IOTracker::tick
Traceback:
storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950
#0 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950
#1 0x000000000054140a in Dblqh::timer_handling (this=this@entry=0x104a230, signal=signal@entry=0xdf3960 <globalData+7168>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3087
#2 0x0000000000541561 in Dblqh::execTIME_SIGNAL (this=0x104a230, signal=0xdf3960 <globalData+7168>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3072
#3 0x000000000071269d in executeFunction (f=<optimized out>, signal=<optimized out>, gsn=<optimized out>, this=0x104a230)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1477
#4 executeFunction_async (signal=<optimized out>, gsn=<optimized out>, this=0x104a230)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1444
#5 FastScheduler::doJob (this=0xdef1c0 <globalScheduler>, loopStartCount=<optimized out>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/FastScheduler.cpp:154
#6 0x0000000000711555 in ThreadConfig::ipControlLoop (this=0xe5d580, pThis=0xe74c70)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/ThreadConfig.cpp:276
#7 0x0000000000412cb4 in ndbd_run (foreground=<optimized out>, report_fd=<optimized out>, connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=0x0, no_start=<optimized out>, initial=false,
initialstart=false, allocated_nodeid=3, connect_retries=12, connect_delay=5)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp:1022
#8 0x0000000000411116 in real_main (argc=0, argv=0xe557b8, argv@entry=0xe435b0)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:204
#9 0x0000000000414cfb in spawn_process (progname=<optimized out>, args=...)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:432
#10 retry_spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:451
#11 angel_run (progname=progname@entry=0x7fffffffe847 "/usr/sbin/ndbd", original_args=..., connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=<optimized out>, initial=false, no_start=false, daemon=true,
connnect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:663
#12 0x0000000000411180 in real_main (argc=0, argv=0xe1e618) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:223
#13 0x00007ffff6dfab15 in __libc_start_main () from /lib64/libc.so.6
#14 0x0000000000410a15 in _start ()
The traceback indicates maxlag is 0.
0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0)
Check in the `tick` method, places where the division by maxlag happens.
if (lag > maxlag)
{
/**
* We did have lag last second...
* increase m_lag_cnt and check if it has reached maxlag_cnt
*/
Uint32 tmp = m_lag_cnt;
m_lag_cnt += (lag / maxlag);
if (tmp < maxlag_cnt && m_lag_cnt >= maxlag_cnt)
{
retVal = -1; // start aborting transactions
}
}
How to repeat:
Script to repeat, N.B. replace the <FILL IN ~4K OF TEXT> placeholders with actual 4K of text:
---- repeat.sh ----
FILE=bug.sql
echo "creating.."
echo "DROP TABLE IF EXISTS bug_test;" > $FILE
echo "CREATE TABLE bug_test (id1 bigint(20) NOT NULL,id2 bigint(20) DEFAULT NULL,id3 bigint(20) DEFAULT NULL,payload1 varchar(4096) DEFAULT NULL, payload2 varchar(4096) DEFAULT NULL, PRIMARY KEY (id1), UNIQUE KEY bug_UNIQUE2 (id2), UNIQUE KEY bug_UNIQUE3 (id3), INDEX id1_idx (id1), INDEX id2_idx (id2) USING BTREE, INDEX id3_idx (id2) USING BTREE) ENGINE=ndbcluster DEFAULT CHARSET=latin1;" >> $FILE;
for x in {1..100000}; do
echo "insert into bug_test (id1, id2, id3, payload1, payload2) values (100000000000"$x", 100000000000"$x", 100000000000"$x", '<FILL IN ~4K OF TEXT>',
'<FILL IN ~4K OF TEXT>');" >> $FILE;
done
echo "provisioning.."
time mysql --database=db < $FILE
---- end of repeat.sh ----
Run data node in debug mode:
gdb ndbd
(gdb) set follow-fork-mode child
(gdb) handle SIGFPE stop nopass
(gdb) run
(gdb) tb
Start provisioning traffic to MySQL cluster.
>./repeat.sh
One of the data nodes will fail and get to the gdb debug console. Use `tb` to print the traceback of the exception.
Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff7fed740 (LWP 31730)]
0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950
26950 /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp: No such file or directory.
Missing separate debuginfos, use: debuginfo-install MySQL-Cluster-server-advanced-7.4.13-1.el7.x86_64
(gdb) bt
#0 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950
#1 0x000000000054140a in Dblqh::timer_handling (this=this@entry=0x104a230, signal=signal@entry=0xdf3960 <globalData+7168>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3087
#2 0x0000000000541561 in Dblqh::execTIME_SIGNAL (this=0x104a230, signal=0xdf3960 <globalData+7168>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3072
#3 0x000000000071269d in executeFunction (f=<optimized out>, signal=<optimized out>, gsn=<optimized out>, this=0x104a230)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1477
#4 executeFunction_async (signal=<optimized out>, gsn=<optimized out>, this=0x104a230)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1444
#5 FastScheduler::doJob (this=0xdef1c0 <globalScheduler>, loopStartCount=<optimized out>)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/FastScheduler.cpp:154
#6 0x0000000000711555 in ThreadConfig::ipControlLoop (this=0xe5d580, pThis=0xe74c70)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/ThreadConfig.cpp:276
#7 0x0000000000412cb4 in ndbd_run (foreground=<optimized out>, report_fd=<optimized out>, connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=0x0, no_start=<optimized out>, initial=false,
initialstart=false, allocated_nodeid=3, connect_retries=12, connect_delay=5)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp:1022
#8 0x0000000000411116 in real_main (argc=0, argv=0xe557b8, argv@entry=0xe435b0)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:204
#9 0x0000000000414cfb in spawn_process (progname=<optimized out>, args=...)
at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:432
#10 retry_spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:451
#11 angel_run (progname=progname@entry=0x7fffffffe847 "/usr/sbin/ndbd", original_args=..., connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=<optimized out>, initial=false, no_start=false, daemon=true,
connnect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:663
#12 0x0000000000411180 in real_main (argc=0, argv=0xe1e618) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:223
#13 0x00007ffff6dfab15 in __libc_start_main () from /lib64/libc.so.6
#14 0x0000000000410a15 in _start ()
Suggested fix:
The `maxlag` should be checked for 0 before division.
Description: SIGFPE, floating point exception is generated during high provisioning load. Data Node crashes and, the following error is printed by the mysql: ERROR 1297 (HY000) at line 103945: Got temporary error 4010 'Node failure caused abort of transaction' from NDBCLUSTER Logs of NDB data node: 2018-05-25 15:50:00 [ndbd] INFO -- Received signal 8. Running error handler. 2018-05-25 15:50:00 [ndbd] INFO -- Signal 8 received; Floating point exception 2018-05-25 15:50:00 [ndbd] INFO -- /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp 2018-05-25 15:50:00 [ndbd] INFO -- Error handler signal restarting system 2018-05-25 15:50:00 [ndbd] INFO -- Error handler shutdown completed - exiting 2018-05-25 15:50:00 [ndbd] ALERT -- Node 3: Forced node shutdown completed, restarting. Initiated by signal 8. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'. 2018-05-25 15:50:00 [ndbd] INFO -- Ndb has terminated (pid 32121) restarting 2018-05-25 15:50:00 [ndbd] INFO -- Angel reconnected to '172.26.221.19:1186' 2018-05-25 15:50:03 [ndbd] INFO -- Angel reallocated nodeid: 3 2018-05-25 15:50:03 [ndbd] INFO -- Angel pid: 31729 started child: 959 2018-05-25 15:50:03 [ndbd] INFO -- Normal start of data node using checkpoint and log info if existing 2018-05-25 15:50:03 [ndbd] INFO -- Configuration fetched from '172.26.221.19:1186', generation: 1 2018-05-25 15:50:03 [ndbd] INFO -- Changing directory to '/var/lib/mysql/cluster_data' NDBMT: non-mt 2018-05-25 15:50:03 [ndbd] INFO -- NDB Cluster -- DB node 3 2018-05-25 15:50:03 [ndbd] INFO -- mysql-5.6.34 ndb-7.4.13 -- 2018-05-25 15:50:03 [ndbd] INFO -- WatchDog timer is set to 6000 ms 2018-05-25 15:50:03 [ndbd] INFO -- Memory Allocation for global memory pools Starting 2018-05-25 15:50:03 [ndbd] INFO -- numa_set_interleave_mask(numa_all_nodes) : OK 2018-05-25 15:50:03 [ndbd] INFO -- Ndbd_mem_manager::init(1) min: 10780Mb initial: 10908Mb 2018-05-25 15:50:03 [ndbd] INFO -- Touch Memory Starting, 8320 pages, page size = 32768 2018-05-25 15:50:03 [ndbd] INFO -- Touch Memory Completed 2018-05-25 15:50:03 [ndbd] INFO -- Lock Memory Starting, 8320 pages, page size = 32768 Where problem occurs, investigation: ./storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp Dblqh::IOTracker::tick Traceback: storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950 #0 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950 #1 0x000000000054140a in Dblqh::timer_handling (this=this@entry=0x104a230, signal=signal@entry=0xdf3960 <globalData+7168>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3087 #2 0x0000000000541561 in Dblqh::execTIME_SIGNAL (this=0x104a230, signal=0xdf3960 <globalData+7168>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3072 #3 0x000000000071269d in executeFunction (f=<optimized out>, signal=<optimized out>, gsn=<optimized out>, this=0x104a230) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1477 #4 executeFunction_async (signal=<optimized out>, gsn=<optimized out>, this=0x104a230) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1444 #5 FastScheduler::doJob (this=0xdef1c0 <globalScheduler>, loopStartCount=<optimized out>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/FastScheduler.cpp:154 #6 0x0000000000711555 in ThreadConfig::ipControlLoop (this=0xe5d580, pThis=0xe74c70) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/ThreadConfig.cpp:276 #7 0x0000000000412cb4 in ndbd_run (foreground=<optimized out>, report_fd=<optimized out>, connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=0x0, no_start=<optimized out>, initial=false, initialstart=false, allocated_nodeid=3, connect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp:1022 #8 0x0000000000411116 in real_main (argc=0, argv=0xe557b8, argv@entry=0xe435b0) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:204 #9 0x0000000000414cfb in spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:432 #10 retry_spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:451 #11 angel_run (progname=progname@entry=0x7fffffffe847 "/usr/sbin/ndbd", original_args=..., connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=<optimized out>, initial=false, no_start=false, daemon=true, connnect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:663 #12 0x0000000000411180 in real_main (argc=0, argv=0xe1e618) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:223 #13 0x00007ffff6dfab15 in __libc_start_main () from /lib64/libc.so.6 #14 0x0000000000410a15 in _start () The traceback indicates maxlag is 0. 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0) Check in the `tick` method, places where the division by maxlag happens. if (lag > maxlag) { /** * We did have lag last second... * increase m_lag_cnt and check if it has reached maxlag_cnt */ Uint32 tmp = m_lag_cnt; m_lag_cnt += (lag / maxlag); if (tmp < maxlag_cnt && m_lag_cnt >= maxlag_cnt) { retVal = -1; // start aborting transactions } } How to repeat: Script to repeat, N.B. replace the <FILL IN ~4K OF TEXT> placeholders with actual 4K of text: ---- repeat.sh ---- FILE=bug.sql echo "creating.." echo "DROP TABLE IF EXISTS bug_test;" > $FILE echo "CREATE TABLE bug_test (id1 bigint(20) NOT NULL,id2 bigint(20) DEFAULT NULL,id3 bigint(20) DEFAULT NULL,payload1 varchar(4096) DEFAULT NULL, payload2 varchar(4096) DEFAULT NULL, PRIMARY KEY (id1), UNIQUE KEY bug_UNIQUE2 (id2), UNIQUE KEY bug_UNIQUE3 (id3), INDEX id1_idx (id1), INDEX id2_idx (id2) USING BTREE, INDEX id3_idx (id2) USING BTREE) ENGINE=ndbcluster DEFAULT CHARSET=latin1;" >> $FILE; for x in {1..100000}; do echo "insert into bug_test (id1, id2, id3, payload1, payload2) values (100000000000"$x", 100000000000"$x", 100000000000"$x", '<FILL IN ~4K OF TEXT>', '<FILL IN ~4K OF TEXT>');" >> $FILE; done echo "provisioning.." time mysql --database=db < $FILE ---- end of repeat.sh ---- Run data node in debug mode: gdb ndbd (gdb) set follow-fork-mode child (gdb) handle SIGFPE stop nopass (gdb) run (gdb) tb Start provisioning traffic to MySQL cluster. >./repeat.sh One of the data nodes will fail and get to the gdb debug console. Use `tb` to print the traceback of the exception. Program received signal SIGFPE, Arithmetic exception. [Switching to Thread 0x7ffff7fed740 (LWP 31730)] 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950 26950 /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp: No such file or directory. Missing separate debuginfos, use: debuginfo-install MySQL-Cluster-server-advanced-7.4.13-1.el7.x86_64 (gdb) bt #0 0x00000000005412ab in Dblqh::IOTracker::tick (this=0x129af90, now=<optimized out>, maxlag=0, maxlag_cnt=0) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:26950 #1 0x000000000054140a in Dblqh::timer_handling (this=this@entry=0x104a230, signal=signal@entry=0xdf3960 <globalData+7168>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3087 #2 0x0000000000541561 in Dblqh::execTIME_SIGNAL (this=0x104a230, signal=0xdf3960 <globalData+7168>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp:3072 #3 0x000000000071269d in executeFunction (f=<optimized out>, signal=<optimized out>, gsn=<optimized out>, this=0x104a230) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1477 #4 executeFunction_async (signal=<optimized out>, gsn=<optimized out>, this=0x104a230) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/SimulatedBlock.hpp:1444 #5 FastScheduler::doJob (this=0xdef1c0 <globalScheduler>, loopStartCount=<optimized out>) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/FastScheduler.cpp:154 #6 0x0000000000711555 in ThreadConfig::ipControlLoop (this=0xe5d580, pThis=0xe74c70) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/vm/ThreadConfig.cpp:276 #7 0x0000000000412cb4 in ndbd_run (foreground=<optimized out>, report_fd=<optimized out>, connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=0x0, no_start=<optimized out>, initial=false, initialstart=false, allocated_nodeid=3, connect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/ndbd.cpp:1022 #8 0x0000000000411116 in real_main (argc=0, argv=0xe557b8, argv@entry=0xe435b0) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:204 #9 0x0000000000414cfb in spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:432 #10 retry_spawn_process (progname=<optimized out>, args=...) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:451 #11 angel_run (progname=progname@entry=0x7fffffffe847 "/usr/sbin/ndbd", original_args=..., connect_str=<optimized out>, force_nodeid=<optimized out>, bind_address=<optimized out>, initial=false, no_start=false, daemon=true, connnect_retries=12, connect_delay=5) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/angel.cpp:663 #12 0x0000000000411180 in real_main (argc=0, argv=0xe1e618) at /export/home/pb2/build/sb_0-20777380-1476347309.44/rpm/BUILD/mysql-cluster-com-7.4.13/mysql-cluster-com-7.4.13/storage/ndb/src/kernel/main.cpp:223 #13 0x00007ffff6dfab15 in __libc_start_main () from /lib64/libc.so.6 #14 0x0000000000410a15 in _start () Suggested fix: The `maxlag` should be checked for 0 before division.