Bug #72929 | Ndb kernel thread is stuck in: Job Handling for about 18s caused data node crash | ||
---|---|---|---|
Submitted: | 10 Jun 2014 6:58 | Modified: | 8 Dec 2014 7:58 |
Reporter: | paul He | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | 5.6.17-ndb-7.3.5-cluster-gpl-log, 7.3.7 | OS: | Linux (Red Hat Enterprise Linux Server release 6.2 (Santiago)) |
Assigned to: | CPU Architecture: | Any |
[10 Jun 2014 6:58]
paul He
[10 Jun 2014 14:53]
MySQL Verification Team
Thank you for the report. I cannot repeat with dummy schema/data, could you please provide complete repeatable test case(schema, data, and any other missing details(other than provided in bug report) which help us to reproduce the issue)? Also, could you please attach the cluster logs? Preferably using the ndb_error_reporter utility: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-error-reporter.html Thanks, Umesh
[11 Jun 2014 13:53]
paul He
test case which trigger the bug
Attachment: test case.TXT (text/plain), 10.79 KiB.
[19 Jun 2014 8:07]
paul He
Hi,Umesh. I have not any feedback from you for about a week.Is there any information that you would like me to provide?
[19 Jun 2014 8:35]
MySQL Verification Team
Hi Paul, I tried it couple of times(with single connection or even with mysqlslap with >15 concurrency) but it is not crashing, select takes >1hrs and query is killed but not seen node crashes. Shall keep you updated on this. Thanks, Umesh
[26 Aug 2014 5:26]
paul He
I am sorry that there is some mistake in the test case.I have correct the error and I will upload the test case again (named test_case_v2.sql).The ndbd will crash in about 3 minutes after execute the test_case_v2.sql.
[26 Aug 2014 5:27]
paul He
test_case_v2.sql
Attachment: test_case_v2.sql (application/octet-stream, text), 10.76 KiB.
[26 Aug 2014 5:35]
MySQL Verification Team
Thank you for the modified test case. Shall run few tests and confirm if it crashes. Thanks, Umesh
[8 Dec 2014 7:58]
MySQL Verification Team
Thank you Paul, and my apologies for taking long time on this. With exact conf file, and updated test case I'm seeing different issue than the originally reported one which is fixed as per #18229003 and in NDB 7.1.31, 7.2.16, and 7.3.5. For change logs, please refer http://dev.mysql.com/doc/relnotes/mysql-cluster/7.3/en/mysql-cluster-news-5-6-17-ndb-7-3-5... mysql> source test_case_v2.sql; Database changed Query OK, 0 rows affected (0.14 sec) Query OK, 0 rows affected (0.04 sec) Query OK, 0 rows affected (0.14 sec) Query OK, 0 rows affected (0.06 sec) Query OK, 0 rows affected (0.12 sec) Query OK, 0 rows affected (0.03 sec) Query OK, 1 row affected (0.01 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 1 row affected (17.59 sec) Query OK, 1 row affected (0.51 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) ERROR 1296 (HY000): Got error 901 'Inconsistent ordered index. The index needs to be dropped and recreated' from NDBCLUSTER mysql> mysql> show errors; +-------+------+---------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+---------------------------------------------------------------------------------------------------------+ | Error | 1296 | Got error 901 'Inconsistent ordered index. The index needs to be dropped and recreated' from NDBCLUSTER | | Error | 1296 | Got error 122 'Unknown error code' from NDBCLUSTER | | Error | 1181 | Got error 122 during ROLLBACK | +-------+------+---------------------------------------------------------------------------------------------------------+ 3 rows in set (0.00 sec) mysql> show warnings; +---------+------+---------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +---------+------+---------------------------------------------------------------------------------------------------------+ | Warning | 1296 | Got error 901 'Inconsistent ordered index. The index needs to be dropped and recreated' from NDB | | Error | 1296 | Got error 901 'Inconsistent ordered index. The index needs to be dropped and recreated' from NDBCLUSTER | | Warning | 1296 | Got error 0 'No error' from NDB | | Error | 1296 | Got error 122 'Unknown error code' from NDBCLUSTER | | Error | 1181 | Got error 122 during ROLLBACK | +---------+------+---------------------------------------------------------------------------------------------------------+ 5 rows in set (0.00 sec) // Node error log Time: Monday 8 December 2014 - 08:28:54 Status: Temporary error, restart node Message: WatchDog terminate, internal error or massive overload on the machine running this node (Internal error, programming error or missing error message, please report a bug) Error: 6050 Error data: Job Handling Error object: /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp Program: ndbd Pid: 26811 Version: mysql-5.6.17 ndb-7.3.5 Trace: /export/umes // back trace (gdb) bt #0 FastScheduler::traceDumpGetJam (this=<optimized out>, thr_no=<optimized out>, thrdTheEmulatedJam=@0x7f6366ab7608: 0x4, thrdTheEmulatedJamIndex=@0x7f6366ab7618: 1721) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/FastScheduler.cpp:504 #1 0x00000000006c9b39 in WriteMessage (thrdMessageID=6050, thrdProblemData=0x7824a5 "Job Handling", thrdObjRef=0x782228 "/export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp", nst=@0x7f6366ab766c: NST_Watchdog) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/error/ErrorReporter.cpp:376 #2 0x00000000006ca005 in ErrorReporter::handleError (messageID=6050, problemData=0x7824a5 "Job Handling", objRef=0x782228 "/export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp", nst=NST_Watchdog) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/error/ErrorReporter.cpp:246 #3 0x00000000006ba0aa in WatchDog::shutdownSystem (this=<optimized out>, last_stuck_action=0x0) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp:399 #4 0x00000000006ba8c6 in WatchDog::run (this=0x25152a0) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp:385 #5 0x00000000006babb9 in runWatchDog (w=0x1) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/kernel/vm/WatchDog.cpp:37 #6 0x00000000006f4ae7 in ndb_thread_wrapper (_ss=0x24c64e0) at /export/home/pb2/build/sb_0-11918329-1396368454.51/mysql-cluster-com-7.3.5/storage/ndb/src/common/portlib/NdbThread.c:202 #7 0x00007f63667d6df3 in start_thread () from /lib64/libpthread.so.0 #8 0x00007f63658db47d in clone () from /lib64/libc.so.6
[8 Dec 2014 8:12]
MySQL Verification Team
// 7.3.7 also affected mysql> show variables like '%version%'; +-------------------------+------------------------------------------------------+ | Variable_name | Value | +-------------------------+------------------------------------------------------+ | innodb_version | 5.6.21 | | ndb_version | 459527 | | ndb_version_string | ndb-7.3.7 | | ndbinfo_version | 459527 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.6.21-ndb-7.3.7-cluster-commercial-advanced-log | | version_comment | MySQL Cluster Server - Advanced Edition (Commercial) | | version_compile_machine | x86_64 | | version_compile_os | linux-glibc2.5 | +-------------------------+------------------------------------------------------+ 10 rows in set (0.00 sec) mysql> source test_case_v2.sql; Database changed Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.04 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.06 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.04 sec) Query OK, 1 row affected (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 1 row affected (15.76 sec) Query OK, 1 row affected (0.44 sec) Query OK, 0 rows affected (0.00 sec) Query OK, 0 rows affected, 1 warning (0.00 sec) Query OK, 0 rows affected (0.00 sec) ERROR 1296 (HY000): Got error 901 'Inconsistent ordered index. The index needs to be dropped and recreated' from NDBCLUSTER // ndbd error log Time: Monday 8 December 2014 - 09:07:59 Status: Temporary error, restart node Message: WatchDog terminate, internal error or massive overload on the machine running this node (Internal error, programming error or missing error message, please report a bug) Error: 6050 Error data: Job Handling Error object: /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp Program: ndbd Pid: 1446 Version: mysql-5.6.21 ndb-7.3.7 Trace: /export/umesh // #0 FastScheduler::traceDumpGetJam (this=<optimized out>, thr_no=<optimized out>, thrdTheEmulatedJam=@0x7fc5b5839608: 0x4, thrdTheEmulatedJamIndex=@0x7fc5b5839618: 1604) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/FastScheduler.cpp:508 508 /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/FastScheduler.cpp: No such file or directory. Missing separate debuginfos, use: debuginfo-install glibc-2.17-55.0.4.el7_0.1.x86_64 libgcc-4.8.2-16.2.el7_0.x86_64 libstdc++-4.8.2-16.2.el7_0.x86_64 (gdb) bt #0 FastScheduler::traceDumpGetJam (this=<optimized out>, thr_no=<optimized out>, thrdTheEmulatedJam=@0x7fc5b5839608: 0x4, thrdTheEmulatedJamIndex=@0x7fc5b5839618: 1604) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/FastScheduler.cpp:508 #1 0x00000000006d08b9 in WriteMessage (thrdMessageID=6050, thrdProblemData=0x81e3d5 "Job Handling", thrdObjRef=0x81e490 "/export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp", nst=@0x7fc5b583966c: NST_Watchdog) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/error/ErrorReporter.cpp:376 #2 0x00000000006d0dc5 in ErrorReporter::handleError (messageID=6050, problemData=0x81e3d5 "Job Handling", objRef=0x81e490 "/export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp", nst=NST_Watchdog) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/error/ErrorReporter.cpp:246 #3 0x00000000006c0dea in WatchDog::shutdownSystem (this=<optimized out>, last_stuck_action=0x0) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp:408 #4 0x00000000006c14d4 in WatchDog::run (this=0x29497d0) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp:394 #5 0x00000000006c1919 in runWatchDog (w=0x1) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/kernel/vm/WatchDog.cpp:37 #6 0x00000000006fc067 in ndb_thread_wrapper (_ss=0x28fc150) at /export/home/pb2/build/sb_0-13403650-1412862503.48/mysql-cluster-com-7.3.7/storage/ndb/src/common/portlib/NdbThread.c:202 #7 0x00007fc5b5558df3 in start_thread () from /lib64/libpthread.so.0 #8 0x00007fc5b465d47d in clone () from /lib64/libc.so.6
[8 Dec 2014 8:22]
MySQL Verification Team
// Build used revision-id: mauritz.sundell@oracle.com-20141009130427-cwuq3yrz4eu63xn6 date: 2014-10-09 15:04:27 +0200 build-date: 2014-10-09 15:29:27 +0200 revno: 4444 branch-nick: mysql-cluster-7.3.7-release MySQL source 5.6.21