Bug #62800 trying to load myisam table export in ndbcluster crashes all ndbd nodes
Submitted: 18 Oct 2011 20:17
Modified: 21 Oct 2011 15:46
Reporter: Simon Quigley
Status: Closed
Category: MySQL Cluster: NDB API
Severity: S1 (Critical)
Version: mysql-5.1.56 ndb-7.1.15
OS: Linux (centos 5.6)
Assigned to:
CPU Architecture: Any

[18 Oct 2011 20:17] Simon Quigley
Description:
Exporting a table with MySQL Administrator, or with mysqldump using any combination of options, produces an SQL script that crashes both/all ndbd nodes when the engine type is changed from MyISAM to ndbcluster and the script is loaded into a cluster.
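
For readers reproducing this, the engine change described above can be sketched as a simple text substitution over the dump. This is only an illustration: the function name `switch_engine` is hypothetical, and a plain regex like this would also rewrite the string `ENGINE=MyISAM` if it happened to appear inside row data, so it should be checked against the actual dump.

```python
import re

# Hypothetical helper: rewrite the table option ENGINE=MyISAM in a dump.
ENGINE_RE = re.compile(r"\bENGINE\s*=\s*MyISAM\b", re.IGNORECASE)


def switch_engine(dump_sql: str, target: str = "ndbcluster") -> str:
    """Return dump_sql with every ENGINE=MyISAM option set to target."""
    return ENGINE_RE.sub(f"ENGINE={target}", dump_sql)


# Table options as mysqldump would write them for the MyISAM source table.
sample = ") ENGINE=MyISAM AUTO_INCREMENT=46 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;"
converted = switch_engine(sample)
print(converted)
```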

How to repeat:
CREATE DATABASE IF NOT EXISTS mya2billing;
USE mya2billing;

DROP TABLE IF EXISTS `mya2billing`.`cc_invoice`;
CREATE TABLE  `mya2billing`.`cc_invoice` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `reference` varchar(30) COLLATE utf8_bin DEFAULT NULL,
  `id_card` bigint(20) NOT NULL,
  `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `paid_status` tinyint(4) NOT NULL DEFAULT '0',
  `status` tinyint(4) NOT NULL DEFAULT '0',
  `title` varchar(50) COLLATE utf8_bin NOT NULL,
  `description` mediumtext COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `reference` (`reference`)
) ENGINE=ndbcluster AUTO_INCREMENT=46 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO `mya2billing`.`cc_invoice` (`id`,`reference`,`id_card`,`date`,`paid_status`,`status`,`title`,`description`) VALUES 
  (1,0x323031303030303030303031,1,'2010-07-20 05:01:06',1,1,0x524546494C4C,0x496E766F69636520666F7220726566696C6C);

Suggested fix:
unknown
[18 Oct 2011 20:28] Simon Quigley
I tried to upload the ndb error log after the crash, but ftp.mysql.com doesn't resolve. Let me know if there's somewhere else I can upload it.
[19 Oct 2011 21:54] Simon Quigley
The source database version is MySQL 5.1.49-log, and the data is being extracted with MySQL Administrator 5.1.54.
[20 Oct 2011 17:57] Simon Quigley
I was told the problem could be the hex encoded insert statement:

INSERT INTO `mya2billing`.`cc_invoice`
(`id`,`reference`,`id_card`,`date`,`paid_status`,`status`,`title`,`description`) VALUES
(1,0x323031303030303030303031,1,'2010-07-20 05:01:06',1,1,0x524546494C4C,0x496E766F69636520666F7220726566696C6C);

So I tried a non hex encoded statement:

INSERT INTO `cc_invoice` (`id`, `reference`, `id_card`, `date`,
`paid_status`, `status`, `title`, `description`) VALUES
(1,'201000000001',1,'2010-07-20 09:01:06',1,1,'REFILL','Invoice for refill');

But this caused the same issue.
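
As a quick check that the two statements carry the same string data, the hex literals from the first INSERT can be decoded outside MySQL. This is a minimal sketch using only the values quoted above; the column-to-literal mapping follows the column order in the statement.

```python
# Hex literals copied from the hex-encoded INSERT, keyed by column name.
hex_literals = {
    "reference": "323031303030303030303031",
    "title": "524546494C4C",
    "description": "496E766F69636520666F7220726566696C6C",
}

# Decode each literal as UTF-8 text (the table uses CHARSET=utf8).
decoded = {col: bytes.fromhex(h).decode("utf-8") for col, h in hex_literals.items()}

for col, text in decoded.items():
    print(f"{col}: {text!r}")
```

The decoded strings match the quoted values in the plain-text INSERT, so apart from the timestamp (05:01:06 versus 09:01:06) both statements insert the same row.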

I was also told to try loading without backticks around the fields, so I tried exporting it again using ANSI quotes and loading it:

INSERT INTO
"mya2billing"."cc_invoice" ("id","reference","id_card","date","paid_status","status","title","description") VALUES   (1,0x323031303030303030303031,1,'2010-07-20 05:01:06',1,1,0x524546494C4C,0x496E766F69636520666F7220726566696C6C);

But that had the same result: both nodes crashed.

Output from ndb node 1:

start_resend(1, empty bucket (743/5 743/4) -> active
execGCP_NOMORETRANS(743/5) c_ongoing_take_over_cnt -> seize
Finished with handling node-failure
Detect out-of-order commit(1) -> 2
2011-10-19 20:07:28 [ndbd] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 0. Initiated by signal 11.

Output from node 2:

2011-10-19 20:07:49 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=100
2011-10-19 20:07:49 [ndbd] INFO     -- Watchdog: User time: 167  System time: 882
2011-10-19 20:07:49 [ndbd] INFO     -- dbtup/DbtupTrigger.cpp
2011-10-19 20:07:49 [ndbd] INFO     -- DBTUP (Line: 2035) 0x00000002
2011-10-19 20:07:49 [ndbd] INFO     -- Error handler shutting down system
2011-10-19 20:07:49 [ndbd] INFO     -- Error handler shutdown completed - exiting
2011-10-19 20:07:50 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
[20 Oct 2011 17:58] Simon Quigley
I tried making the load even simpler, and it still doesn't work:

CREATE DATABASE IF NOT EXISTS mya2billing;
USE mya2billing;

DROP TABLE IF EXISTS "mya2billing"."cc_invoice";
CREATE TABLE  "mya2billing"."cc_invoice" (
  "id" bigint(20) NOT NULL AUTO_INCREMENT,
  "reference" varchar(30) COLLATE utf8_bin DEFAULT NULL,
  "id_card" bigint(20) NOT NULL,
  "date" timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  "paid_status" tinyint(4) NOT NULL DEFAULT '0',
  "status" tinyint(4) NOT NULL DEFAULT '0',
  "title" varchar(50) COLLATE utf8_bin NOT NULL,
  "description" mediumtext COLLATE utf8_bin NOT NULL,
  PRIMARY KEY ("id"),
  UNIQUE KEY "reference" ("reference")
) ENGINE=ndbcluster AUTO_INCREMENT=46 DEFAULT CHARSET=utf8
COLLATE=utf8_bin;

INSERT INTO cc_invoice VALUES
(1,'201000000001',1,'2010-07-20 09:01:06',1,1,'REFILL','Invoice for refill');

node 1 outputs:

execGCP_NOMORETRANS(439/8) c_ongoing_take_over_cnt -> seize
Finished with handling node-failure
start_resend(1, empty bucket (439/8 439/7) -> active
Detect out-of-order commit(0) -> 2
2011-10-19 20:48:07 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=100
2011-10-19 20:48:07 [ndbd] INFO     -- Watchdog: User time: 180  System time: 888
2011-10-19 20:48:07 [ndbd] INFO     -- Watchdog: User time: 183  System time: 889
2011-10-19 20:48:07 [ndbd] WARNING  -- Watchdog: Warning overslept 219 ms, expected 100 ms.
2011-10-19 20:48:07 [ndbd] INFO     -- dbtup/DbtupTrigger.cpp
2011-10-19 20:48:07 [ndbd] INFO     -- DBTUP (Line: 2035) 0x00000002
2011-10-19 20:48:07 [ndbd] INFO     -- Error handler shutting down system
2011-10-19 20:48:07 [ndbd] INFO     -- Error handler shutdown completed - exiting
2011-10-19 20:48:08 [ndbd] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

node 2 outputs:

2011-10-19 20:48:28 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=100
2011-10-19 20:48:28 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:28 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=200
2011-10-19 20:48:28 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:28 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=300
2011-10-19 20:48:28 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:28 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=400
2011-10-19 20:48:28 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:28 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=500
2011-10-19 20:48:28 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:29 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=600
2011-10-19 20:48:29 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:29 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=700
2011-10-19 20:48:29 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:29 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=800
2011-10-19 20:48:29 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:29 [ndbd] WARNING  -- Ndb kernel thread 3 is stuck in: Job Handling elapsed=900
2011-10-19 20:48:29 [ndbd] INFO     -- Watchdog: User time: 187  System time: 995
2011-10-19 20:48:29 [ndbd] INFO     -- dbtup/DbtupTrigger.cpp
2011-10-19 20:48:29 [ndbd] INFO     -- DBTUP (Line: 2035) 0x00000002
2011-10-19 20:48:29 [ndbd] INFO     -- Error handler shutting down system
2011-10-19 20:48:29 [ndbd] INFO     -- Error handler shutdown completed - exiting
2011-10-19 20:48:30 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
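
When comparing crashes across nodes, it can help to pull just the forced-shutdown lines out of the ndbd output. The following is a rough sketch, not an official tool: the regular expression is modelled only on the two ALERT message shapes quoted in this report (including the log's own spelling "Occured"), and the names `SHUTDOWN_RE` and `shutdown_events` are made up for illustration.

```python
import re

# Pattern based solely on the ALERT lines quoted in this report.
SHUTDOWN_RE = re.compile(
    r"Node (?P<node>\d+): Forced node shutdown completed\."
    r"(?: Occured during startphase (?P<phase>\d+)\.)?"
    r" (?:Caused by error (?P<error>\d+)|Initiated by signal (?P<signal>\d+))"
)


def shutdown_events(log_text: str) -> list:
    """Return one dict per forced shutdown found in log_text."""
    # Collapse line wraps and repeated spaces so wrapped messages still match.
    flat = " ".join(log_text.split())
    return [
        {key: int(val) for key, val in m.groupdict().items() if val is not None}
        for m in SHUTDOWN_RE.finditer(flat)
    ]


sample = """\
2011-10-19 20:07:28 [ndbd] ALERT    -- Node 3: Forced node shutdown
completed. Occured during startphase 0. Initiated by signal 11.
2011-10-19 20:07:50 [ndbd] ALERT    -- Node 4: Forced node shutdown
completed. Caused by error 2341: 'Internal program error (failed
ndbrequire)(Internal error, programming error or missing error message,
please report a bug). Temporary error, restart node'.
"""
events = shutdown_events(sample)
print(events)
```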
[20 Oct 2011 18:36] Simon Quigley
The problem is caused by the UNIQUE KEY field. If I remove UNIQUE KEY `reference` (`reference`) from the SQL in the CREATE TABLE, then the data can be loaded without issue.
[20 Oct 2011 19:51] Jonas Oreland
Hi,

Is this 7.1.15 or 7.1.15a?
(cause 7.1.15 contained a serious bug... and was replaced very shortly after
 by 7.1.15a)

Just checking

/Jonas
[21 Oct 2011 15:46] Simon Quigley
Thanks Jonas, I was using 7.1.15, not 7.1.15a. I reinstalled the whole cluster with 7.1.15a and was able to load the table without issue.

I see that the bug was related to "Setting IndexMemory or sometimes DataMemory to 2 GB or higher could lead to data node failures under some conditions. (Bug #12873640)", and I had my nodes configured to use the 24GB of RAM they have.

Thanks again.
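
For anyone checking whether a cluster falls under the condition quoted from the 7.1.15 notes (IndexMemory or DataMemory at 2 GB or higher), a small sketch like the one below can flag the affected parameters. The settings shown are hypothetical, since the actual config.ini values were not posted in this report; the K/M/G suffixes are interpreted as multiples of 1024, as in NDB configuration files.

```python
# Size suffixes as used in NDB configuration values (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}
THRESHOLD = 2 * 1024**3  # 2 GB, the limit named in the quoted release note


def parse_size(value: str) -> int:
    """Convert a value such as '512M' or '2G' to a number of bytes."""
    value = value.strip()
    if not value:
        raise ValueError("empty size value")
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


def at_or_above_2gb(settings: dict) -> list:
    """List the parameter names whose value is 2 GB or higher."""
    return [name for name, raw in settings.items() if parse_size(raw) >= THRESHOLD]


# Hypothetical values for illustration only; not taken from the report.
hypothetical = {"DataMemory": "20G", "IndexMemory": "1024M"}
flagged = at_or_above_2gb(hypothetical)
print(flagged)
```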