MySQL Bugs: #69619: ERROR 1297 (HY000) and error 2341: 'Internal program error (failed ndbrequire)

Bug #69619	ERROR 1297 (HY000) and error 2341: 'Internal program error (failed ndbrequire)
Submitted:	29 Jun 2013 11:19	Modified:	10 Oct 2013 16:35
Reporter:	Toki Winter
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	7.3.2-1.el6.x86_64	OS:	Linux (CentOS 6.4)
Assigned to:		CPU Architecture:	Any

Description:
The cluster was built with 2 ndb_mgmd nodes, 2 API nodes and 2 data nodes (the ndb_mgmd and mysqld nodes share hosts).

Loaded the world database (replacing InnoDB with NDBCLUSTER as the storage engine, of course).

Ran through procedure to add two more data nodes online (http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-online-add-node-basics.html).

Issuing ALTER ONLINE TABLE <table> REORGANIZE PARTITION; results in:

mysql> alter online table Country reorganize partition;
ERROR 1297 (HY000): Got temporary error 701 'System busy with other schema operation' from NDBCLUSTER

and...

ndb_mgm> Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

My "end" configuration is:

ndb_mgm> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)]	4 node(s)
id=3	@192.168.122.42  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master)
id=4	@192.168.122.43  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0)
id=5	@192.168.122.44  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 1)
id=6	@192.168.122.45  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 1)

[ndb_mgmd(MGM)]	2 node(s)
id=1 (not connected, accepting connect from 192.168.122.40)
id=2	@192.168.122.41  (mysql-5.6.11 ndb-7.3.2)

[mysqld(API)]	6 node(s)
id=7	@192.168.122.40  (mysql-5.6.11 ndb-7.3.2)
id=8	@192.168.122.41  (mysql-5.6.11 ndb-7.3.2)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
id=11 (not connected, accepting connect from any host)
id=12 (not connected, accepting connect from any host)

All other files attached to bug report.

How to repeat:
Add two new nodes online to existing cluster, create a nodegroup, attempt to reorganize partition on table from "world" database.

Suggested fix:
Workaround: perform a full cluster SHUTDOWN, then start all nodes back up.

Here's my "before" config:

ndb_mgm> SHOW
Cluster Configuration
---------------------
[ndbd(NDB)]	2 node(s)
id=3	@192.168.122.42  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master)
id=4	@192.168.122.43  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0)

[ndb_mgmd(MGM)]	2 node(s)
id=1	@192.168.122.40  (mysql-5.6.11 ndb-7.3.2)
id=2	@192.168.122.41  (mysql-5.6.11 ndb-7.3.2)

[mysqld(API)]	6 node(s)
id=7 (not connected, accepting connect from 192.168.122.40)
id=8 (not connected, accepting connect from 192.168.122.41)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
id=11 (not connected, accepting connect from any host)
id=12 (not connected, accepting connect from any host)

Hello Toki,

Thank you for the bug report. 
Verified as described.

Thanks,
Umesh

I verified the bug for tables where foreign keys are in use.

So another workaround would be to first remove foreign keys,
then reorganize partitions, and then add the foreign keys back.

In this case with world database:

ALTER ONLINE TABLE City DROP FOREIGN KEY `city_ibfk_1`;
ALTER ONLINE TABLE CountryLanguage DROP FOREIGN KEY `countryLanguage_ibfk_1`;

ALTER ONLINE TABLE City REORGANIZE PARTITION;
ALTER ONLINE TABLE Country REORGANIZE PARTITION;
ALTER ONLINE TABLE CountryLanguage REORGANIZE PARTITION;

ALTER ONLINE TABLE City ADD CONSTRAINT `city_ibfk_1` FOREIGN KEY(`CountryCode`) REFERENCES `Country` (`Code`) ON DELETE NO ACTION ON UPDATE NO ACTION;
ALTER ONLINE TABLE CountryLanguage ADD CONSTRAINT `countryLanguage_ibfk_1` FOREIGN KEY(`CountryCode`) REFERENCES `Country` (`Code`) ON DELETE NO ACTION ON UPDATE NO ACTION;
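If the same workaround has to be applied to other schemas, the statement order above (drop foreign keys, reorganize, re-add foreign keys) can be generated rather than typed by hand. The following is only a minimal sketch in Python: the `ForeignKey` class and `workaround_statements` function are hypothetical helpers, not part of any MySQL tool, and the sketch assumes every constraint uses ON DELETE NO ACTION ON UPDATE NO ACTION as in the world database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    """Hypothetical description of one foreign key constraint."""
    table: str       # child table that owns the constraint
    name: str        # constraint name
    column: str      # referencing column in the child table
    ref_table: str   # parent table
    ref_column: str  # referenced column in the parent table


def workaround_statements(tables, foreign_keys):
    """Return SQL in workaround order: drop FKs, reorganize, re-add FKs."""
    stmts = []
    # 1. Drop every foreign key first.
    for fk in foreign_keys:
        stmts.append(
            f"ALTER ONLINE TABLE {fk.table} DROP FOREIGN KEY `{fk.name}`;"
        )
    # 2. Reorganize each table while no foreign keys are present.
    for table in tables:
        stmts.append(f"ALTER ONLINE TABLE {table} REORGANIZE PARTITION;")
    # 3. Re-create the constraints (assumes NO ACTION rules, see above).
    for fk in foreign_keys:
        stmts.append(
            f"ALTER ONLINE TABLE {fk.table} ADD CONSTRAINT `{fk.name}` "
            f"FOREIGN KEY(`{fk.column}`) "
            f"REFERENCES `{fk.ref_table}` (`{fk.ref_column}`) "
            "ON DELETE NO ACTION ON UPDATE NO ACTION;"
        )
    return stmts


# The world database case from this report.
WORLD_TABLES = ["City", "Country", "CountryLanguage"]
WORLD_FKS = [
    ForeignKey("City", "city_ibfk_1", "CountryCode", "Country", "Code"),
    ForeignKey("CountryLanguage", "countryLanguage_ibfk_1",
               "CountryCode", "Country", "Code"),
]

if __name__ == "__main__":
    for stmt in workaround_statements(WORLD_TABLES, WORLD_FKS):
        print(stmt)
```

The script only prints the statements; review them before running them against a cluster, in particular if any constraint uses rules other than NO ACTION.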

Posted by developer:
 
Fixed in 7.3.3.

Documented fix in the NDB 7.3.3 changelog as follows:

      ALTER ONLINE TABLE ... REORGANIZE PARTITION failed when run 
      against a table having or using a foreign key.

Closed.

I got this error on 7.3.5 on Debian.

Setup: 2 API nodes, 2 management nodes and 4 data nodes (replication factor 2), giving 2 node groups. I added 2 data nodes to the cluster because data memory usage in the first node group was at 94%.

I ran ALTER ONLINE TABLE oms_ndb.orders_01 REORGANIZE PARTITION;

Before the ALTER, memory usage was as follows:

ndb_mgm> all report memory;
Node 3: Data usage is 94%(30889 32K pages of total 32768)
Node 3: Index usage is 9%(6154 8K pages of total 64032)
Node 4: Data usage is 94%(30889 32K pages of total 32768)
Node 4: Index usage is 9%(6154 8K pages of total 64032)
Node 9: Data usage is 4%(1354 32K pages of total 32768)
Node 9: Index usage is 0%(158 8K pages of total 64032)
Node 10: Data usage is 4%(1352 32K pages of total 32768)
Node 10: Index usage is 0%(158 8K pages of total 64032)

I was trying to rebalance data onto nodes 9 and 10, but I got this error.
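
To put numbers on the imbalance in the memory report above, the output can be parsed into used and total page counts per node. This is a rough sketch in Python, assuming the report lines keep exactly the format shown in this comment; `parse_memory_report` and `free_data_pages` are hypothetical helper names, and the sketch makes no claim about what caused the failure.

```python
import re

# Report text copied from the comment above.
REPORT = """\
Node 3: Data usage is 94%(30889 32K pages of total 32768)
Node 3: Index usage is 9%(6154 8K pages of total 64032)
Node 4: Data usage is 94%(30889 32K pages of total 32768)
Node 4: Index usage is 9%(6154 8K pages of total 64032)
Node 9: Data usage is 4%(1354 32K pages of total 32768)
Node 9: Index usage is 0%(158 8K pages of total 64032)
Node 10: Data usage is 4%(1352 32K pages of total 32768)
Node 10: Index usage is 0%(158 8K pages of total 64032)
"""

# Captures: node id, "Data" or "Index", used pages, total pages.
LINE_RE = re.compile(
    r"Node (\d+): (Data|Index) usage is \d+%\((\d+) \d+K pages of total (\d+)\)"
)


def parse_memory_report(text):
    """Map (node_id, kind) -> (used_pages, total_pages)."""
    usage = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            node, kind, used, total = match.groups()
            usage[(int(node), kind)] = (int(used), int(total))
    return usage


def free_data_pages(usage, node):
    """Number of unused data pages on the given node."""
    used, total = usage[(node, "Data")]
    return total - used


if __name__ == "__main__":
    usage = parse_memory_report(REPORT)
    for node in (3, 4, 9, 10):
        print(f"Node {node}: {free_data_pages(usage, node)} free data pages")
```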