Bug #54290 Inconsistent "table is full" errors causing ndbd errors
Submitted: 7 Jun 2010 11:47 Modified: 7 Jun 2010 13:55
Reporter: Anders Karlsson
Status: Duplicate
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S3 (Non-critical)
Version: 7.1.3 OS: Linux
Assigned to: CPU Architecture: Any

[7 Jun 2010 11:47] Anders Karlsson
Description:
When a table is "full" in MySQL Cluster, you may still be able to insert rows into it. Sometimes it fails, sometimes not. In this case the table is simple:
CREATE TABLE t1(c1 int) ENGINE=NDB;
The result looks like this:
MySQL [test]> insert into t1 values(1);
ERROR 1114 (HY000): The table 't1' is full
MySQL [test]> insert into t1 values(1);
ERROR 1114 (HY000): The table 't1' is full
MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
ERROR 1114 (HY000): The table 't1' is full
MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

MySQL [test]> insert into t1 values(1);
ERROR 1114 (HY000): The table 't1' is full
MySQL [test]> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

Doing this for a while seems to corrupt the server: after a restart, the data nodes will not start. This has been observed on Windows and Linux. Also see bug #54268.

How to repeat:
Start MySQL Cluster. Create a table in the test database:
CREATE TABLE t1(c1 int) ENGINE=NDB;

Insert rows until you get a "table is full" error:
MySQL [test]> insert into t1 values(1);
ERROR 1114 (HY000): The table 't1' is full

Continue trying to insert for a while, ignoring the occasional "table is full" error. Then shut down and restart the Cluster. The data nodes will fail with errors:
2010-06-07 13:18:17 [MgmtSrvr] ALERT    -- Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2352: 'Invalid LCP(Ndbd file system inconsistency error, please report a bug). Ndbd file system error, restart node initial'.
2010-06-07 13:18:17 [MgmtSrvr] ALERT    -- Node 1: Node 2 Disconnected
2010-06-07 13:18:17 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Occured during startphase 4. Caused by error 2352: 'Invalid LCP(Ndbd file system inconsistency error, please report a bug). Ndbd file system error, restart node initial'.
2010-06-07 13:18:17 [MgmtSrvr] ALERT    -- Node 1: Node 3 Disconnected
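
The "keep inserting and ignore table is full errors" step above could be driven by a small script rather than by hand. The following is only a sketch: `insert_row` is a hypothetical callable standing in for whatever client code runs `insert into t1 values(1)`, and `InsertError` is a hypothetical exception type carrying the MySQL error number. Only error 1114 is taken from the report.

```python
TABLE_FULL = 1114  # error code shown in the report: "The table 't1' is full"


class InsertError(Exception):
    """Hypothetical exception carrying a MySQL error number."""

    def __init__(self, errno):
        super().__init__("insert failed with error %d" % errno)
        self.errno = errno


def probe_full_table(insert_row, attempts):
    """Call insert_row() `attempts` times and tally the outcomes.

    "Table is full" errors are counted and ignored, as in the repro steps;
    any other error is re-raised. Returns (succeeded, table_full).
    """
    succeeded = 0
    table_full = 0
    for _ in range(attempts):
        try:
            insert_row()
            succeeded += 1
        except InsertError as exc:
            if exc.errno != TABLE_FULL:
                raise
            table_full += 1
    return succeeded, table_full


def make_fake_insert(pattern):
    """Build a stand-in for a real insert: True in `pattern` means success."""
    outcomes = iter(pattern)

    def insert_row():
        if not next(outcomes):
            raise InsertError(TABLE_FULL)

    return insert_row


if __name__ == "__main__":
    # First four outcomes from the session in the description: fail, fail, ok, ok.
    fake = make_fake_insert([False, False, True, True])
    print(probe_full_table(fake, 4))
```

With a real connection, `insert_row` would wrap the client library's execute call and translate its error into `InsertError`. A mixed tally of successes and failures is the inconsistent behaviour this report describes.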
[7 Jun 2010 13:55] Jørgen Austvik
Thanks for all the good reports!

No primary key = autogenerated primary key.
If one node is out of resources, the autogenerated PK of the next insert could hash to another node, so inserts could keep working for some more retries until a PK hashes to the full node again.
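
To illustrate the hashing explanation above, here is a toy model, not NDB's actual code. It assumes each insert attempt gets a fresh hidden key (a plain counter here), that the key is hashed to pick a node (MD5 is used only as a stand-in hash), and that exactly one node is out of resources. The function names and parameters are made up for this sketch.

```python
import hashlib


def node_for_key(key, num_nodes):
    """Map a hidden key to a node index using a stand-in hash (assumption)."""
    digest = hashlib.md5(key.to_bytes(8, "little")).digest()
    return int.from_bytes(digest[:4], "little") % num_nodes


def simulate_inserts(attempts, num_nodes=2, full_nodes=(0,)):
    """Return "ok" or "full" for each insert attempt.

    Every attempt consumes a new hidden key (assumption). The insert fails
    only when that key hashes to a node listed in full_nodes.
    """
    outcomes = []
    for key in range(1, attempts + 1):
        node = node_for_key(key, num_nodes)
        outcomes.append("full" if node in full_nodes else "ok")
    return outcomes


if __name__ == "__main__":
    for attempt, outcome in enumerate(simulate_inserts(12), start=1):
        print(attempt, outcome)
```

Under these assumptions the output is an irregular mix of "ok" and "full", similar in shape to the session in the description: whether a given insert fails depends only on which node its hidden key lands on.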

The restart part seems to be bug #54268?