Description:
Seems to be an issue with SharedGlobalMemory (my incorrect setting of it, but how the results of that mistake manifest is interesting).
While setting up a test host to repeat a GCP STOP, I started getting "ERROR 1528 (HY000): Failed to create LOGFILE GROUP" immediately when issuing the CREATE LOGFILE GROUP statement.
According to the SQL error, I needed to increase MaxNoOfTables:
mysql> show errors;
+-------+------+----------------------------------------------------------------------------------+
| Level | Code | Message                                                                          |
+-------+------+----------------------------------------------------------------------------------+
| Error | 1296 | Got error 707 'No more table metadata records (increase MaxNoOfTables)' from NDB |
| Error | 1528 | Failed to create LOGFILE GROUP                                                   |
+-------+------+----------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
So I increased MaxNoOfTables:
MaxNoOfTables=4000
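For reference, the setting went into the cluster configuration file under the data node defaults (a minimal fragment; the section placement is the standard one, other settings are omitted):

```ini
# config.ini fragment (illustrative; all other parameters omitted)
[ndbd default]
MaxNoOfTables=4000
```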
I then retried and got the same error.
Looking at the DUMP 8004 output, it shows I have plenty of free table metadata records:
2009-01-29 23:35:01 [MgmSrvr] INFO -- Node 2: Node 6: API mysql-5.1.31 ndb-6.4.3
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: DICT: c_counterMgr size: 25 free: 25
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: TRIX: c_theSubscriptionRecPool size: 100 free: 100
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_subscriberPool size: 8004 free: 8001
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_tablePool size: 4002 free: 3999
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_subscriptionPool size: 4002 free: 3999
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_syncPool size: 2 free: 2
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_dataBufferPool size: 2057 free: 2057
2009-01-29 23:35:29 [MgmSrvr] INFO -- Node 3: Suma: c_subOpPool size: 256 free: 256
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: DICT: c_counterMgr size: 25 free: 25
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_subscriberPool size: 8004 free: 8001
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_tablePool size: 4002 free: 3999
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_subscriptionPool size: 4002 free: 3999
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_syncPool size: 2 free: 2
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_dataBufferPool size: 2057 free: 2057
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: Suma: c_subOpPool size: 256 free: 256
2009-01-29 23:35:31 [MgmSrvr] INFO -- Node 2: TRIX: c_theSubscriptionRecPool size: 100 free: 100
So, by process of elimination, I started commenting out ndbd options and retrying.
When I commented out "SharedGlobalMemory" (i.e. allowed it to fall back to its default), the logfile group was created without issue.
Taking a closer look at the configuration line, I noticed that the "M" suffix was missing from the number, i.e.
SharedGlobalMemory=384
should have been
SharedGlobalMemory=384M
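Without a suffix, the value is interpreted as bytes, so the typo configured 384 bytes of SharedGlobalMemory rather than 384 MB. A minimal sketch of the usual size-suffix convention (a hypothetical helper for illustration, not the actual NDB config parser):

```python
def parse_size(value: str) -> int:
    """Interpret a config size string: a bare number is bytes;
    K/M/G suffixes multiply by powers of 1024.
    (Illustrative sketch, not the real NDB parser.)"""
    multipliers = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    value = value.strip()
    if value and value[-1].upper() in multipliers:
        return int(value[:-1]) * multipliers[value[-1].upper()]
    return int(value)

# The typo in this report: 384 bytes instead of 384 MB.
print(parse_size("384"))    # 384 (bytes -- far too small)
print(parse_size("384M"))   # 402653184 (384 MB)
```

This is why the symptom surfaced as an unrelated-looking "No more table metadata records" error: the shared memory pool was effectively empty.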
Yet none of the errors or problems reported back by the cluster informed me, or led me to discover, that I was out of (or low on) SharedGlobalMemory.
How to repeat:
See above
Suggested fix:
This seems like incorrect error handling. We should also print some type of INFO message in the cluster log or ndbd logs when resources backed by SharedGlobalMemory run low.
In addition, a higher minimum value for SharedGlobalMemory may be warranted.
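The suggested fix could be sketched as a config-load check that flags suspiciously small values; the minimum below is a hypothetical floor chosen for illustration, not an actual NDB constant:

```python
# Sketch of the suggested fix: validate SharedGlobalMemory at config load
# and warn when it is implausibly small (e.g. a forgotten K/M/G suffix).
MIN_SHARED_GLOBAL_MEMORY = 20 * 1024 ** 2  # 20 MB floor, illustrative only


def check_shared_global_memory(bytes_configured: int) -> list:
    """Return warning messages destined for the cluster log, if any."""
    warnings = []
    if bytes_configured < MIN_SHARED_GLOBAL_MEMORY:
        warnings.append(
            "WARNING: SharedGlobalMemory=%d bytes is below the %d-byte "
            "minimum; did you forget a K/M/G suffix?"
            % (bytes_configured, MIN_SHARED_GLOBAL_MEMORY)
        )
    return warnings


print(check_shared_global_memory(384))              # flags the bare-number typo
print(check_shared_global_memory(384 * 1024 ** 2))  # -> [] (384M is fine)
```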