Bug #42480 Low setting for SharedGlobalMemory gives erroneous error 707 msg
Submitted: 30 Jan 2009 14:13 Modified: 30 Jan 2009 14:14
Reporter: Jonathan Miller
Status: Verified
Category:MySQL Cluster: Disk Data Severity:S3 (Non-critical)
Version:mysql-5.1-telco-6.3 OS:Linux
Assigned to: CPU Architecture:Any
Tags: 7.0, mysql-5.1-telco-6.3->6.X
Triage: Triaged: D4 (Minor) / R2 (Low) / E2 (Low)

[30 Jan 2009 14:13] Jonathan Miller
Description:
This appears to be an issue with SharedGlobalMemory. The setting was my own mistake, but the way that mistake manifests is interesting.

While setting up a test host to reproduce a GCP STOP, I started getting "ERROR 1528 (HY000): Failed to create LOGFILE GROUP" immediately after issuing the CREATE LOGFILE GROUP command.

According to the SQL error, I needed to increase MaxNoOfTables:

mysql> show errors;
+-------+------+----------------------------------------------------------------------------------+
| Level | Code | Message                                                                          |
+-------+------+----------------------------------------------------------------------------------+
| Error | 1296 | Got error 707 'No more table metadata records (increase MaxNoOfTables)' from NDB |
| Error | 1528 | Failed to create LOGFILE GROUP                                                   |
+-------+------+----------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

So I increased MaxNoOfTables:

MaxNoOfTables=4000

I then retried and got the same error.

The output of dump 8004 shows that I have plenty of free records in the pools:

2009-01-29 23:35:01 [MgmSrvr] INFO     -- Node 2: Node 6: API mysql-5.1.31 ndb-6.4.3
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: DICT: c_counterMgr size: 25 free: 25
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: TRIX: c_theSubscriptionRecPool size: 100 free: 100
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_subscriberPool  size: 8004 free: 8001
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_tablePool  size: 4002 free: 3999
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_subscriptionPool  size: 4002 free: 3999
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_syncPool  size: 2 free: 2
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_dataBufferPool  size: 2057 free: 2057
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_subOpPool  size: 256 free: 256
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: DICT: c_counterMgr size: 25 free: 25
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_subscriberPool  size: 8004 free: 8001
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_tablePool  size: 4002 free: 3999
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_subscriptionPool  size: 4002 free: 3999
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_syncPool  size: 2 free: 2
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_dataBufferPool  size: 2057 free: 2057
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: Suma: c_subOpPool  size: 256 free: 256
2009-01-29 23:35:31 [MgmSrvr] INFO     -- Node 2: TRIX: c_theSubscriptionRecPool size: 100 free: 100
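
For anyone wanting to check a longer cluster log the same way, here is a minimal Python sketch that pulls the "size"/"free" pairs out of lines like the ones above. The function names and the 10% threshold are my own, purely illustrative; the regular expression is only matched against the line format shown in this report.

```python
import re

# Matches cluster-log lines such as:
#   ... Node 3: Suma: c_tablePool  size: 4002 free: 3999
POOL_LINE = re.compile(
    r"Node (?P<node>\d+): (?P<block>\w+): (?P<pool>\w+)\s+"
    r"size: (?P<size>\d+) free: (?P<free>\d+)"
)


def pool_usage(log_text):
    """Return {(node, block, pool): (size, free)} for every pool line found."""
    usage = {}
    for line in log_text.splitlines():
        m = POOL_LINE.search(line)
        if m:
            key = (int(m["node"]), m["block"], m["pool"])
            usage[key] = (int(m["size"]), int(m["free"]))
    return usage


def low_pools(usage, min_free_ratio=0.1):
    """Return the keys of pools whose free share is below min_free_ratio.

    The 10% default is an arbitrary, illustrative threshold.
    """
    return [
        key
        for key, (size, free) in usage.items()
        if size > 0 and free / size < min_free_ratio
    ]


sample = """\
2009-01-29 23:35:01 [MgmSrvr] INFO     -- Node 2: Node 6: API mysql-5.1.31 ndb-6.4.3
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: DICT: c_counterMgr size: 25 free: 25
2009-01-29 23:35:29 [MgmSrvr] INFO     -- Node 3: Suma: c_tablePool  size: 4002 free: 3999
"""

usage = pool_usage(sample)
print(low_pools(usage))
```

On the sample lines no pool is flagged, which matches what I saw by eye: the table-related pools are nowhere near exhausted, so error 707 is misleading here.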

So using the process of elimination I started commenting out ndbd options and retrying.

When I commented out "SharedGlobalMemory", i.e. when I allowed it to fall back to its default, the logfile group was created without issue.

Taking a closer look at the configuration line, I noticed that the "M" suffix was missing from the value.

i.e. 
SharedGlobalMemory=384

should have been

SharedGlobalMemory=384M

Yet none of the errors or problems I got back from the Cluster told me, or even led me toward the fact, that I was out of or low on shared global memory.
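
Until the server reports this properly, a small check on the config file can catch the missing suffix. The following is a rough Python sketch, not anything the Cluster ships: the function names and the 1M "suspiciously small" threshold are assumptions of mine, and it assumes a bare number is interpreted as bytes while K, M and G multiply by powers of 1024.

```python
import re

# Multipliers for the K/M/G suffixes; a bare number is taken as bytes.
SUFFIX = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Hypothetical threshold: anything under 1M is treated as a likely typo.
SUSPICIOUS_BELOW = 1024 ** 2


def parse_size(value):
    """Convert a value such as '384' or '384M' to a number of bytes."""
    m = re.fullmatch(r"\s*(\d+)\s*([KMG]?)\s*", value, re.IGNORECASE)
    if not m:
        raise ValueError(f"unrecognised size value: {value!r}")
    return int(m.group(1)) * SUFFIX[m.group(2).upper()]


def check_line(line, names=("SharedGlobalMemory",)):
    """Return a warning string for a suspiciously small setting, else None."""
    name, sep, value = line.partition("=")
    name = name.strip()
    if not sep or name not in names:
        return None
    size = parse_size(value)
    if size < SUSPICIOUS_BELOW:
        return (
            f"{name}={value.strip()} is only {size} bytes; "
            "is a K/M/G suffix missing?"
        )
    return None


print(check_line("SharedGlobalMemory=384"))
print(check_line("SharedGlobalMemory=384M"))
```

The first call returns a warning for the line I had in my config; the second returns None for the corrected line.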

How to repeat:
See above

Suggested fix:
This looks like incorrect error handling. We should also print some type of INFO message in the cluster log or ndb logs when we run low on SharedGlobalMemory resources.

In addition, consider setting a higher minimum for SharedGlobalMemory.