Bug #8010 4006 forces MySQL Node Restart
Submitted: 19 Jan 2005 15:55
Modified: 20 Jan 2005 11:34
Reporter: Hans Zaunere
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version: 4.1.9
OS: Linux (RHES 3)
Assigned to: Jonas Oreland
CPU Architecture: Any

[19 Jan 2005 15:55] Hans Zaunere
Description:

There are four NDB nodes on two Dual Xeon EMT_64 servers and three MySQL nodes on single Xeon EMT_64 servers.  MySQL node A receives data from an outside source and stores it in MyISAM and NDB tables.  Two MySQL nodes, B and C, replicate the MyISAM tables from A and perform SELECTs across the MyISAM and NDB tables.  The only writes occur on MySQL node A.  MySQL nodes B and C each sustain roughly 250 queries/second.

After running a test script for roughly 18 hours on MySQL nodes B and C, they report:

Can't lock file (errno: 4006)

If MySQL nodes B and C are restarted, the problem goes away, but it appears again after roughly 6 hours.

After changing MaxNoOfConcurrentTransactions from 4096 to 8192 and restarting the entire cluster, the test scripts run for roughly 32 hours, then again give the 4006 error.  The test scripts continue to run after a restart, but for a shorter period of time.
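For reference, the parameter change described above would look roughly like the fragment below in the management server's config.ini. This is a sketch, not the actual configuration file from this setup: the section name [NDBD DEFAULT] is an assumption about how the file is organized, and only the parameter name and the two values come from this report.

```ini
# Hypothetical excerpt of config.ini; only the parameter and its values are taken from this report.
[NDBD DEFAULT]
# Previously 4096; raised to 8192 before the 32-hour run.
MaxNoOfConcurrentTransactions=8192
```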

Separately, after about 12 hours, throughput decreases to 100-150 queries/second.

How to repeat:
Prolonged usage of the MySQL nodes will return 4006.  If MySQL node A is turned off (thus, no replication), the same 4006 errors appear after several hours.
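Until the actual test script is attached, the following is a minimal sketch of how the time to failure could be measured. It is not the script used for the runs above. The callable execute_query is hypothetical (any function that runs one SELECT against a MySQL node and raises an exception carrying the server's error text), and the error format is assumed to match the message quoted earlier.

```python
import re
import time

# Matches the error text quoted in this report, e.g. "Can't lock file (errno: 4006)".
ERRNO_PATTERN = re.compile(r"\(errno: (\d+)\)")


def extract_errno(message):
    """Return the numeric errno embedded in an error message, or None."""
    match = ERRNO_PATTERN.search(message)
    return int(match.group(1)) if match else None


def run_until_error(execute_query, target_errno=4006, max_queries=1_000_000):
    """Call execute_query() repeatedly until it fails with target_errno.

    execute_query is a hypothetical zero-argument callable supplied by the
    caller; it is assumed to raise an exception whose text contains the
    server error message. Returns (queries_completed, elapsed_seconds) when
    the target error is seen, or None if max_queries complete without it.
    """
    start = time.monotonic()
    for completed in range(max_queries):
        try:
            execute_query()
        except Exception as exc:
            if extract_errno(str(exc)) == target_errno:
                return completed, time.monotonic() - start
            raise  # any other failure is not the one under investigation
    return None
```

Run against MySQL node B or C, the returned pair gives the number of queries completed and the elapsed seconds before the first 4006, which makes runs with different MaxNoOfConcurrentTransactions values directly comparable.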

Data, a test script, and configuration files will be supplied.