Bug #22537 Restart from clean shutdown yields signal 11/error 6000 from cluster data nodes
Submitted: 21 Sep 2006 5:12
Modified: 6 Dec 2006 12:58
Reporter: Rick F
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 5.1.11 (generic shared libs rpm)
OS: Linux (Centos-4.4/Kernel 2.6.9)
Assigned to:
CPU Architecture: Any
Tags: 6000, cluster, ndbd, Signal 11, startphase 4

[21 Sep 2006 5:12] Rick F
Description:
The cluster is 4 data nodes on 4 machines, each with 4 GB of RAM (I have tried various amounts of RAM per node, from 2.5 GB down to 700 MB), plus a 5th machine running the manager. 2 node groups, 2 replicas. All machines are 3.4 GHz dual Xeons running the latest CentOS.

The cluster starts up fine, and I can create tables and import data without complaints. With small amounts of test data, I can perform a 'shutdown' on the management console and restart just fine. If I put in any significant amount (say, 100k worth of data across a dozen tables), the nodes always segfault on restart during start phase 4. Error reported on the management console:

ndb_mgm> Node 4: Forced node shutdown completed. Occured during startphase 4. Initiated by signal 11. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

Oh, and the error is temporary in the same way a hangman's noose is. ;)
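
For anyone collecting these shutdown lines from several nodes, here is a minimal Python sketch that pulls the node id, start phase, signal, and error code out of the message quoted above. The regular expression is inferred from this single line (including the "Occured" spelling the console prints) and is an assumption, not a documented log format; `parse_shutdown` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one shutdown line in this report; the field
# layout is an assumption, not a documented ndb_mgm output format.
SHUTDOWN_RE = re.compile(
    r"Node (?P<node>\d+): Forced node shutdown completed\. "
    r"Occured during startphase (?P<phase>\d+)\. "
    r"Initiated by signal (?P<signal>\d+)\. "
    r"Caused by error (?P<error>\d+): '(?P<text>.*)'\."
)


def parse_shutdown(line):
    """Return the shutdown fields as a dict, or None if the line does not match."""
    match = SHUTDOWN_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "node": int(fields["node"]),
        "phase": int(fields["phase"]),
        "signal": int(fields["signal"]),
        "error": int(fields["error"]),
        # The message itself labels the error as temporary or not.
        "temporary": "Temporary error" in fields["text"],
    }


line = (
    "ndb_mgm> Node 4: Forced node shutdown completed. "
    "Occured during startphase 4. Initiated by signal 11. "
    "Caused by error 6000: 'Error OS signal received(Internal error, "
    "programming error or missing error message, please report a bug). "
    "Temporary error, restart node'."
)
print(parse_shutdown(line))
```

Running this over the console output of all four nodes would show whether every node fails in the same start phase with the same signal.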

How to repeat:
1) Create new cluster
2) Insert significant amount of data
3) Do clean shutdown
4) Restart cluster
5) Become increasingly frustrated with each error
[21 Sep 2006 5:13] Rick F
Config.ini for cluster

Attachment: config.ini (application/octet-stream, text), 923 bytes.

[21 Sep 2006 5:16] Rick F
tar.gzip of node 1 logs

Attachment: node1.tgz (application/x-compressed, text), 30.59 KiB.

[21 Sep 2006 5:18] Rick F
tar.gzip of node 2 logs

Attachment: node2.tgz (application/x-compressed, text), 30.54 KiB.

[21 Sep 2006 5:18] Rick F
tar.gzip of node 3 logs

Attachment: node3.tgz (application/x-compressed, text), 33.37 KiB.

[21 Sep 2006 5:19] Rick F
tar.gzip of node 4 logs

Attachment: node4.tgz (application/x-compressed, text), 33.42 KiB.

[21 Sep 2006 5:20] Rick F
Reassigned to server: cluster
[21 Sep 2006 13:54] Serge Kozlov
The same error message (error 6000) appears if any of the variables below is set to its maximum allowed value, e.g.:
DataMemory: 1024G
MaxNoOfConcurrentOperations: 4294967039
RedoBuffer: 4294967039
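
A quick way to check whether a given config.ini hits this case is to compare its values against the limits listed above. The Python sketch below does that under stated assumptions: the limits are copied from this comment rather than from the reference manual, the K/M/G suffixes are treated as multiples of 1024, and `to_number` and `at_limit` are hypothetical helper names.

```python
# Upper limits as quoted in the comment above; assumed, not verified
# against the reference manual for 5.1.11.
LIMITS = {
    "DataMemory": 1024 * 1024**3,  # 1024G
    "MaxNoOfConcurrentOperations": 4294967039,
    "RedoBuffer": 4294967039,
}

# Size suffixes, assumed to be multiples of 1024.
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_number(value):
    """Convert a config value such as '1024G' or '4294967039' to an integer."""
    value = value.strip()
    if value and value[-1].upper() in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[value[-1].upper()]
    return int(value)


def at_limit(settings):
    """Return the sorted names of settings at or above their quoted limit."""
    return sorted(
        name
        for name, raw in settings.items()
        if name in LIMITS and to_number(raw) >= LIMITS[name]
    )


example = {
    "DataMemory": "1024G",
    "MaxNoOfConcurrentOperations": "4294967039",
    "RedoBuffer": "8M",
}
print(at_limit(example))
```

An empty result for the reporter's config.ini would suggest that the segfault on restart has a different cause than the maxed-out settings described here.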
[23 Sep 2006 15:43] Jonas Oreland
Hi,

Sorry about the "increasing frustration".

1) Did you download a binary release, or did you compile from source?
   If you compiled from source, which compiler did you use?

2) Could you test 5.1.12, which has several bug fixes...

/Jonas
[23 Sep 2006 15:45] Jonas Oreland
Hi again.

I just noticed that 5.1.12 has not been released yet...
I.e. it is only available as a snapshot...

/Jonas
[29 Sep 2006 20:13] Rick F
Perhaps you can tell me where 5.1.12 is available as a snapshot? I go to downloads and I don't see it.
[6 Nov 2006 12:58] Valeriy Kravchuk
5.1.12 is now available at http://dev.mysql.com/downloads/mysql/5.1.html#downloads. Please check it.
[7 Dec 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".