| Bug #65938 | ndbdmtd shutdown 1 min after started: Illegal signal ... (GSN 32 not added) | ||
|---|---|---|---|
| Submitted: | 18 Jul 2012 13:56 | Modified: | 17 Feb 2013 17:24 |
| Reporter: | Jay Ward | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S2 (Serious) |
| Version: | 5.5.22-ndb-7.2.6-gpl-log | OS: | Linux (CentOS release 6.3 (Final) 2.6.32-279.1.1.el6.x86_64) |
| Assigned to: | Assigned Account | CPU Architecture: | Any |
| Tags: | assertion failed, error 2301, GSN 32 not added, GSN_SCAN_TABREQ, interrupts, ndbdmtd, numa, send lock contentions, sendbufferpool lock contentions | ||
[18 Jul 2012 14:23]
Jay Ward
Profile of hardware used.
Attachment: hardwareprofile.tar.gz (application/gzip, text), 6.62 KiB.
[18 Jul 2012 14:24]
Jay Ward
Uploaded ndb_error_reporter output to ftp.oracle.com/support/incoming/bug-data-65938.tar.bz2
[22 Jul 2012 13:06]
Jay Ward
This ended up being the result of EL6's localhost entry in the /etc/hosts file. The entry "127.0.0.1 localhost" worked, while "127.0.0.1 localhost.localdomain localhost" did not. NDB should be able to handle either.
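The difference between the two /etc/hosts entries above can be made concrete with a small sketch. In the hosts file format, the first name after the address is the canonical name and the remaining names are aliases, so the two entries give 127.0.0.1 different canonical names. The helper `loopback_names` below is hypothetical and written only for illustration; whether this difference is what actually triggered the NDB shutdown is not confirmed in this report.

```python
def loopback_names(hosts_text, address="127.0.0.1"):
    """Return (canonical_name, aliases) for the first hosts entry matching address.

    hosts_text is the content of an /etc/hosts-style file. This is an
    illustrative helper, not part of MySQL Cluster or any NDB tool.
    """
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        if fields[0] == address and len(fields) > 1:
            # First name is the canonical name; the rest are aliases.
            return fields[1], fields[2:]
    return None, []


working = "127.0.0.1 localhost"
failing = "127.0.0.1 localhost.localdomain localhost"

print(loopback_names(working))
print(loopback_names(failing))
```

To check a real host, pass the content of /etc/hosts as `hosts_text` and compare the result with the hostnames used in config.ini.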
[17 Jan 2013 17:19]
Shahryar Ghazi
Hi Jay, For some reason I am unable to access the file you uploaded to the FTP server. Please upload it again and also include any configuration files (e.g. config.ini, my.cnf) and OS network info (e.g. hosts files). The potential issue appears to be related to network configuration, so I am assuming that the hardware configuration (mentioned in step 3 of "How to repeat" above) should not matter in this case. Please correct me if I am wrong. Also, please explain step 2 of "How to repeat" in detail. Thanks.
[18 Feb 2013 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".

Description:
We were moving our NDB nodes to bigger hardware and taking them out of the virtualization software in which they had been running. After making changes to config.ini and restarting both management nodes, I started ndbmtd on the data node we were moving first, using:

    [root@ndb2 mysql]# /usr/local/mysql/bin/ndbmtd --ndb-nodeid=4 -c MGMNode1:1186,MGMNode2:1186 --initial

The node started successfully, and then shortly thereafter shut itself down:

    674871/0 (674870/4294967295) switchover complete bucket 1 state: 1starting
    2012-07-17 16:40:29 [ndbd] INFO -- Start phase 101 completed
    2012-07-17 16:40:29 [ndbd] INFO -- Node started
    send lock node 5 waiting for lock, contentions: 200 spins: 367555
    send lock node 5 waiting for lock, contentions: 400 spins: 611990
    send lock node 5 waiting for lock, contentions: 600 spins: 867841
    ... More lines like unto these ...
    jbalock thr: 0 waiting for lock, contentions: 7800 spins: 6899476
    ... More lines like unto these ...
    send lock node 15 waiting for lock, contentions: 2800 spins: 5477963
    send lock node 15 waiting for lock, contentions: 3000 spins: 5744907
    send lock node 15 waiting for lock, contentions: 3200 spins: 5997054
    2012-07-17 16:40:52 [ndbd] INFO -- Illegal signal received (GSN 32 not added)
    2012-07-17 16:40:52 [ndbd] INFO -- Illegal signal received (GSN 32 not added)
    2012-07-17 16:40:52 [ndbd] INFO -- Error handler shutting down system
    2012-07-17 16:40:52 [ndbd] INFO -- Error handler shutdown completed - exiting
    2012-07-17 16:40:54 [ndbd] ALERT -- Node 4: Forced node shutdown completed. Caused by error 2301: 'Assertion(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

I expected it to do one of the following:
1. Not get GSN signal 32, if inapplicable in that context, OR
2. Handle GSN signal 32 properly and continue running, OR
3. Give real information as to the real problem so it can be fixed.

How to repeat:
1. Create a cluster with two management nodes, two NDB data nodes, and two or more MySQL nodes.
2. Take down one of the data nodes and use its IP address (or hostname) on a new data node.
3. Create the new data node with hardware like the profile I will upload when uploading the ndb_error_reporter output.
4. Copy the untarred Linux Generic directory to /usr/local/ and symlink it to /usr/local/mysql (since the configurations already point to that directory).
5. Start ndbmtd with --initial.
6. Wait for the node to enter started status.
7. Wait for the node to shut down.

Suggested fix:
I am not sure what the fix should be.
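
When comparing several runs of the steps above, it may help to condense the data node log to the highest lock contention count seen per lock and the GSN numbers reported as illegal. The following is a minimal sketch that assumes the log line formats shown in the description; the function name `summarize` and the regular expressions are illustrative and are not part of any NDB tooling.

```python
import re

# Patterns assumed from the log excerpt in this report.
LOCK_RE = re.compile(
    r"(send lock node \d+|jbalock thr: \d+) waiting for lock, "
    r"contentions: (\d+) spins: (\d+)"
)
ILLEGAL_RE = re.compile(r"Illegal signal received \(GSN (\d+) not added\)")


def summarize(log_text):
    """Return (peak contentions per lock, sorted list of illegal GSN numbers)."""
    peaks = {}
    for who, contentions, _spins in LOCK_RE.findall(log_text):
        peaks[who] = max(peaks.get(who, 0), int(contentions))
    gsns = sorted({int(g) for g in ILLEGAL_RE.findall(log_text)})
    return peaks, gsns


SAMPLE = """\
send lock node 5 waiting for lock, contentions: 200 spins: 367555
send lock node 5 waiting for lock, contentions: 600 spins: 867841
jbalock thr: 0 waiting for lock, contentions: 7800 spins: 6899476
send lock node 15 waiting for lock, contentions: 3200 spins: 5997054
2012-07-17 16:40:52 [ndbd] INFO -- Illegal signal received (GSN 32 not added)
2012-07-17 16:40:52 [ndbd] INFO -- Illegal signal received (GSN 32 not added)
"""

peaks, gsns = summarize(SAMPLE)
print(peaks)
print(gsns)
```

The summary only describes what the log already contains; it does not explain why GSN 32 was rejected.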