Bug #62607 Cluster crashed on UPDATE statement
Submitted: 3 Oct 2011 16:12
Modified: 16 Sep 2016 14:14
Reporter: Mykola Ivanko
Status: Can't repeat
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S1 (Critical)
Version: 7.1.15a
OS: Linux (CentOS 5.6)
Assigned to: Bogdan Kecman
CPU Architecture: Any
Tags: cluster, crash, error 2341, failed ndbrequire, UPDATE

[3 Oct 2011 16:12] Mykola Ivanko
Description:
We are running stress tests on a future production cluster consisting of 4 data nodes, 2 management nodes, and 2 SQL nodes. The DB currently contains around 1,000,000 records (CDRs from a SIP gateway). The cluster works fine for inserts and selects, but when we tried an UPDATE query, it gave:

ndb_mgm> Node 3: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
Node 5: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.

After that, we ran the same query, and got:

ndb_mgm> Node 6: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
Node 4: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.

After the crash, all data nodes were restarted successfully, but when we ran the "killer query" again, we got the same crash.
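
For triage, the node ID and error code can be pulled out of shutdown messages like the ones quoted above. The following is only a minimal sketch, assuming the messages keep the exact wording shown in this report; the function name `parse_shutdowns` is hypothetical and is not part of any MySQL Cluster tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of MySQL Cluster): print "<node-id> <error-code>"
# for each forced-shutdown message read from standard input.
# Assumes the message wording matches the ndb_mgm output quoted in this report.
parse_shutdowns() {
    sed -n -E 's/.*Node ([0-9]+): Forced node shutdown completed\. Caused by error ([0-9]+):.*/\1 \2/p'
}

# Example input: two of the messages from this report, shortened.
parse_shutdowns <<'EOF'
ndb_mgm> Node 6: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)'.
Node 4: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes'.
EOF
```

For the sample input this prints `6 2341` and `4 2305`, one pair per line; lines that are not forced-shutdown messages are ignored.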

How to repeat:
Please find the ndb_error_reporter logs and the DB schema file attached. If you need a full DB dump with data, we can send it.
Management nodes are started as:
ndb_mgmd --initial --ndb-nodeid=1 --ndb-connectstring=172.16.0.30:1186 --config-dir=/etc/mysql --config-file=/etc/mysql/config.ini
ndb_mgmd --initial --ndb-nodeid=2 --ndb-connectstring=172.16.0.31:1186 --config-dir=/etc/mysql --config-file=/etc/mysql/config.ini

Data nodes are started as:
ndbmtd --ndb-nodeid=3 --ndb-connectstring=172.16.0.30:1186,172.16.0.31:1186
ndbmtd --ndb-nodeid=4 --ndb-connectstring=172.16.0.30:1186,172.16.0.31:1186
ndbmtd --ndb-nodeid=5 --ndb-connectstring=172.16.0.30:1186,172.16.0.31:1186
ndbmtd --ndb-nodeid=6 --ndb-connectstring=172.16.0.30:1186,172.16.0.31:1186

SQL nodes are started as:
mysqld --defaults-file=/etc/mysql/my.cnf

"Killer query" is:
update cdr_201109 set processed=1 where processed=0 limit 100000;

Suggested fix:
none

[3 Oct 2011 16:49] Mykola Ivanko
NDB node 3 logs

Attachment: ndb_3.tar.bz2 (application/x-bzip, text), 113.61 KiB.

[3 Oct 2011 16:51] Mykola Ivanko
NDB node 5 logs

Attachment: ndb_5.tar.bz2 (application/x-bzip, text), 98.27 KiB.

[11 May 2012 21:13] Sveta Smirnova
Thank you for the report.

We updated our instructions:

If the data you need to attach is more than 500KB, you should create a compressed archive of the data, together with a README file that describes the data, using a filename that includes the bug number (example: bug-data-62607.zip), and use FTP (log in with the userid anonymous and your email address) to upload the archive to ftp://ftp.oracle.com/support/incoming/. Once you have uploaded the file, add a comment to this bug to notify us about it. Note: This directory is unlistable, which means that once you have uploaded your file, you will not be able to see it. By default, all files will be deleted after 21 days, with 2 advance email warnings.

Please upload files.

[16 Sep 2016 14:14] Bogdan Kecman
Reproduced (fairly easily) on 7.1.15a.
Cannot reproduce on the latest 7.1/7.2/7.4.