Bug #70897 | NDB Datanode crash after running query. | ||
---|---|---|---|
Submitted: | 13 Nov 2013 16:17 | Modified: | 13 Nov 2014 14:35 |
Reporter: | Martin van Wilderen | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.6.11 ndb-7.3.2, 7.3.3 | OS: | Linux (CentOS release 6.4 (Final) / 2.6.32-358.6.2.el6.x86_64) |
Assigned to: | | CPU Architecture: | Any |
Tags: | ndbd |
[13 Nov 2013 16:17]
Martin van Wilderen
[19 Nov 2013 10:47]
MySQL Verification Team
Hello Martin,

Thank you for the report. I cannot repeat the described behavior with dummy data. Could you please provide data that reproduces it? Please mark the upload as private.

Could you also attach the cluster logs, preferably collected with the ndb_error_reporter utility: http://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-programs-ndb-error-reporter.html

Thanks,
Umesh
[19 Nov 2013 11:03]
Martin van Wilderen
Hi Umesh,

> Could you please provide repeatable data?

Would you like a full dump of the database? It is a 5.7 GB SQL dump.

> Also, Could you please attach the cluster logs? Preferably using the ndb_error_reporter utility

Done.
[29 Nov 2013 20:00]
MySQL Verification Team
Hello Martin,

Thank you for the test case. So far I have hit this issue only 1 time in 20 on 7.3.3, and only when simulating load with mysqlslap; just running the query on its own never triggered a data node crash.

// How to repeat
- Set up a cluster (1 management node, 2 data nodes, 1 API node) and import the provided data.
- Simulate load with mysqlslap:

    bin/mysqlslap --no-defaults --create-schema=test --user=root --delimiter=";" --query=/tmp/query.sql --concurrency=50 --iterations=200

// Query used
[root@cluster-repo mysql-cluster-gpl-7.3.3]# more /tmp/query.sql

    SELECT l.id, l.CreationDate AS Datum, re.EAN, re.HubID, re.PortalID, re.PortalUserID,
           kr.NAME AS Naam, c.CreationDate AS Claimdatum, c.role AS Claimrol, c.EndDate AS Einddatum,
           l.Statuscode, l.Errormessage, "" AS Acties, kr.UserHash, l.AttributenOK, l.Doelgroep,
           l.Licentiegeldig, l.ClientIP, c.ID AS ClaimID, l.Request_ID,
           COALESCE(l.KeyringLookupTime, l.VerifyClaimTime, l.CreateClaimTime) AS hasSubgrid
    FROM loguserflow l
    LEFT JOIN request re ON (l.Request_ID = re.id)
    LEFT JOIN `key` k ON (k.HubID = re.HubID AND k.PortalID = re.PortalID AND k.PortalUserID = re.PortalUserID)
    LEFT JOIN keyring kr ON (kr.id = k.Keyring_ID)
    LEFT JOIN claim c ON (c.Request_ID = re.ID)
    WHERE 1 = 1
      and re.HubID = "Kennisnet"
      and l.CreationDate >= "2013-10-01 00:30"
      and l.CreationDate <= "2013-11-11 00:45"
      and kr.name like '%Bargboer%'
    order by l.id DESC
    limit 0, 100;

Thanks,
Umesh
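For anyone who wants to repeat the load run with different concurrency or iteration counts, the mysqlslap invocation above can be assembled programmatically. This is only a minimal sketch: the function name `build_mysqlslap_command` and its parameters are made up for illustration, and the only flags it emits are the ones shown in the comment above.

```python
import shlex


def build_mysqlslap_command(query_file="/tmp/query.sql", schema="test", user="root",
                            concurrency=50, iterations=200, binary="bin/mysqlslap"):
    """Return the argv list for the mysqlslap run described in the report.

    Hypothetical helper: it only reuses the flags quoted in the comment above
    and does not start any process itself.
    """
    if concurrency < 1 or iterations < 1:
        raise ValueError("concurrency and iterations must be positive")
    return [
        binary,
        "--no-defaults",
        f"--create-schema={schema}",
        f"--user={user}",
        "--delimiter=;",
        f"--query={query_file}",
        f"--concurrency={concurrency}",
        f"--iterations={iterations}",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; pass the list to subprocess.run()
    # instead if the command should actually be executed.
    print(shlex.join(build_mysqlslap_command()))
```

Since the crash was seen only intermittently, wrapping the returned list in a loop around `subprocess.run` and checking the data node state after each pass is one way to repeat the attempt until it triggers.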
[20 Feb 2014 9:07]
Martin van Wilderen
Is there any update on a fix for this bug?
[13 Nov 2014 14:01]
Mikael Ronström
It crashes due to an error in the function xfrm_key, so the information received in the signal probably has some problem in it. The signal comes from DBSPJ, which makes it very likely that something happened to the signal data while in transit. Quite likely this is an error in the SPJ part.
[13 Nov 2014 14:34]
Ole John Aske
Posted by developer:

This seems to be a variant of bug#17845161, "ERROR 1296: GOT ERROR 290 'CORRUPT KEY IN TC, UNABLE TO XFRM'; NDB_JOIN_PUSHDOWN".

This issue has been fixed in 7.2.15, 7.3.4 and 7.4.x.
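To check whether a given installation already includes the fix, the release numbers above can be turned into a small comparison. This is a rough sketch, not an official tool: the names `parse_ndb_version` and `has_fix` are invented here, and it assumes version strings of the form used in this report (for example `ndb-7.3.2` or a bare `7.3.3`). For release series the comment does not mention, it returns `None` instead of guessing.

```python
import re

# First release in each series that contains the fix, as stated in the comment above.
FIRST_FIXED = {(7, 2): (7, 2, 15), (7, 3): (7, 3, 4)}
# Series reported as fixed in every release ("7.4.x").
ALL_FIXED_SERIES = {(7, 4)}


def parse_ndb_version(text):
    """Extract (major, minor, patch) from 'ndb-7.3.2' style text or a bare '7.3.2'."""
    match = (re.search(r"ndb-(\d+)\.(\d+)\.(\d+)", text)
             or re.fullmatch(r"\s*(\d+)\.(\d+)\.(\d+)\s*", text))
    if match is None:
        raise ValueError(f"no NDB version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version_text):
    """Return True/False for series named in the comment, None for any other series."""
    version = parse_ndb_version(version_text)
    series = version[:2]
    if series in ALL_FIXED_SERIES:
        return True
    if series in FIRST_FIXED:
        return version >= FIRST_FIXED[series]
    return None  # the thread gives no information about this series


if __name__ == "__main__":
    for candidate in ("mysql-5.6.11 ndb-7.3.2", "7.3.3", "7.3.4"):
        print(candidate, has_fix(candidate))
```

For the versions named in this report (7.3.2 and 7.3.3) the check returns `False`, which matches the advice to upgrade to 7.3.4 or later within that series.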