Bug #98779 | DBACC (Line: 3886) 0x00000002 Check opPtrP->m_key_or_scan_info.m_scanOpDeleteCou | ||
---|---|---|---|
Submitted: | 28 Feb 2020 15:07 | Modified: | 17 Mar 2020 17:30 |
Reporter: | Daniel Hope | Status: | Duplicate |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | ndb-7.6.9 | OS: | Ubuntu |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[28 Feb 2020 15:07]
Daniel Hope
[28 Feb 2020 15:12]
Daniel Hope
Just to be clear, this crashes the cluster, and a restart is required to get it back online. It's only a 2-node cluster; tonight I will add a third node to see if that helps prevent the error.
[9 Mar 2020 17:55]
MySQL Verification Team
Hi, I am not able to reproduce this. You cannot add "one more node"; you need to add "two more", as I assume you are running with NoOfReplicas=2. Any other number is "unsupported" (or "beta", or "do not use in production"). You can upload the full ndb_error_reporter log; we might extract some additional info from it, but so far this looks like an improperly sized cluster. Without being able to reproduce it, I can't say more. Kind regards, Bogdan
[9 Mar 2020 18:16]
Daniel Hope
Hey, so I didn't add a node in the end anyway. Thanks for clearing up that I can't! Could you just clarify "looks like improperly sized cluster" for me? I have attached the ndb_error_reporter output for completeness.
[9 Mar 2020 19:26]
MySQL Verification Team
Hi, data nodes run in node groups, so you always need N * NoOfReplicas data nodes. Assuming you use the default NoOfReplicas=2, you need an even number of data nodes, which means adding two nodes at a time. "Improperly sized" means that your setup (hardware and configuration) does not cover your load. Properly configured, the same hardware might be able to handle the load. This is something the MySQL Support team can help you with; properly configuring MySQL Cluster is not a simple task. On the other hand, it could be a bug, but I need a way to reproduce it. All the best, Bogdan
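The sizing rule in the comment above (data nodes come in multiples of the replica count) can be checked with a few lines of arithmetic. This is only a minimal sketch: the function names are hypothetical and are not part of any MySQL Cluster tool, and the default of 2 replicas is taken from the comment above.

```python
def node_group_count(data_nodes: int, no_of_replicas: int = 2) -> int:
    """Return how many complete node groups the data nodes form.

    Raises ValueError if the data nodes cannot be split into whole
    groups of `no_of_replicas` nodes each.
    """
    if data_nodes <= 0 or no_of_replicas <= 0:
        raise ValueError("node and replica counts must be positive")
    if data_nodes % no_of_replicas != 0:
        raise ValueError(
            f"{data_nodes} data nodes is not a multiple of "
            f"{no_of_replicas} replicas"
        )
    return data_nodes // no_of_replicas


def is_valid_layout(data_nodes: int, no_of_replicas: int = 2) -> bool:
    """True when the data nodes divide evenly into node groups."""
    try:
        node_group_count(data_nodes, no_of_replicas)
        return True
    except ValueError:
        return False
```

With two replicas, `is_valid_layout(2)` and `is_valid_layout(4)` hold, while `is_valid_layout(3)` does not, which is why a single extra node cannot be added to the reporter's 2-node cluster.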
[17 Mar 2020 17:22]
MySQL Verification Team
Documented fix as follows in the NDB 7.5.17, 7.6.13, and 8.0.19 changelogs:
A transaction which inserts a child row may run concurrently with a transaction which deletes the parent row for that child. One of the transactions should be aborted in this case, lest an orphaned child row result.
Before committing an insert on a child row, a read of the parent row is triggered to confirm that the parent exists. Similarly, before committing a delete on a parent row, a read or scan is performed to confirm that no child rows exist. When insert and delete transactions were run concurrently, their prepare and commit operations could interact in such a way that both transactions committed. This occurred because the triggered reads were performed using LM_CommittedRead locks (see NdbOperation::LockMode), which are not strong enough to prevent such error scenarios.
This problem is fixed by using the stronger LM_SimpleRead lock mode for both triggered reads. The use of LM_SimpleRead locks ensures that at least one transaction aborts in every possible scenario involving concurrent child-insertion and parent-deletion transactions.
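The race described in the changelog can be illustrated with a toy model. This is a sketch under stated assumptions, not NDB code: it does not use the NDB API, the names `run_race`, `insert_txn`, and `delete_txn` are hypothetical, and a lock conflict is treated as an immediate abort, whereas a real cluster would wait on the lock and abort on timeout or deadlock. The "committed" mode stands in for a read that ignores row locks; the "simple" mode stands in for a read that must take a shared lock on the row.

```python
def run_race(lock_mode: str):
    """Simulate one interleaving of child insert vs. parent delete.

    lock_mode: "committed" (triggered reads ignore row locks) or
               "simple" (triggered reads conflict with others' row locks).
    Returns (committed transaction names, orphaned child keys).
    """
    parents = {"p1"}   # committed parent rows
    children = {}      # committed child rows: child key -> parent key
    # Prepare phase: each pending write holds an exclusive row lock.
    locks = {("child", "c1"): "insert_txn", ("parent", "p1"): "delete_txn"}

    def conflicts(txn, table):
        # A committed read never blocks; a locking read conflicts with
        # any row in `table` that another transaction has locked.
        if lock_mode == "committed":
            return False
        return any(t == table and owner != txn
                   for (t, _key), owner in locks.items())

    def release(txn):
        for row in [r for r, owner in locks.items() if owner == txn]:
            del locks[row]

    # Triggered check for the insert: the parent row must exist.
    insert_ok = "p1" in parents and not conflicts("insert_txn", "parent")
    if not insert_ok:
        release("insert_txn")  # abort: drop the pending child insert

    # Triggered check for the delete: no child may reference the parent.
    delete_ok = ("p1" not in children.values()
                 and not conflicts("delete_txn", "child"))
    if not delete_ok:
        release("delete_txn")  # abort: drop the pending parent delete

    # Commit phase: apply the writes of every transaction that passed.
    committed = []
    if insert_ok:
        children["c1"] = "p1"
        committed.append("insert_txn")
    if delete_ok:
        parents.discard("p1")
        committed.append("delete_txn")

    orphans = [c for c, p in children.items() if p not in parents]
    return committed, orphans
```

In this model, `run_race("committed")` lets both transactions commit and leaves `c1` orphaned, while `run_race("simple")` aborts the insert and commits only the delete. Only one interleaving is covered here; the changelog's claim about every possible scenario is not demonstrated by this sketch.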
[17 Mar 2020 17:30]
Daniel Hope
So upgrading the cluster will prevent the error?
[17 Mar 2020 17:38]
MySQL Verification Team
Yes.