Bug #93422 [Cluster 5.6.24-ndb-7.4.6-cluster-gpl] Got temporary error 20016 'Query aborted
Submitted: 30 Nov 2018 10:29
Modified: 2 Dec 2018 14:56
Reporter: R van der Wal
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.6.24-ndb-7.4.6
OS: Microsoft Windows (Server 2016)
Assigned to: Bogdan Kecman
CPU Architecture: x86

[30 Nov 2018 10:29] R van der Wal
Description:
When trying to join tables, either with an explicit JOIN or with a condition of the form "WHERE table_x.column = table_y.column", the cluster fails with the error message:
* Got temporary error 20016 'Query aborted due to node failure' from NDBCLUSTER *

The strange thing is that this error appeared suddenly: the query had been running fine for some months, but now it fails every time.

How to repeat:
Any join on the specific tables results in this error.
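
The report does not include the actual schema or query. As a rough sketch only, the two query shapes mentioned in the description might look like the following, where `table_x`, `table_y`, and all column names are hypothetical placeholders and not taken from the affected system:

```sql
-- Hypothetical schema: the report does not name the tables or columns.
CREATE TABLE table_x (
  id     INT NOT NULL PRIMARY KEY,
  ref_id INT NOT NULL
) ENGINE=NDBCLUSTER;

CREATE TABLE table_y (
  id      INT NOT NULL PRIMARY KEY,
  payload VARCHAR(64)
) ENGINE=NDBCLUSTER;

-- Explicit join form.
SELECT x.id, y.payload
FROM table_x AS x
JOIN table_y AS y ON x.ref_id = y.id;

-- Equivalent implicit join expressed in the WHERE clause.
SELECT x.id, y.payload
FROM table_x AS x, table_y AS y
WHERE x.ref_id = y.id;
```

On a healthy cluster both statements should return the same rows; in the situation reported here, either form would presumably fail with the temporary error 20016.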
[2 Dec 2018 14:56] R van der Wal
UPDATE: 
We discovered that the Windows 2012/2016 servers that host our MySQL Cluster were part of a VMware cluster and that VMware was taking daily snapshots of our servers.
These snapshots freeze the hard drive for a short period, but long enough to disrupt our data nodes at the MySQL level. Over time the differences between the data nodes became severe enough to bring the cluster to a grinding halt.

We shut down the whole cluster and restarted it. This fixed the broken queries instantly.
The VMware server team disabled snapshotting for the respective drives the same day. So we think this error was not MySQL related, and we will keep a close eye on the cluster over the coming week.

Apologies for polluting the bug database!