MySQL Bugs: #73342: MySql Cluster Issue when a node is completely lost

Bug #73342: MySql Cluster Issue when a node is completely lost
Submitted: 21 Jul 2014 13:40
Modified: 30 Dec 2015 15:07
Reporter: george sibley
Status: No Feedback
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S2 (Serious)
Version: 5.6
OS: Linux (12.04)
Assigned to: Assigned Account
CPU Architecture: Any

Description:
We intend to use MySQL Cluster to improve the resilience of our database solution.

During destructive testing, we encountered the following scenario, which concerns us greatly. We are trying to emulate a complete loss of connectivity between one MySQL Cluster instance and another:

1) We have the MySQL management service and one of the database nodes on one image.
2) We have the second node in the cluster on a separate image.
3) We're able to write to the primary node and see the data replicated to the secondary node.
4) We can bring down one database node, write data, and then see the data replicated to the secondary node when it's brought back up. All good.
5) We then bring down the secondary node together with the MySQL Cluster process (ndbd), to emulate loss of connectivity to the MySQL Cluster and the primary node.
6) We then write to the primary node.
7) We then bring the secondary node and its associated MySQL Cluster process (ndbd) back up.
8) We then insert data into the secondary node, ensuring the data has the same primary key ('pk') as the data inserted in step 6.
9) The data is replicated from the secondary to the primary, overwriting the data inserted in step 6.

So, from what I can see, we get 'silent' data integrity issues that would go unnoticed. Is there anything we can do to avoid this situation?

Regards

George

How to repeat:
The steps stated in the description can be used. We were able to reproduce the issue on several occasions.

Hi,

I will need a bit more data on your test setup in order to verify this. What I understand you did is not really possible, so let's clear things up a bit :)

Please provide me with the following data:

1. The version of the cluster you are running (5.6 is not a cluster version).

2. When your cluster is running, connect to ndb_mgm and execute SHOW; get me the result.

3. Are you configuring your cluster using MCM, or are you configuring it manually? In either case, please provide me with your config both for the cluster (usually called config.ini) and for mysqld (usually my.cnf).

4. Since you mention "images", I assume you are running the nodes inside some VM. What VM are you using? VirtualBox, VMware, QEMU (Proxmox?)...

5. If I understand correctly, you are running two VMs. VM1 is running
 - mgmd
 - ndbd
and VM2 is running
 - mysqld
 - ndbd

Please confirm this.

6. Your steps 5-6-7 are unclear. Your answers to 1-5 may clear this up, but in any case please elaborate on these steps a bit.

7. "ensuring the data has the same 'pk' as the data inserted in 6": can you please
 - get us the SHOW CREATE TABLE\G output for the table in question
 - get us the ndb_desc output for the table in question

8. Can you provide the logs for the cluster?
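
The diagnostics requested in items 2 and 7 could be collected roughly as follows. This is only a sketch: the host name, database name, and table name are hypothetical placeholders, and it assumes the ndb_mgm, mysql, and ndb_desc client programs are on the PATH and that mysql can connect with its default credentials. By default it only prints the commands rather than running them.

```python
import shlex
import shutil
import subprocess


def diagnostic_commands(mgm_host, database, table):
    """Return argument lists for the diagnostics requested in items 2 and 7.

    mgm_host, database and table are placeholders supplied by the caller.
    """
    return [
        # Item 2: cluster membership as reported by the management server.
        ["ndb_mgm", "-c", mgm_host, "-e", "SHOW"],
        # Item 7: table definition as seen by the SQL node.
        ["mysql", "-e", f"SHOW CREATE TABLE `{database}`.`{table}`\\G"],
        # Item 7: NDB-level description of the same table.
        ["ndb_desc", "-c", mgm_host, "-d", database, table],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", shlex.join(argv))
        if dry_run:
            continue
        if shutil.which(argv[0]) is None:
            print(f"  ({argv[0]} not found on PATH, skipped)")
            continue
        result = subprocess.run(argv, capture_output=True, text=True)
        print(result.stdout or result.stderr)


if __name__ == "__main__":
    # Hypothetical host, schema and table names; replace with real ones.
    run_all(diagnostic_commands("mgm-host.example", "test", "t1"))
```

Calling run_all(..., dry_run=False) on a machine that can reach the management server would execute the three commands and print their output, which could then be attached to the bug report.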

Thanks
Bogdan Kecman

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".