MySQL Bugs: #105074: Take over error: DELETE immediately followed by INSERT

Bug #105074	Take over error: DELETE immediately followed by INSERT
Submitted:	29 Sep 2021 11:18	Modified:	30 Sep 2021 11:47
Reporter:	Mikael Ronström
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	8.0.23++	OS:	Any
Assigned to:	MySQL Verification Team	CPU Architecture:	Any

Description:
The following sequence occurs:
1) A node is restarting.
2) An NDB API client deletes a row, and a different transaction immediately INSERTs the same row.

When 2) happens, the DELETE is COMMITted in the starting node but not yet
COMPLETEd. Since the starting node is a backup replica, the row is
therefore still locked when the INSERT arrives.

The code in handle_nr_copy assumes that any INSERT will not see any locked
rows. There is no handling of a real-time break in this situation.
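
The race described above can be illustrated with a toy model. This is only a
sketch of the scenario as reported, not NDB code: the class BackupReplicaRow,
the Phase enum and the run() helper are hypothetical names, and the only
behaviour modelled is that a backup replica holds the row lock until COMPLETE.

```python
from enum import Enum, auto


class Phase(Enum):
    PREPARED = auto()
    COMMITTED = auto()
    COMPLETED = auto()


class BackupReplicaRow:
    """Toy model of one row on a backup replica (hypothetical, not NDB's structures)."""

    def __init__(self):
        self.exists = True
        self.phase = None  # phase of the pending DELETE, None if there is none

    @property
    def locked(self):
        # As described in the report: the lock is held until COMPLETE.
        return self.phase in (Phase.PREPARED, Phase.COMMITTED)

    def delete_prepare(self):
        self.phase = Phase.PREPARED

    def delete_commit(self):
        self.exists = False
        self.phase = Phase.COMMITTED

    def delete_complete(self):
        self.phase = Phase.COMPLETED

    def insert(self):
        # Stands in for the assumption that an INSERT never sees a locked row.
        if self.locked:
            raise RuntimeError("INSERT met a locked row")
        self.exists = True


def run(complete_before_insert):
    """Replay DELETE then INSERT; return 'inserted' or 'locked'."""
    row = BackupReplicaRow()
    row.delete_prepare()
    row.delete_commit()
    if complete_before_insert:
        row.delete_complete()
    try:
        row.insert()
        return "inserted"
    except RuntimeError:
        return "locked"


if __name__ == "__main__":
    print(run(complete_before_insert=True))
    print(run(complete_before_insert=False))
```

In this model the INSERT only fails when it arrives between COMMIT and
COMPLETE of the DELETE, which is the window the description refers to.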

How to repeat:
Run the testcase testNodeRestart -n Bug16895311 T1
on a machine with many CPUs and using 3 replicas.

Suggested fix:
Ensure that starting nodes unlock the row already in the COMMIT phase.
This should ensure that any INSERT from a normal transaction that arrives
at the starting node during the Copy phase cannot meet a locked row.
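
As a rough illustration of the suggestion, the sketch below compares the two
unlock policies. The function insert_can_proceed and its unlock_on_commit
parameter are hypothetical names used for this example only; they do not
correspond to anything in the NDB source.

```python
def insert_can_proceed(delete_phase, unlock_on_commit):
    """Return True if an INSERT arriving now would find the row unlocked.

    delete_phase: "PREPARED", "COMMITTED" or "COMPLETED" (phase of the DELETE).
    unlock_on_commit: True models the suggested behaviour for starting nodes,
    False models holding the lock until COMPLETE.
    """
    if delete_phase == "COMPLETED":
        return True
    if delete_phase == "COMMITTED":
        return unlock_on_commit
    return False  # PREPARED: the lock is held under either policy


if __name__ == "__main__":
    for phase in ("PREPARED", "COMMITTED", "COMPLETED"):
        for policy in (False, True):
            print(phase, policy, insert_can_proceed(phase, policy))
```

The two policies differ only in the COMMITTED state, which is the state the
suggestion targets.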

After some deeper investigation, it seems that the bug is caused by
a wrong node order in normal transactions, so I will investigate further.
Ignore this bug report for now.

Thanks Mikael, let us know if you discover something

all best
Bogdan

Not a bug in NDB