Bug #88946 Replication stops when replicating DDL during an NDB node restart
Submitted: 16 Dec 2017 10:44 Modified: 19 Dec 2017 17:33
Reporter: Daniël van Eeden (OCA)
Status: Verified
Category: MySQL Server: Replication Severity: S4 (Feature request)
Version: 7.6.3 OS: Any
Assigned to: CPU Architecture: Any
Tags: ndb

[16 Dec 2017 10:44] Daniël van Eeden
Description:
When replicating a schema change while an NDB data node restart is in progress this happens:

Last_SQL_Error: Slave SQL thread retried transaction 10 time(s) in vain, giving up. Consider raising the value of the slave_transaction_retries variable.

This is expected, as the cluster doesn't allow schema changes during an NDB node restart. An NDB node restart can take 90 minutes.
Blocking replication for that long is less than optimal (understatement).
Not continuing after that is an issue.

Yes, we can set slave_transaction_retries=<nr_of_sand_grains_in_sahara>, but in other cases we don't want to retry forever.

How to repeat:
Replicate to an NDB cluster.
Restart an NDB node or do another operation that grabs the global schema lock for a long time.

Now execute some DDL on the master. That should block replication, and after 10 tries replication stops.

Suggested fix:
Change this setting to not stop replication in this case.

Maybe:
slave_transaction_retries_on_node_restart='INFINITE'

Or:
ndb_retry_replication_after_node_online=ON
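
The behaviour suggested above could look roughly like the sketch below. It is written in Python purely for illustration and is not MySQL code: `apply_with_retries`, `TemporaryError`, `node_restart_in_progress` and `retry_during_node_restart` are hypothetical names, and the exact retry accounting is an assumption. The idea is that temporary failures seen while a node restart is in progress are not charged against the retry limit, while all other temporary failures still are.

```python
from typing import Callable


class TemporaryError(Exception):
    """Hypothetical stand-in for a temporary error the applier may retry."""


def apply_with_retries(
    apply: Callable[[], None],
    node_restart_in_progress: Callable[[], bool],
    max_retries: int = 10,
    retry_during_node_restart: bool = True,
) -> int:
    """Call apply() until it succeeds and return the number of failed attempts.

    Failures that happen while a node restart is in progress are not counted
    against max_retries when retry_during_node_restart is set (assumption:
    this mirrors the proposed option, not any existing server behaviour).
    """
    counted = 0   # failures charged against max_retries
    failures = 0  # all failures, for reporting only
    while True:
        try:
            apply()
            return failures
        except TemporaryError:
            failures += 1
            if retry_during_node_restart and node_restart_in_progress():
                # Wait out the restart; a real applier would sleep here.
                continue
            counted += 1
            if counted > max_retries:
                # Retry budget exhausted: give up, as replication does today.
                raise
```

With `retry_during_node_restart=False` the sketch gives up once the retry budget is spent, which corresponds to the stop described in this report. With the flag enabled, a statement blocked for the whole restart is applied once the restart completes, and unrelated temporary errors are still bounded by `max_retries`.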
[18 Dec 2017 13:34] MySQL Verification Team
Hi,

This is not a bug; it behaves as intended. If you agree, I can change it to a feature request.

As for "how" to change this, I doubt the system can differentiate between 

slave_transaction_retries
and
slave_transaction_retries_on_node_restart

as replication is handled by the SQL node, which is not aware of "nodes".

The "ndb_retry_replication_after_node_online" option might be a feasible solution, but I believe the change required to get it working is too great for it to enter any of the current GA branches.

The only viable solution I see is to have something done for 8.0.

All best
Bogdan
[19 Dec 2017 12:19] Daniël van Eeden
I agree that this is not really a bug, but a feature request.
Fixing this in 8.0 sounds fine to me.
And indeed finding the correct way of fixing this might be difficult.
[19 Dec 2017 17:37] MySQL Verification Team
Hi,

Verified as "Feature request"

all best
Bogdan