Bug #106463 Running DDL queries in a Multi-Primary Cluster
Submitted: 15 Feb 2022 10:55 Modified: 8 Aug 2022 9:39
Reporter: IGG t Email Updates:
Status: Closed Impact on me:
None 
Category:Shell AdminAPI InnoDB Cluster / ReplicaSet Severity:S4 (Feature request)
Version:8.0.28 OS:Any
Assigned to: CPU Architecture:Any

[15 Feb 2022 10:55] IGG t
Description:
Whilst testing an Innodb Cluster to see if it could be used in a production environment, I came across a problem with the multi-primary tests, whereby someone running a DDL query (e.g. TRUNCATE TABLE) on one node at the same time as someone running a DML query (e.g. INSERT) on the same table on a different node had the potential to take down the enire cluster in quite spectacular fashion. 

After some time, and poring through log files to try and figure this out, I came across the page on known limitations in the Group Replication chapter of the documentation which clearly states why this is a problem.

I would like to make some suggestions please.

1) add a similar page to the chapter on innodb cluster, even if it just refers people to the page in the group replication chapter.
2) add a more meaningful entry to the logfiles as to why the cluster has just failed in spectacular fashion
3) add a cluster specific option outright banning DDL queries in multi-primary mode unless an admin specifically disables it.

How to repeat:
see above

Suggested fix:
see above
[15 Feb 2022 11:16] MySQL Verification Team
Hello!

Thank you for the report and feedback.

Thanks,
Umesh
[17 Feb 2022 18:06] Miguel Araujo
Posted by developer:
 
Hi,

Yes, this is indeed a limitation for multi-primary Clusters. It's described in https://dev.mysql.com/doc/refman/8.0/en/group-replication-limitations.html :

```
Concurrent DDL versus DML Operations.  Concurrent data definition statements and data manipulation statements executing against the same object but on different servers is not supported when using multi-primary mode. During execution of Data Definition Language (DDL) statements on an object, executing concurrent Data Manipulation Language (DML) on the same object but on a different server instance has the risk of conflicting DDL executing on different instances not being detected.
```

Unfortunately, such a scenario may result in deadlocks and compromises the Cluster completely. Connections are blocked and, for that reason, it's not possible to implement suggestion 2): "add a more meaningful entry to the logfiles as to why the cluster has just failed in spectacular fashion"

Regarding suggestion 3): "add a cluster specific option outright banning DDL queries in multi-primary mode unless an admin specifically disables it.", this is not possible to do and, in other words, this is the same as running the cluster in single-primary mode.

That said, this must simply be documented in the InnoDB Cluster documentation as described in suggestion 1) "add a similar page to the chapter on innodb cluster, even if it just
refers people to the page in the group replication chapter.". Specifically, the information should be added to https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-innodb-cluster-limitations.html
[8 Aug 2022 9:39] Edward Gilmore
Adding bullet point to MySQL InnoDB Cluster chapter. This bullet points
links to 18.3.2 Group Replication Limitations to meet the resolution of 1)
add a similar page to the chapter on innodb cluster, even if it just
refers people to the page in the group replication chapter.