Bug #79564 MySQL Group Replication stop replicate on conflict operations
Submitted: 9 Dec 2015 6:59
Modified: 16 Sep 2016 11:47
Reporter: Ivan Tu
Status: Not a Bug
Category: MySQL Server: Group Replication
Severity: S3 (Non-critical)
Version: 0.6, 0.8
OS: Any
Assigned to: Nuno Carvalho
CPU Architecture: Any
Tags: Group Replication block

[9 Dec 2015 6:59] Ivan Tu
Description:
Group Replication is blocked when there is any conflict between member instances, such as creating a table with the same name on 2 database instances, or inserting the same primary/unique key values on 2 servers. Such conflicts cannot be resolved except by taking out the instance that shows the conflict and replacing it with an empty one.

How to repeat:
Create a table with the same name on 2 different database instances of the replication group (before Group Replication is started, the table already exists on one of the servers but not on the rest; then create the same table on another instance after Group Replication is started), or insert the same primary/unique key values on 2 or more database instances.

Suggested fix:
A conflict resolution feature is needed that resolves the conflicts and shows an error message when the conflicting operations have been rolled back on the instances.
[9 Dec 2015 10:24] Ivan Tu
Sorry, conflicts with duplicated PK/UK values are resolved. A duplicated user name, however, blocks Group Replication, and it is difficult to discover the root cause of the block from performance_schema tables such as replication_group_member_stats. Besides resolving the conflict on the user name, I am wondering whether the error message will be enhanced.
[11 Dec 2015 16:16] Nuno Carvalho
Hi Ivan,

Thank you for the bug report.

Please note that all members must start with the same database state, or a new member must join with a subset of the group's database. Even in the second situation, the subset must be a valid state of the group.
If you join a member whose data is disjoint from the group's, you will face issues like the one you described, since MySQL Group Replication is a shared-nothing solution in which all members must have the same data.

So in your duplicated user scenario, how was the user initially created on the first member?
Was its creation logged to the binary log?

Best regards,
Nuno Carvalho
[1 Jan 2016 15:16] Ivan Tu
Hi, 
The scenario that hangs Group Replication can be reproduced with the following steps:
1. Create the database instances.
2. Turn on binary logging on all the database instances.
3. Create the Group Replication recovery user (such as CREATE USER 'rpl_user'@'%' IDENTIFIED BY 'welcome'; GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';) on all the instances. This is a configuration mistake; the user needs to be created on only 1 instance.
4. Configure Group Replication on the 1st database instance and start group_replication. So far everything looks fine.
5. Configure Group Replication on the 2nd database instance. After starting group_replication on the 2nd instance, check the members with
SELECT * FROM performance_schema.replication_group_members\G
which shows that the 2nd instance stays in the "RECOVERING" state, and the error log shows
[ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Operation CREATE USER failed for 'rpl_user'@'%'' on query. Default database: ''. Query: 'CREATE USER 'rpl_user'@'%' IDENTIFIED WITH 'mysql_native_password' AS '*DF216F57F1F2066124E1AA5491D995C3CB57E4C2'', Error_code: 1396
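
The failure in step 5 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not MySQL code: the `Member` class, `try_join`, and the `log_bin` flag are hypothetical names made up for illustration. The `log_bin=False` case mimics wrapping the user creation in `SET SQL_LOG_BIN=0` ... `SET SQL_LOG_BIN=1`, which is how the MySQL Group Replication setup instructions keep that statement out of the binary log.

```python
# Toy model (not MySQL code): a member is a set of accounts plus a binary log.
ER_CANNOT_USER = 1396  # error code reported in the log line above


class ApplyError(Exception):
    def __init__(self, code, statement):
        super().__init__(f"Error_code: {code} on query: {statement}")
        self.code = code


class Member:
    def __init__(self, name):
        self.name = name
        self.users = set()
        self.binlog = []  # statements that a joining member will replay

    def create_user(self, user, log_bin=True):
        """Local CREATE USER; log_bin=False mimics SET SQL_LOG_BIN=0."""
        if user in self.users:
            raise ApplyError(ER_CANNOT_USER, f"CREATE USER {user}")
        self.users.add(user)
        if log_bin:
            self.binlog.append(("CREATE USER", user))

    def recover_from(self, donor):
        """Replay the donor's binary log, as the recovery channel does."""
        for statement, user in donor.binlog:
            if statement == "CREATE USER":
                # Replayed statements are not logged again in this toy model.
                self.create_user(user, log_bin=False)


def try_join(log_bin):
    """Create the recovery user on both members, then let the 2nd one join."""
    donor, joiner = Member("server1"), Member("server2")
    for member in (donor, joiner):
        member.create_user("'rpl_user'@'%'", log_bin=log_bin)
    try:
        joiner.recover_from(donor)
        return "ONLINE"
    except ApplyError as err:
        return f"RECOVERING (stuck, error {err.code})"


print(try_join(log_bin=True))   # user creation logged on every instance
print(try_join(log_bin=False))  # user creation kept out of the binary log
```

In this model the joiner fails only when the donor's binary log contains a CREATE USER for an account the joiner already has, which matches the error reported on the group_replication_recovery channel.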

I think there should be a command/tool that helps us correct the error.

Regards
Ivan
[24 Aug 2016 13:23] Umesh Shastry
Thank you for the details.
I'm not sure whether this is intended behavior, but I observed it with the latest build.

Thanks,
Umesh
[16 Sep 2016 11:47] Erlend Dahl
Posted by developer:

[30 Aug 2016 9:33] Nuno Carvalho

Group Replication implements conflict detection, not conflict
resolution.
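
The difference between detection and resolution can be sketched with a toy certifier. This is a simplified illustration under assumptions, not the plugin's implementation: the `Certifier` class is hypothetical, and plain integer versions and row keys stand in for the transaction snapshot and write-set information a real certifier would use. Statements such as CREATE USER are outside what this sketch models.

```python
# Toy certifier (illustrative only): integer versions and plain row keys
# stand in for the snapshot and write-set data of a real implementation.


class Certifier:
    def __init__(self):
        self.version = 0        # number of transactions certified so far
        self.last_writer = {}   # row key -> version that last wrote it

    def certify(self, snapshot_version, write_set):
        """Return True to commit, False to roll back.

        A transaction conflicts if any row it writes was changed by a
        transaction certified after the snapshot it executed on.
        """
        for key in write_set:
            if self.last_writer.get(key, 0) > snapshot_version:
                return False  # detected: the later transaction loses, no merge
        self.version += 1
        for key in write_set:
            self.last_writer[key] = self.version
        return True


certifier = Certifier()
# Two members insert the same primary key concurrently (both saw version 0).
first = certifier.certify(snapshot_version=0, write_set={("t1", "pk=1")})
second = certifier.certify(snapshot_version=0, write_set={("t1", "pk=1")})
# A concurrent transaction on a different row is unaffected.
third = certifier.certify(snapshot_version=0, write_set={("t1", "pk=2")})
print(first, second, third)
```

Detection here means the second writer is simply rejected and must be rolled back; nothing attempts to merge or reconcile the two versions of the row, which is what a resolution feature would have to do.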