Bug #90494 gr_exit_state_action=ABORT_SERVER is not considered on compatibility checks
Submitted: 18 Apr 2018 12:04 Modified: 10 Aug 2018 8:21
Reporter: Nuno Carvalho
Status: Closed
Category:MySQL Server: Group Replication Severity:S3 (Non-critical)
Version:8.0.12 OS:Any
Assigned to: CPU Architecture:Any

[18 Apr 2018 12:04] Nuno Carvalho
Description:
WL#11568, "Group Replication: option to shutdown server when dropping out of the group", must abort the server when group_replication_exit_state_action=ABORT_SERVER is set and a plugin error happens. However, one scenario is not covered.

The scenario is when the server starts Group Replication
automatically on server start, the server is joining the group, and one of
the following situations happens:
  1) the maximum number of group members is exceeded;
  2) the member version is not compatible with the group;
  3) the member's gtid_assignment_block_size differs from the group's;
  4) the member's hash algorithm differs from the group's;
  5) the member has more transactions than the group;
  6) the member's configuration flags differ from the group's.

In any of the above situations the member leaves the group and sets
super_read_only=TRUE, without considering the
group_replication_exit_state_action option.
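
The six situations above can be modelled as a single check that runs while the member is joining. The following is only a minimal sketch under stated assumptions: JoinError, MemberConfig, JoinContext and check_join are hypothetical names invented for illustration and do not exist in the MySQL source, and the order of the checks is an assumption, not the order used by check_group_compatibility().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical error codes, one per situation listed in the report.
enum class JoinError {
  kNone,
  kTooManyMembers,
  kIncompatibleVersion,
  kGtidBlockSizeMismatch,
  kHashAlgorithmMismatch,
  kMoreTransactionsThanGroup,
  kConfigFlagsMismatch,
};

// Hypothetical subset of the settings that must match the group.
struct MemberConfig {
  std::uint64_t gtid_assignment_block_size;
  std::string hash_algorithm;
  std::uint32_t config_flags;
};

// Hypothetical snapshot of what the joining member knows at join time.
struct JoinContext {
  std::size_t members_after_join;  // group size including the joiner
  std::size_t max_members;         // limit supplied by the caller
  bool version_compatible;
  bool has_more_transactions_than_group;
  MemberConfig joiner;
  MemberConfig group;
};

// Returns the first failing check, or kNone if the member may join.
JoinError check_join(const JoinContext& ctx) {
  assert(ctx.max_members > 0);  // a zero limit would reject every join
  if (ctx.members_after_join > ctx.max_members)
    return JoinError::kTooManyMembers;
  if (!ctx.version_compatible) return JoinError::kIncompatibleVersion;
  if (ctx.joiner.gtid_assignment_block_size !=
      ctx.group.gtid_assignment_block_size)
    return JoinError::kGtidBlockSizeMismatch;
  if (ctx.joiner.hash_algorithm != ctx.group.hash_algorithm)
    return JoinError::kHashAlgorithmMismatch;
  if (ctx.has_more_transactions_than_group)
    return JoinError::kMoreTransactionsThanGroup;
  if (ctx.joiner.config_flags != ctx.group.config_flags)
    return JoinError::kConfigFlagsMismatch;
  return JoinError::kNone;
}
```

Any result other than kNone corresponds to the case reported here: the member leaves the group, and the question is whether group_replication_exit_state_action is honoured afterwards.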

How to repeat:
Check the code at gcs_event_handlers.cc (lines 887-898):
  /*
   If we are joining, 3 scenarios exist:
   1) We are incompatible with the group so we leave
   2) We are alone so we declare ourselves online
   3) We are in a group and recovery must happen
  */
  if (is_joining) {
    int error = 0;
    if ((error = check_group_compatibility(number_of_members))) {
      view_change_notifier->cancel_view_modification(error);
      return;
    }

Suggested fix:
Source lines 685-722; the NOTE comment marks where the option should be considered.

  if (view_change_notifier->wait_for_view_modification()) {
    if (!view_change_notifier->is_cancelled()) {
      // Only log a error when a view modification was not cancelled.
      LogPluginErr(ERROR_LEVEL, ER_GRP_RPL_TIMEOUT_ON_VIEW_AFTER_JOINING_GRP);
    }
    error = view_change_notifier->get_error();
    goto err;
  }
  group_replication_running = true;
  log_primary_member_details();

err:

  if (error) {
    plugin_is_setting_read_mode = false;
    group_member_mgr_configured = false;

    // Unblock the possible stuck delayed thread
    if (delayed_init_thd) delayed_init_thd->signal_read_mode_ready();
    leave_group();
    terminate_plugin_modules();

    if (!server_shutdown_status && server_engine_initialized() &&
        enabled_super_read_only) {
      set_read_mode_state(sql_command_interface, read_only_mode,
                          super_read_only_mode);
    }

    // NOTE: consider option group_replication_exit_state_action at this point.

    if (certification_latch != NULL) {
      delete certification_latch; /* purecov: inspected */
      certification_latch = NULL; /* purecov: inspected */
    }
  }

  delete sql_command_interface;
  plugin_is_auto_starting_on_install = false;

  DBUG_RETURN(error);
}
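
To illustrate the suggestion, here is a minimal, self-contained sketch of an error path that consults the exit action after switching to read-only mode. Everything in it is an assumption made for illustration: ExitStateAction, FakeServer and handle_join_failure are hypothetical names, the conditions guarding set_read_mode_state() in the real code are omitted, and this is not the fix that was actually committed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the two group_replication_exit_state_action values.
enum class ExitStateAction { kReadOnly, kAbortServer };

// Hypothetical stand-in for server state; records which steps ran.
struct FakeServer {
  bool super_read_only = false;
  bool aborted = false;
  std::vector<std::string> steps;
};

// Sketch of the error branch: leave the group, stop the plugin modules,
// enable super_read_only, then honour the configured exit action.
// Returns the error unchanged so the caller can still propagate it.
int handle_join_failure(FakeServer& server, int error,
                        ExitStateAction action) {
  if (error == 0) return 0;  // join succeeded, nothing to clean up

  server.steps.push_back("leave_group");
  server.steps.push_back("terminate_plugin_modules");

  // Behaviour described in the report: read-only is always applied.
  server.super_read_only = true;
  server.steps.push_back("set_read_mode_state");

  // Suggested addition: check the exit action at this point.
  if (action == ExitStateAction::kAbortServer) {
    server.aborted = true;  // stands in for a real server abort
    server.steps.push_back("abort_server");
  }

  assert(server.super_read_only);  // holds on every failure path
  return error;
}
```

With kReadOnly the sketch ends in the state the report describes; with kAbortServer it additionally marks the server as aborted, which is the behaviour the reporter expects when ABORT_SERVER is configured.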

[10 Aug 2018 8:21] David Moss
Posted by developer:
 
Thank you for your feedback. This has been fixed in upcoming versions, and the following was added to the 8.0.13 changelog:
The group_replication_exit_state_action variable enables you to specify what action is taken if a member involuntarily leaves the group, but when starting a server with group_replication_start_on_boot enabled the group_replication_exit_state_action variable was being ignored during the following scenarios:

valid number of group members was exceeded
incompatible configuration of the member system variables (various)
the joining member had more transactions than the group
the joining member's version was not compatible with the group