Bug #103421 DDL in a Group Replication cluster, even if online, causes a full stop of write operations
Submitted: 22 Apr 8:22 Modified: 7 May 12:53
Reporter: Marco Tusa
Status: Verified
Category: MySQL Server: Group Replication
Severity: S2 (Serious)
OS: CentOS
CPU Architecture: Any

[22 Apr 8:22] Marco Tusa
Description:
When using online DDL with Group Replication, the online DDL concept is only partially applicable: all write operations are put on hold while the DDL replicates. Also, if the Primary crashes during this phase, the cluster remains without a Primary for the duration of the ALTER.

How to repeat:
Folks, this is a bit of a composite issue.
I don't know if we can consider this a bug, but something is definitely acting weird.

Scenario:
    3 MySQL 8.0.23 nodes in a cluster using Group Replication
    
   DC1-3(root@localhost) [windmills_s]>SELECT * FROM performance_schema.replication_group_members order by 3;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | ebff1ab8-9793-11eb-ba5f-08002734ed50 | gr1         |        3306 | ONLINE       | PRIMARY     | 8.0.23         |
| group_replication_applier | f47df54e-9793-11eb-a60b-08002734ed50 | gr2         |        3306 | ONLINE       | SECONDARY   | 8.0.23         |
| group_replication_applier | e891d1b4-9793-11eb-92ac-08002734ed50 | gr3         |        3306 | ONLINE       | SECONDARY   | 8.0.23         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+

Populate a table and make it large enough that an ALTER will take some time to run.
For example:
DC1-1(root@localhost) [windmills_s]>show create table windmills_test\G
*************************** 1. row ***************************
       Table: windmills_test
Create Table: CREATE TABLE `windmills_test` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `uuid` char(36) COLLATE utf8_bin NOT NULL,
  `millid` smallint NOT NULL,
  `kwatts_s` int NOT NULL,
  `date` date NOT NULL,
  `location` varchar(50) COLLATE utf8_bin NOT NULL,
  `active` tinyint NOT NULL DEFAULT '1',
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `strrecordtype` char(3) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_millid` (`millid`,`active`),
  KEY `IDX_active` (`id`,`active`),
  KEY `kuuid_x` (`uuid`),
  KEY `millid_x` (`millid`),
  KEY `active_x` (`active`)
) ENGINE=InnoDB AUTO_INCREMENT=8195530 DEFAULT CHARSET=utf8 COLLATE=utf8_bin ROW_FORMAT=DYNAMIC
1 row in set (0.00 sec)

From another terminal, run an insert loop on the PRIMARY such as:
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done 

Then, from another terminal, insert into a different table:
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills8  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills6 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done 

On the SECONDARY nodes, run this to check when the ALTER starts executing:
while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/mysql-8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "show processlist;"|grep -i "alter";sleep 1;done

You can try to use:
set global  group_replication_consistency= 'EVENTUAL';
set global  group_replication_consistency= 'AFTER';
or other values; nothing will change.

At this point you can run the ALTER command:
    ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;

During the ALTER, INSERTs on that table will continue...
As soon as the ALTER completes on the PRIMARY and starts on the SECONDARY nodes, a metadata lock and a "waiting for handler commit" state are raised on the PRIMARY, and the INSERTs stop:
| 1030 | root            | localhost           | windmills_s | Query   |   25 | waiting for handler commit                             | insert into windmills8  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype fro |
| 1031 | root            | localhost           | windmills_s | Query   |   24 | Waiting for table metadata lock                        | insert into windmills_test  select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype |
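To see which session holds the blocking metadata lock, one can query performance_schema on the PRIMARY. A sketch (assumes the wait/lock/metadata/sql/mdl instrument is enabled, which is the default in 8.0):

```sql
SELECT m.OBJECT_SCHEMA, m.OBJECT_NAME, m.LOCK_TYPE, m.LOCK_STATUS,
       t.PROCESSLIST_ID, t.PROCESSLIST_INFO
FROM performance_schema.metadata_locks m
JOIN performance_schema.threads t
  ON t.THREAD_ID = m.OWNER_THREAD_ID
WHERE m.OBJECT_SCHEMA = 'windmills_s'
  AND m.OBJECT_NAME = 'windmills_test'
ORDER BY m.LOCK_STATUS;
```

Rows with LOCK_STATUS = 'GRANTED' identify the holder; the 'PENDING' rows are the blocked INSERTs.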

Once the SLOWEST node has completed the execution of the ALTER, the activity restarts.

When the alter is executed:
DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=NONE;
Query OK, 0 rows affected (3 min 22.85 sec)
Records: 0  Duplicates: 0  Warnings: 0

We do not get any warning or error about the fact that the ALTER is not fully supported as an online operation.

At the same time, IF the PRIMARY crashes after the operation (while the SECONDARY nodes are applying the ALTER), the cluster remains without a PRIMARY until the fastest node is able to complete it.
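To observe the window during which the cluster has no PRIMARY, one can poll the same view shown at the top of the report from a surviving node; a sketch:

```sql
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members
ORDER BY MEMBER_ROLE;
```

Per the behaviour described here, no row will report MEMBER_ROLE = 'PRIMARY' until the fastest node completes the ALTER.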

Also, if I use ALGORITHM=COPY: first, as expected, I cannot insert into that table while it is being altered, but I can insert into the others.
Then I get a full stop again!

My variables related to GR and replication:
DC1-1(root@localhost) [(none)]>show global variables like '%replica%';
+-----------------------------------------------------+----------------------------------------------------------+
| Variable_name                                       | Value                                                    |
+-----------------------------------------------------+----------------------------------------------------------+
| group_replication_advertise_recovery_endpoints      | DEFAULT                                                  |
| group_replication_allow_local_lower_version_join    | OFF                                                      |
| group_replication_auto_increment_increment          | 7                                                        |
| group_replication_autorejoin_tries                  | 3                                                        |
| group_replication_bootstrap_group                   | OFF                                                      |
| group_replication_clone_threshold                   | 9223372036854775807                                      |
| group_replication_communication_debug_options       | GCS_DEBUG_NONE                                           |
| group_replication_communication_max_message_size    | 10485760                                                 |
| group_replication_components_stop_timeout           | 31536000                                                 |
| group_replication_compression_threshold             | 1000000                                                  |
| group_replication_consistency                       | AFTER                                                    |
| group_replication_enforce_update_everywhere_checks  | OFF                                                      |
| group_replication_exit_state_action                 | READ_ONLY                                                |
| group_replication_flow_control_applier_threshold    | 25000                                                    |
| group_replication_flow_control_certifier_threshold  | 25000                                                    |
| group_replication_flow_control_hold_percent         | 10                                                       |
| group_replication_flow_control_max_quota            | 0                                                        |
| group_replication_flow_control_member_quota_percent | 0                                                        |
| group_replication_flow_control_min_quota            | 0                                                        |
| group_replication_flow_control_min_recovery_quota   | 0                                                        |
| group_replication_flow_control_mode                 | QUOTA                                                    |
| group_replication_flow_control_period               | 1                                                        |
| group_replication_flow_control_release_percent      | 50                                                       |
| group_replication_force_members                     |                                                          |
| group_replication_group_name                        | DC1aaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa                     |
| group_replication_group_seeds                       | 192.168.4.81:33061,192.168.4.82:33061,192.168.4.83:33061 |
| group_replication_gtid_assignment_block_size        | 1000000                                                  |
| group_replication_ip_allowlist                      | AUTOMATIC                                                |
| group_replication_ip_whitelist                      | AUTOMATIC                                                |
| group_replication_local_address                     | 192.168.4.81:33061                                       |
| group_replication_member_expel_timeout              | 60                                                       |
| group_replication_member_weight                     | 50                                                       |
| group_replication_message_cache_size                | 315360004294967295                                       |
| group_replication_poll_spin_loops                   | 0                                                        |
| group_replication_recovery_complete_at              | TRANSACTIONS_APPLIED                                     |
| group_replication_recovery_compression_algorithms   | uncompressed                                             |
| group_replication_recovery_get_public_key           | OFF                                                      |
| group_replication_recovery_public_key_path          |                                                          |
| group_replication_recovery_reconnect_interval       | 60                                                       |
| group_replication_recovery_retry_count              | 10                                                       |
| group_replication_recovery_ssl_ca                   |                                                          |
| group_replication_recovery_ssl_capath               |                                                          |
| group_replication_recovery_ssl_cert                 |                                                          |
| group_replication_recovery_ssl_cipher               |                                                          |
| group_replication_recovery_ssl_crl                  |                                                          |
| group_replication_recovery_ssl_crlpath              |                                                          |
| group_replication_recovery_ssl_key                  |                                                          |
| group_replication_recovery_ssl_verify_server_cert   | OFF                                                      |
| group_replication_recovery_tls_ciphersuites         |                                                          |
| group_replication_recovery_tls_version              | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3                            |
| group_replication_recovery_use_ssl                  | OFF                                                      |
| group_replication_recovery_zstd_compression_level   | 3                                                        |
| group_replication_single_primary_mode               | ON                                                       |
| group_replication_ssl_mode                          | DISABLED                                                 |
| group_replication_start_on_boot                     | OFF                                                      |
| group_replication_tls_source                        | MYSQL_MAIN                                               |
| group_replication_transaction_size_limit            | 2147483647                                               |
| group_replication_unreachable_majority_timeout      | 0                                                        |
| innodb_replication_delay                            | 0                                                        |
| replication_optimize_for_static_plugin_config       | OFF                                                      |
| replication_sender_observe_commit_only              | OFF                                                      |
+-----------------------------------------------------+----------------------------------------------------------+
61 rows in set (0.01 sec)

DC1-1(root@localhost) [(none)]>show global variables like '%slave%';
+------------------------------+----------------------------------+
| Variable_name                | Value                            |
+------------------------------+----------------------------------+
| init_slave                   |                                  |
| log_slave_updates            | ON                               |
| log_slow_slave_statements    | OFF                              |
| rpl_stop_slave_timeout       | 31536000                         |
| slave_allow_batching         | OFF                              |
| slave_checkpoint_group       | 512                              |
| slave_checkpoint_period      | 300                              |
| slave_compressed_protocol    | OFF                              |
| slave_exec_mode              | STRICT                           |
| slave_load_tmpdir            | /opt/mysql_instances/grtest/temp |
| slave_max_allowed_packet     | 1073741824                       |
| slave_net_timeout            | 60                               |
| slave_parallel_type          | LOGICAL_CLOCK                    |
| slave_parallel_workers       | 4                                |
| slave_pending_jobs_size_max  | 134217728                        |
| slave_preserve_commit_order  | ON                               |
| slave_rows_search_algorithms | INDEX_SCAN,HASH_SCAN             |
| slave_skip_errors            | OFF                              |
| slave_sql_verify_checksum    | ON                               |
| slave_transaction_retries    | 10                               |
| slave_type_conversions       |                                  |
| sql_slave_skip_counter       | 0                                |
+------------------------------+----------------------------------+
22 rows in set (0.00 sec)

Summarizing:
- Using ONLINE DDL, I am able to insert into the ALTERED table and the others while the ALTER is executed on the PRIMARY node
- As soon as the ALTER reaches the SECONDARY nodes, all my inserts are put on hold
- The group_replication_consistency value used makes no difference
- If the PRIMARY crashes while the SECONDARY nodes are executing the ALTER, the cluster remains without a PRIMARY for the duration of the ALTER on the fastest node

Let me know if you need more details; I can provide a full scenario/video if you are not able to reproduce it.
[7 May 12:53] MySQL Verification Team
Hi Marco,

Thanks for the report and the reproduction process. 
I went through it all and reproduced enough of it to conclude that you are on to something. At first look this did not look like a bug to me; on a second pass I'm not so sure anymore, so I'm verifying this for the GR team to check in depth.

all best
Bogdan