Bug #105436 GRepl APPLIER_QUEUE=1 after 8.0.27 upgrade, even on primary node
Submitted: 3 Nov 2021 4:06    Modified: 24 Dec 2021 12:40
Reporter: NOC NOC    Status: Verified    Impact on me: None
Category: MySQL Server: Group Replication    Severity: S3 (Non-critical)
Version: 8.0.27    OS: Debian
Assigned to:    CPU Architecture: Any

[3 Nov 2021 4:06] NOC NOC
Description:
We upgraded our 3-node MySQL 8.0.26 Group Replication cluster (single primary) to 8.0.27 yesterday, and found that two nodes became stuck with COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE = 1 (or at least, the counter never seemed to drop to 0).
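
(For reference, a compact check of just this counter on each member, using the same performance_schema table as the full output further below:)

-- quick check of the applier queue counter per member
SELECT MEMBER_ID, COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE
  FROM performance_schema.replication_group_member_stats;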

The follow-on effect (and how we noticed this initially) is that we use ProxySQL in front of this cluster, and it was evicting nodes due to 'excessive lag'. We had previously set 'max_transactions_behind' to '0', but have had to increase it to '1' to avoid those nodes being marked as offline.
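
(The ProxySQL-side change was of this general shape, made on the admin interface; the table and column names come from ProxySQL's native Group Replication support, and the statement is illustrative rather than a copy of our actual config:)

-- raise the lag threshold so the affected nodes stay online
UPDATE mysql_group_replication_hostgroups SET max_transactions_behind = 1;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;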

COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE stays at 1 even if that node's mysql is restarted, or the node is promoted to primary. (However, if it is promoted to primary, demoted, and THEN mysql is restarted, the value returns to 0.)
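
(For reference, a promotion like this would normally be done with the standard single-primary failover function; a sketch, with the member UUID taken from the output below:)

-- example only: make mysql102 the primary
SELECT group_replication_set_as_primary('cc0105fa-1f98-11eb-bdef-525400149847');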

There are no errors in the mysql error log.

Initial state example with member cc0105fa-1f98-11eb-bdef-525400149847 as secondary (the most important value is COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE; note I've cut out some rows due to description size limits):

mysql> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: 48f9bdd8-0c21-11eb-ac1c-525400f14a36
               MEMBER_HOST: mysql101
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: PRIMARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
*************************** 2. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               MEMBER_HOST: mysql102
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: SECONDARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
3 rows in set (0.00 sec)

mysql> select * from performance_schema.replication_group_member_stats\G
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16041527307045398:98
                                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 14
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 27
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: bfd5c347-c26e-42a1-af4b-fed4aabd117e:1-7378068:7665254-7693968
            LAST_CONFLICT_FREE_TRANSACTION: bfd5c347-c26e-42a1-af4b-fed4aabd117e:7378079
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 1
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 14
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0

After promoting cc0105fa-1f98-11eb-bdef-525400149847 to primary, COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE still 1:

mysql> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: 48f9bdd8-0c21-11eb-ac1c-525400f14a36
               MEMBER_HOST: mysql101
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: SECONDARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
*************************** 2. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               MEMBER_HOST: mysql102
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: PRIMARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
3 rows in set (0.00 sec)

mysql> select * from performance_schema.replication_group_member_stats\G
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16041527307045398:98
                                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 3110
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 22
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: bfd5c347-c26e-42a1-af4b-fed4aabd117e:1-7381157:7665254-7693968
            LAST_CONFLICT_FREE_TRANSACTION: bfd5c347-c26e-42a1-af4b-fed4aabd117e:7381175
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 1
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 3110
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
3 rows in set (0.00 sec)

After demoting (promoting 48f9bdd8-0c21-11eb-ac1c-525400f14a36) and then restarting mysql on cc0105fa-1f98-11eb-bdef-525400149847:

mysql> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: 48f9bdd8-0c21-11eb-ac1c-525400f14a36
               MEMBER_HOST: mysql101
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: PRIMARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
*************************** 2. row ***************************
              CHANNEL_NAME: group_replication_applier
                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               MEMBER_HOST: mysql102
               MEMBER_PORT: 3306
              MEMBER_STATE: ONLINE
               MEMBER_ROLE: SECONDARY
            MEMBER_VERSION: 8.0.27
MEMBER_COMMUNICATION_STACK: XCom
3 rows in set (0.00 sec)

mysql> select * from performance_schema.replication_group_member_stats\G
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 16041527307045398:100
                                 MEMBER_ID: cc0105fa-1f98-11eb-bdef-525400149847
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 61
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 15
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: bfd5c347-c26e-42a1-af4b-fed4aabd117e:1-7381233:7665254-7694232
            LAST_CONFLICT_FREE_TRANSACTION: bfd5c347-c26e-42a1-af4b-fed4aabd117e:7381238
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 61
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
3 rows in set (0.00 sec)

Some general Group Replication configuration for the cluster is attached.

Happy to provide other variables/setup information if requested.

How to repeat:
Unsure as to repeatability, as we cannot roll the cluster back.

Suggested fix:
For now, the issue was resolved by promoting the problem node to primary, demoting it again, and then restarting mysql. We had to perform this on two separate nodes (at least; I can't remember the state of the 3rd node before we got to where we are now).
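
In SQL terms, the workaround amounts to roughly the following on each affected node (a sketch only; the UUIDs are placeholders, and START GROUP_REPLICATION is needed after the restart because group_replication_start_on_boot is OFF in our config):

-- 1. promote the affected node
SELECT group_replication_set_as_primary('<affected-member-uuid>');
-- 2. demote it again by promoting another member
SELECT group_replication_set_as_primary('<other-member-uuid>');
-- 3. restart mysql on the affected node, then rejoin the group
START GROUP_REPLICATION;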
[3 Nov 2021 4:06] NOC NOC
group repl vars

| group_concat_max_len                                | 1024                                            |
| group_replication_advertise_recovery_endpoints      | DEFAULT                                         |
| group_replication_allow_local_lower_version_join    | OFF                                             |
| group_replication_auto_increment_increment          | 7                                               |
| group_replication_autorejoin_tries                  | 3                                               |
| group_replication_bootstrap_group                   | OFF                                             |
| group_replication_clone_threshold                   | 9223372036854775807                             |
| group_replication_communication_debug_options       | GCS_DEBUG_NONE                                  |
| group_replication_communication_max_message_size    | 10485760                                        |
| group_replication_communication_stack               | XCOM                                            |
| group_replication_components_stop_timeout           | 300                                             |
| group_replication_compression_threshold             | 1000000                                         |
| group_replication_consistency                       | BEFORE_ON_PRIMARY_FAILOVER                      |
| group_replication_enforce_update_everywhere_checks  | OFF                                             |
| group_replication_exit_state_action                 | READ_ONLY                                       |
| group_replication_flow_control_applier_threshold    | 25000                                           |
| group_replication_flow_control_certifier_threshold  | 25000                                           |
| group_replication_flow_control_hold_percent         | 10                                              |
| group_replication_flow_control_max_quota            | 0                                               |
| group_replication_flow_control_member_quota_percent | 0                                               |
| group_replication_flow_control_min_quota            | 0                                               |
| group_replication_flow_control_min_recovery_quota   | 0                                               |
| group_replication_flow_control_mode                 | QUOTA                                           |
| group_replication_flow_control_period               | 1                                               |
| group_replication_flow_control_release_percent      | 50                                              |
| group_replication_force_members                     |                                                 |
| group_replication_group_name                        | bfd5c347-c26e-42a1-af4b-fed4aabd117e            |
| group_replication_group_seeds                       | 10.5.2.50:33061,10.5.2.60:33061,10.5.2.70:33061 |
| group_replication_gtid_assignment_block_size        | 1000000                                         |
| group_replication_ip_allowlist                      | AUTOMATIC                                       |
| group_replication_ip_whitelist                      | AUTOMATIC                                       |
| group_replication_local_address                     | 10.5.2.50:33061                                 |
| group_replication_member_expel_timeout              | 5                                               |
| group_replication_member_weight                     | 50                                              |
| group_replication_message_cache_size                | 1073741824                                      |
| group_replication_paxos_single_leader               | OFF                                             |
| group_replication_poll_spin_loops                   | 0                                               |
| group_replication_recovery_complete_at              | TRANSACTIONS_APPLIED                            |
| group_replication_recovery_compression_algorithms   | uncompressed                                    |
| group_replication_recovery_get_public_key           | ON                                              |
| group_replication_recovery_public_key_path          |                                                 |
| group_replication_recovery_reconnect_interval       | 60                                              |
| group_replication_recovery_retry_count              | 10                                              |                                            
| group_replication_recovery_tls_ciphersuites         |                                                 |
| group_replication_recovery_tls_version              | TLSv1,TLSv1.1,TLSv1.2,TLSv1.3                   |
| group_replication_recovery_use_ssl                  | OFF                                             |
| group_replication_recovery_zstd_compression_level   | 3                                               |
| group_replication_single_primary_mode               | ON                                              |
| group_replication_ssl_mode                          | DISABLED                                        |
| group_replication_start_on_boot                     | OFF                                             |
| group_replication_tls_source                        | MYSQL_MAIN                                      |
| group_replication_transaction_size_limit            | 150000000                                       |
| group_replication_unreachable_majority_timeout      | 0                                               |
| group_replication_view_change_uuid                  | AUTOMATIC                                       |
[15 Nov 2021 14:25] MySQL Verification Team
Hi,

Thanks for the report. I'm trying to reproduce this but have not managed to so far ...

Some info that might be relevant:
 - how did you originally deploy the cluster?
 - how exactly did you perform the upgrade of the cluster?
 - how much load did you have on the cluster while upgrading?
 - can you share the full ProxySQL config? ('max_transactions_behind' = '0' is certainly not going to work until the system is running at full capacity, so that part is expected)
 - was COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE ever higher than 1?
 - did you try stopping traffic completely for a minute to see if the system would "settle" (COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE dropping to 0, transactions behind going to 0, etc.)?

Thanks
[29 Nov 2021 2:54] NOC NOC
The cluster was deployed manually (i.e. not using InnoDB Cluster): we bootstrapped one node, then joined the others. We're using GTIDs, by the way.
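
(For context, the manual deployment followed the usual bootstrap-then-join pattern; a rough sketch, omitting the recovery-channel credential setup:)

-- on the first node only:
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
-- on each additional node:
START GROUP_REPLICATION;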

The upgrade of the cluster was performed by shutting down mysql on a node (non-primaries first), upgrading it (Debian package upgrade), then starting mysql, rejoining the GR cluster, and waiting until it was out of 'recovery' before moving on to the next node. I didn't notice the applier queue value at the time, so I'm not sure when that started occurring.
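
(Per node, the rejoin-and-wait step was something like the following after the package upgrade and restart; the state check is against the same performance_schema table quoted above:)

START GROUP_REPLICATION;
-- wait until this node reports ONLINE rather than RECOVERING before moving to the next one
SELECT MEMBER_HOST, MEMBER_STATE
  FROM performance_schema.replication_group_members;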

Cluster was only lightly loaded while upgrading. It doesn't get a lot of use yet.

Our ProxySQL config is not running any clever query rules, so only the primary MySQL server is utilised anyway. max_transactions_behind is really only of diagnostic interest to us, rather than something operationally important.

I never saw COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE higher than 1.

Traffic was not entirely stopped to see if it would catch up, but there would absolutely have been periods where no queries were being replicated, given the generally light load on the cluster.

Hope that helps...