Bug #90213 remote nodes COUNT_TRANSACTIONS_ROWS_VALIDATING are incorrect
Submitted: 26 Mar 2018 8:02
Modified: 2 Aug 2018 10:53
Reporter: Zhenghu Wen (OCA)
Status: Closed
Category: MySQL Server: Group Replication
Severity: S3 (Non-critical)
Version: 8.0.4
OS: Any
Assigned to:
CPU Architecture: Any
Tags: Contribution

[26 Mar 2018 8:02] Zhenghu Wen
Description:
When I use performance_schema.replication_group_member_stats to monitor COUNT_TRANSACTIONS_ROWS_VALIDATING, I see very different values across the members, as follows:

node3>select COUNT_TRANSACTIONS_ROWS_VALIDATING  from replication_group_member_stats;
+------------------------------------+
| COUNT_TRANSACTIONS_ROWS_VALIDATING |
+------------------------------------+
|                                 83 |
|                                247 |
|                             315708 |
+------------------------------------+
315708 is the local member's COUNT_TRANSACTIONS_ROWS_VALIDATING. I then ran the same select on another node to compare:

node3>show variables like "%server_uuid%";
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 6f3e0827-2e70-11e8-8330-c81f66e48c6e |
+---------------+--------------------------------------+
1 row in set (0.00 sec)

node3>select * from replication_group_member_stats where MEMBER_ID = '5b314bb9-2e70-11e8-b53f-c81f66e48c6e'\G
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15217921358048072:919
                                 MEMBER_ID: 5b314bb9-2e70-11e8-b53f-c81f66e48c6e
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 949389
                  COUNT_CONFLICTS_DETECTED: 76
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 73
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-176037690:176638008-176721915
            LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:176357180
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 334411
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 615107
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 78
1 row in set (0.00 sec)

node1>show variables like "%server_uuid%";
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 5b314bb9-2e70-11e8-b53f-c81f66e48c6e |
+---------------+--------------------------------------+
1 row in set (0.00 sec)

node1>select * from replication_group_member_stats where MEMBER_ID = '5b314bb9-2e70-11e8-b53f-c81f66e48c6e'\G
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15217921358048072:919
                                 MEMBER_ID: 5b314bb9-2e70-11e8-b53f-c81f66e48c6e
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 901148
                  COUNT_CONFLICTS_DETECTED: 332
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 633190
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-176037690:176638008-176721915
            LAST_CONFLICT_FREE_TRANSACTION: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:176313978
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 334411
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 566741
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 78
1 row in set (0.00 sec)

Then I added some debug output before sending and after receiving the message:

2018-03-26T13:58:11.818600+08:00 9 [Note] Plugin group_replication reported: 'encode_payload ROWS_VALIDATING 168300, CERTIFIED 61671'
2018-03-26T13:58:11.819216+08:00 0 [Note] Plugin group_replication reported: 'decode_payload ROWS_VALIDATING 108, CERTIFIED 61671'
101001000101101100   168300
          01101100   108

It seems the COUNT_TRANSACTIONS_ROWS_VALIDATING value decoded from Pipeline_stats_member_message is wrong: only the lowest byte of the value is kept.
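
The two logged values are consistent with a one-byte read: 168300 is 0x2916C, and its lowest byte, 0x6C, is 108. A minimal sketch of that arithmetic follows. `low_byte` and `check_logged_values` are hypothetical helpers written for illustration only, not plugin code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keeps only the least significant byte of a value.
// This is what remains if a single unsigned char is read from a field
// that stores the value least significant byte first (an assumption
// suggested by the log output, not confirmed here).
constexpr std::uint64_t low_byte(std::uint64_t value) {
  return value & 0xFFu;
}

// Checks the pair of values from the debug log: 168300 was encoded and
// 108 was decoded.
inline void check_logged_values() {
  assert(low_byte(168300) == 108);
}
```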

How to repeat:
1. Deploy an MGR (MySQL Group Replication) cluster with 3 nodes.
2. Run the sysbench oltp workload for a while.
3. Run select * from performance_schema.replication_group_member_stats on each node.
4. Compare COUNT_TRANSACTIONS_ROWS_VALIDATING across the nodes.

Suggested fix:
The likely cause is in decode_payload:

void
Pipeline_stats_member_message::decode_payload(const unsigned char *buffer,
                                              const unsigned char *end)
.....
      case PIT_TRANSACTIONS_ROWS_VALIDATING:
        if (slider + payload_item_length <= end)
        {
          uint64 transactions_rows_validating_aux= *slider;
          slider += payload_item_length;
          m_transactions_rows_validating=
                 (int64)transactions_rows_validating_aux;
        }
        break;
.....

uint64 transactions_rows_validating_aux= *slider;
should be:
uint64 transactions_rows_validating_aux= sint8korr(slider);
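
To illustrate the difference between the two reads, here is a self-contained sketch under stated assumptions. `store_le64`, `read_le64`, `read_first_byte` and `decode_item` are hypothetical names; `read_le64` only stands in for an 8-byte little-endian read and is not the server's macro or the plugin's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical encoder: writes value as 8 bytes, least significant first.
inline void store_le64(unsigned char *buf, std::uint64_t value) {
  for (std::size_t i = 0; i < 8; ++i)
    buf[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Hypothetical stand-in for an 8-byte little-endian read.
inline std::uint64_t read_le64(const unsigned char *buf) {
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(buf[i]) << (8 * i);
  return value;
}

// Mirrors the reported read: dereferencing the pointer yields one byte.
inline std::uint64_t read_first_byte(const unsigned char *slider) {
  return *slider;
}

// Decodes one 8-byte item if it fits before end; returns false and
// leaves *out untouched otherwise.
inline bool decode_item(const unsigned char *slider,
                        const unsigned char *end,
                        std::int64_t *out) {
  const std::ptrdiff_t payload_item_length = 8;
  if (end - slider < payload_item_length)
    return false;
  *out = static_cast<std::int64_t>(read_le64(slider));
  return true;
}
```

With the value from the debug log, `read_first_byte` returns 108 while `read_le64` returns the full 168300.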
[26 Mar 2018 8:57] Zhenghu Wen
Added a patch.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 90213.patch (application/octet-stream, text), 1.61 KiB.

[26 Mar 2018 14:36] MySQL Verification Team
Hello Zhenghu Wen,

Thank you for the report and contribution.

Thanks,
Umesh
[2 Aug 2018 10:53] MySQL Verification Team
Internally reported base bug Bug#27692831 has been fixed in 8.0.12 and the following was added to the 8.0.12 changelog:

The PIT_TRANSACTIONS_NEGATIVE_CERTIFIED, the PIT_TRANSACTIONS_ROWS_VALIDATING and the PIT_TRANSACTIONS_LOCAL_ROLLBACK member messages were not being correctly decoded.