Bug #102175 log_status has wrong binary_log_position of gtid_executed
Submitted: 7 Jan 2021 12:11 Modified: 29 Jan 2021 7:17
Reporter: Rahul Malik Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.22 OS:Any
Assigned to: CPU Architecture:Any

[7 Jan 2021 12:11] Rahul Malik
Description:
When the server is under heavy write operation
,binary_log_position of gitid_executed shown in log_status doesn't match with the position of that gitd shown in the binary log file. 

How to repeat:
Run below load 
sysbench /usr/local/Cellar/sysbench/1.0.20/share/sysbench/oltp_insert.lua --mysql-db=test --mysql-user=root  --mysql-socket=$SOCKET --threads=10 --time=1500 --report-interval=1 --events=0 --db-driver=mysql prepare

Query binary_log_position and gtid_executed from log_status table. 

position of the GTID in binlog file is different  from what is shown above

for e.g 
mysql> select * from performance_schema.log_status\G
*************************** 1. row ***************************
    SERVER_UUID: c73be700-50b9-11eb-ae08-acf7d33aa797
          LOCAL: {"gtid_executed": "c73be700-50b9-11eb-ae08-acf7d33aa797:1-4841", "binary_log_file": "binlog.000001", "binary_log_position": 4211073}
    REPLICATION: {"channels": []}
STORAGE_ENGINES: {"InnoDB": {"LSN": 25150129, "LSN_checkpoint": 20476883}}
1 row in set (0.00 sec)

MySQL>

In binary log file it at 4205927

# immediate_commit_timestamp=1610021281305784 (2021-01-07 17:38:01.305784 IST)
/*!80001 SET @@session.original_commit_timestamp=1610021281305784*//*!*/;
/*!80014 SET @@session.original_server_version=80022*//*!*/;
/*!80014 SET @@session.immediate_server_version=80022*//*!*/;
SET @@SESSION.GTID_NEXT= 'c73be700-50b9-11eb-ae08-acf7d33aa797:4841'/*!*/;
# at 4205927
#210107 17:38:01 server id 1320  end_log_pos 4206002 CRC32 0xf6f68330   Query   thread_id=34    exec_time=0     error_code=0
SET TIMESTAMP=1610021281/*!*/;
BEGIN
/*!*/;
# at 4206002
#210107 17:38:01 server id 1320  end_log_pos 4206067 CRC32 0xe402d7e9   Table_map: `test`.`sbtest1` mapped to number 122
# at 4206067
#210107 17:38:01 server id 1320  end_log_pos 4206292 CRC32 0xc5cdcfe0   Write_rows: table id 122 flags: STMT_END_F
[7 Jan 2021 12:21] Rahul Malik
it is breaking replication.
[7 Jan 2021 13:41] Venkatesh Prasad Venugopal
When client threads update the executed_gtids gtid set, it doesn't take write lock on the global_sid_lock (https://github.com/percona/percona-server/blob/8.0/sql/rpl_gtid_state.cc#L174) but instead takes read_lock for updating the structure.

When log_status query tries to take write lock on global_sid_lock, it will wait until all reader threads to finish updating the executed gtids structure. But these client threads, after updating the executed_gtids are allowed to execute, till they wait for the lock on the binary log to update the log position.

Since, the required lock is held by the log_status query, we will always get the consistent binlog file and the position, but we may not always get the consistent executed_gtids if there are already some threads updating it when log_status query tries to acquire the write lock on the global_sid_lock.
[11 Jan 2021 14:12] MySQL Verification Team
Hi,

I'm having an issue reproducing this. The explanation of what happens makes sense but looks like it's not just write load that's important for reproducing this.

Quick question - what kind of hw are you using to reproduce this consistently?

thanks
Bogdan
[27 Jan 2021 11:02] Rahul Malik
I can easily reproduce on Ubuntu 18.04.4 LTS with config Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz
8 core with 32 GB 

Let me know if you need more details
[27 Jan 2021 17:51] MySQL Verification Team
Hi,

> I can easily reproduce on Ubuntu 18.04.4 LTS with 
> config Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz

Are you sure you are using Oracle APT repository when reproducing this. 
I tried on bit smaller machine (i7-6700 CPU @ 3.40GHz, Debian 10.7) and could not reproduce.
[28 Jan 2021 0:50] Rahul Malik
Using mysql 8.0.23 with commit id 7ed30a748964c009d4909cb8b4b22036ebdef239 (HEAD -> 8.0, tag: mysql-cluster-8.0.23, tag: mysql-8.0.23, origin/HEAD, origin/8.0)
Author: mysql-builder@oracle.com <>
Date:   Tue Dec 15 09:25:02 2020 +0100
[28 Jan 2021 11:46] Venkatesh Prasad Venugopal
Hi Bogdan!

This issue is easily reproducible by adding sleep(1) just after the client thread takes the global_sid_lock to update the gtid as part of binlog group commit.

=== diff begin ===
diff --git a/sql/rpl_gtid_state.cc b/sql/rpl_gtid_state.cc
index 3f5288dab4d..296583b51ed 100644
--- a/sql/rpl_gtid_state.cc
+++ b/sql/rpl_gtid_state.cc
@@ -163,6 +163,7 @@ void Gtid_state::update_commit_group(THD *first_thd) {
   DEBUG_SYNC(first_thd, "update_gtid_state_before_global_sid_lock");
   global_sid_lock->rdlock();
   DEBUG_SYNC(first_thd, "update_gtid_state_after_global_sid_lock");
+  sleep(1);

   update_gtids_impl_lock_sidnos(first_thd);

=== diff end ===

Steps to reproduce:

1. Apply the above patch on 8.0.23 code and build the server.
2. Start the server
   $ ./mtr --start --mem --suite=binlog_gtid

3. Run sysbench prepare to generate initial data
Prepare: sysbench /usr/share/sysbench/oltp_insert.lua --mysql-db=test --mysql-user=root --threads=24 --time=1500 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=13000 --events=0 --db-driver=mysql prepare
Run:     sysbench /usr/share/sysbench/oltp_insert.lua --mysql-db=test --mysql-user=root --threads=24 --time=1500 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=13000 --events=0 --db-driver=mysql run

4. While sysbench is running on one session, connect to server and execute

   select LOCAL from performance_schema.log_status\G

5. Compare the value of gtid_executed returned from the perfschema against the output of mysqlbinlog and SHOW BINLOG EVENTS.

Below is what I got during my tests on my Ubuntu Laptop with Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz, 6 cores with 64G RAM. Here, the perfschema query says that the transaction with GTID number 30 ends at position 1921863. But, the transaction ends at 1921388.

mysql> select LOCAL from performance_schema.log_status\G
*************************** 1. row ***************************
LOCAL: {"gtid_executed": "0f9b691a-612b-11eb-bbf5-74d83e29c093:1-30", "binary_log_file": "binlog.000001", "binary_log_position": 1921863}
1 row in set (3.51 sec)

SHOW BINLOG EVENTS;
| binlog.000001 | 1920913 | Gtid           |         1 |     1920992 | SET @@SESSION.GTID_NEXT= '0f9b691a-612b-11eb-bbf5-74d83e29c093:30'|
| binlog.000001 | 1920992 | Query          |         1 |     1921067 | BEGIN|
| binlog.000001 | 1921067 | Table_map      |         1 |     1921132 | table_id: 106 (test.sbtest1)|
| binlog.000001 | 1921132 | Write_rows     |         1 |     1921357 | table_id: 106 flags: STMT_END_F|
| binlog.000001 | 1921357 | Xid            |         1 |     1921388 | COMMIT /* xid=53 */|
| binlog.000001 | 1921388 | Gtid           |         1 |     1921467 | SET @@SESSION.GTID_NEXT= '0f9b691a-612b-11eb-bbf5-74d83e29c093:31'|
| binlog.000001 | 1921467 | Query          |         1 |     1921542 | BEGIN|
| binlog.000001 | 1921542 | Table_map      |         1 |     1921607 | table_id: 106 (test.sbtest1)|
| binlog.000001 | 1921607 | Write_rows     |         1 |     1921832 | table_id: 106 flags: STMT_END_F|
| binlog.000001 | 1921832 | Xid            |         1 |     1921863 | COMMIT /* xid=60 */

Here is another sample, the perfschema query says that the GTID number 30 ends at binlog position 1932788. But, the actual transaction at that position is GTID 54, but not 31.

mysql> select LOCAL from performance_schema.log_status\G
*************************** 1. row ***************************
LOCAL: {"gtid_executed": "0f9b691a-612b-11eb-bbf5-74d83e29c093:1-31", "binary_log_file": "binlog.000001", "binary_log_position": 1932788}
1 row in set (3.29 sec)

SHOW BINLOG EVENTS;
| binlog.000001 | 1932313 | Gtid           |         1 |     1932392 | SET @@SESSION.GTID_NEXT= '0f9b691a-612b-11eb-bbf5-74d83e29c093:54'|
| binlog.000001 | 1932392 | Query          |         1 |     1932467 | BEGIN|
| binlog.000001 | 1932467 | Table_map      |         1 |     1932532 | table_id: 106 (test.sbtest1)|
| binlog.000001 | 1932532 | Write_rows     |         1 |     1932757 | table_id: 106 flags: STMT_END_F|
| binlog.000001 | 1932757 | Xid            |         1 |     1932788 | COMMIT /* xid=62 */

Regards,
Venkatesh
[29 Jan 2021 7:17] MySQL Verification Team
Hi Venkatesh,

Yes this helps reproducing. I see how this could happen in real life scenario, just could not get my test setup to get there :)

verified
All best
Bogdan
[29 Jul 2021 15:03] Venkatesh Prasad Venugopal
Hi Bogdan,

Any updates on this?
[21 Sep 2021 16:41] Marcelo Altmann
Here is a description of the problem

    A commit has 3 stages:
    1. Flushing - protected by LOCK_log
    2. Syncing - we release LOCK_log and acquire LOCK_sync
    3. Committing - we release LOCK_sync and acquire LOCK_commit. In this
    stage we also acquire global_sid_lock and update gtid_executed.

    When querying log_status table, we acquire LOCK_log for binlog at
    Log_resource_binlog_wrapper::lock() and global_sid_lock for
    gtid_executed at Log_resource_gtid_state_wrapper::lock().

    In a high level, Gtid works as follow:
    1. a THD ask for the next GTID no for the server ID it's executing the
    transaction.
    2. it generates the binlog event with said GTID.
    3. at stage 3 if THD owns any gtid, it updates gtid_executed

    The problem comes when a query to ps.log_status comes after stage 1 of
    binlog commit. At this stage, commit has released LOCK_log, binlg file
    and position are already updated, Log_resource_binlog_wrapper::lock()
    will move forward. Then while binlog is still processing stage 2,
    Log_resource_gtid_state_wrapper::lock() will be invoked and will acquire
    global_sid_lock. At this point stage 3 of commit will not be able to
    update gtid_executed leaving the log_status query with a mismatch
    between binlog position (new values) vs gtid_executed (old values).

    Proposed Fix:
    Increase the lock at log_status table to also take LOCK_sync and
    LOCK_commit if GTID wrapper has been invoked. This way the query to
    log_status will have to wait until gtid_executed has been fully updated
    guaranteeing consistency with binlog position.
[21 Sep 2021 16:42] Marcelo Altmann
Patch generated based on tag mysql-8.0.26

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug_102175.patch (application/octet-stream, text), 9.86 KiB.

[13 Sep 2023 9:33] MySQL Verification Team
Bug #112331 marked as duplicate of this one