Bug #112331 GTID is behind the position in P_S.log_status
Submitted: 13 Sep 2023 6:08 Modified: 13 Sep 2023 9:33
Reporter: Jinyou Ma Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Performance Schema Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[13 Sep 2023 6:08] Jinyou Ma
Description:
There are 4 ( 0 - 3 ) stages in the binlog commit.
The log_status only hold mutex LOCK_log to protect binlog.

inline mysql_mutex_t *get_log_lock() { return &LOCK_log; }
void Log_resource_binlog_wrapper::lock() {
  mysql_mutex_lock(binlog->get_log_lock());
}

The binlog position is updated at the state #1, but the GTID will be updated in the state #3.

In the stage #2, It will release the mutex LOCK_log.

  if (change_stage(thd, Commit_stage_manager::SYNC_STAGE, wait_queue, &LOCK_log,
                   &LOCK_sync)) {
    DBUG_PRINT("return", ("Thread ID: %u, commit_error: %d", thd->thread_id(),
                          thd->commit_error));
    return finish_commit(thd);
  }

If we get the result in the stage #2, we will get the GTID behind the file position.

When the bug occurs, the GTID is 1-14, and the position is 1937.
select local from performance_schema.log_status\G
*************************** 1. row ***************************
local: {"gtid_executed": "00008035-0000-0000-0000-000000008035:1-14", "binary_log_file": "binlog.000004", "binary_log_position": 1937}

If we get the result after the stage 3, the GTID is 1-15.
select local from performance_schema.log_status\G
*************************** 1. row ***************************
local: {"gtid_executed": "00008035-0000-0000-0000-000000008035:1-15", "binary_log_file": "binlog.000004", "binary_log_position": 1937}

Percona server use 2 more mutex LOCK_sync, and LOCK_commit to fix this issue.

void Log_resource_gtid_state_wrapper::lock() {
  DEBUG_SYNC(current_thd, "log_resource_gtid_lock");
  mysql_mutex_lock(binlog->get_sync_lock());
  mysql_mutex_lock(binlog->get_commit_lock());
  global_sid_lock->wrlock();
}

However, It is still a bug in Percona version when the binlog_order_commits is OFF.

How to repeat:
1. start a gdb
# gdb
(gdb) set non-stop on
(gdb) attach ${the process id of mysqld}
(gdb) set substitute-path ../../mysql-8.0.34/  /home/root/mysql-server/
(gdb) b sql/binlog.cc:9036
Breakpoint 1 at 0x1d5d435: file ../../mysql-8.0.34/sql/binlog.cc, line 9036.
(gdb) c -a
Continuing.

2. start a dml or ddl sql
# the sql will suspend at the break point
mysql [localhost:8035] {msandbox} ((none)) > create database a;

3. get the result from log_status
# the sql will wait the LOCK_log mutex
time:05:52)[db: (none)] select local from performance_schema.log_status\G

4. continue the dml or ddl query on the gdb
(gdb) thread 38
[Switching to thread 38 (Thread 0x7fe25c30f640 (LWP 187653))]
#0  MYSQL_BIN_LOG::ordered_commit () at ../../mysql-8.0.34/sql/binlog.cc:9036
9036      if (change_stage(thd, Commit_stage_manager::SYNC_STAGE, wait_queue, &LOCK_log,
(gdb) n
(gdb) c -a
Continuing.

5. get wrong GTID from the log_status
time:05:52)[db: (none)] select local from performance_schema.log_status\G
*************************** 1. row ***************************
local: {"gtid_executed": "00008035-0000-0000-0000-000000008035:1-14", "binary_log_file": "binlog.000004", "binary_log_position": 1937}

time:05:52)[db: (none)] select local from performance_schema.log_status\G
*************************** 1. row ***************************
local: {"gtid_executed": "00008035-0000-0000-0000-000000008035:1-15", "binary_log_file": "binlog.000004", "binary_log_position": 1937}

Suggested fix:
Adding 2 more mutex LOCK_sync, and LOCK_commit to fix this issue.

void Log_resource_gtid_state_wrapper::lock() {
  DEBUG_SYNC(current_thd, "log_resource_gtid_lock");
  mysql_mutex_lock(binlog->get_sync_lock());
  mysql_mutex_lock(binlog->get_commit_lock());
  global_sid_lock->wrlock();
}
[13 Sep 2023 9:27] Marcelo Altmann
Seems duplicate of https://bugs.mysql.com/bug.php?id=102175
[13 Sep 2023 9:33] MySQL Verification Team
Hello Jinyou Ma,

Thank you for the report and feedback.
I agree with Marcelo, this is a duplicate of Bug #102175. 

regards,
Umesh