Bug #110561 select * from performance_schema.data_locks returns wrong result
Submitted: 30 Mar 2023 7:55 Modified: 27 Jun 2023 5:55
Reporter: Wen He (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: data_locks

[30 Mar 2023 7:55] Wen He
Description:
During purging records, `lock_rec_inherit_to_gap` waiting lock requests on rec are inherited as GRANTED gap locks. However, the result of selecting from performance_schema.data_locks shows it as WAITING. 
I repeats the error with the give case bellow on 8.0.23. Due to the concurrency between purge threads and the thread executing the insertion, it may not always be repeatable. I believe other 8.0.xx versions also have the same issue, as the related code kept unchanged in all versions.

`lock_rec_inherit_to_gap` might add a granted S,GAP lock at the end of the trx->lock->trx_locks list during purging. If the trx has a wait_lock, the newly added granted S,GAP lock will be behined the wait_lock in the trx->lock->trx_locks list. When executing `select * from performance_schema.data_locks`, `Innodb_data_lock_iterator::scan_trx` takes all locks starting from wait_lock to the end of trx->lock->trx_locks list as in WAITING lock status. Thus, a granted S,GAP lock is regarded as a waiting S,GAP.

How to repeat:
set global transaction_isolation='READ-COMMITTED';
set global innodb_lock_wait_timeout = 10000000;
drop table if exists ofs;

CREATE TABLE ofs (
  inode_id int(11) NOT NULL,
  name varchar(10) DEFAULT NULL,
  parent_id int(11) DEFAULT NULL,
  PRIMARY KEY (inode_id),
  UNIQUE KEY key_pid_name (parent_id,name),
  UNIQUE KEY key_pid_iid (parent_id,inode_id)
) ENGINE=InnoDB;

insert into ofs values(97, "00004", 224),(98, "00005", 224), (99, "00006", 224),(100, "00007", 224);

-- Session 1
begin;
delete from ofs where parent_id = 224 and name ="00004";

-- Session 2
begin;
delete from ofs where parent_id = 224 and name ="00005";

-- Session 3
begin;
insert into ofs values(101, "00004", 224);

-- Session 4
begin;
insert into ofs values(102, "00005", 224);

-- Session 0
select ENGINE_LOCK_ID, ENGINE_TRANSACTION_ID, THREAD_ID, EVENT_ID, OBJECT_INSTANCE_BEGIN, LOCK_TYPE, LOCK_MODE, LOCK_STATUS, LOCK_DATA from performance_schema.data_locks where index_name = 'key_pid_name';
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+
| ENGINE_LOCK_ID                         | ENGINE_TRANSACTION_ID | THREAD_ID | EVENT_ID | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_MODE     | LOCK_STATUS | LOCK_DATA        |
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+
| 140242751062200:26:5:3:140242761913024 |                148780 |        52 |     2497 |       140242761913024 | RECORD    | S             | WAITING     | 224, '00005', 98 |
| 140242751061160:26:5:2:140242761906432 |                148779 |        51 |     2417 |       140242761906432 | RECORD    | S             | WAITING     | 224, '00004', 97 |
| 140242751060120:26:5:3:140242761900128 |                148774 |        50 |     1894 |       140242761900128 | RECORD    | X,REC_NOT_GAP | GRANTED     | 224, '00005', 98 |
| 140242751059080:26:5:2:140242761893840 |                148773 |        49 |     1811 |       140242761893840 | RECORD    | X,REC_NOT_GAP | GRANTED     | 224, '00004', 97 |
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+

-- Session 1
commit;

-- Session 0
select ENGINE_LOCK_ID, ENGINE_TRANSACTION_ID, THREAD_ID, EVENT_ID, OBJECT_INSTANCE_BEGIN, LOCK_TYPE, LOCK_MODE, LOCK_STATUS, LOCK_DATA from performance_schema.data_locks where index_name = 'key_pid_name';
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+
| ENGINE_LOCK_ID                         | ENGINE_TRANSACTION_ID | THREAD_ID | EVENT_ID | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_MODE     | LOCK_STATUS | LOCK_DATA        |
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+
| 140242751062200:26:5:3:140242761913024 |                148780 |        52 |     2497 |       140242761913024 | RECORD    | S             | WAITING     | 224, '00005', 98 |
| 140242751061160:26:5:3:140242761906784 |                148779 |        51 |     3039 |       140242761906784 | RECORD    | S             | WAITING     | 224, '00005', 98 |
| 140242751061160:26:5:3:140242761907136 |                148779 |        36 |    23261 |       140242761907136 | RECORD    | S,GAP         | WAITING     | 224, '00005', 98 |
| 140242751060120:26:5:3:140242761900128 |                148774 |        50 |     1894 |       140242761900128 | RECORD    | X,REC_NOT_GAP | GRANTED     | 224, '00005', 98 |
+----------------------------------------+-----------------------+-----------+----------+-----------------------+-----------+---------------+-------------+------------------+

Note: The LOCK_STATUS of the S,GAP RECORD lock should be GRANTED.

Suggested fix:
As the trx->lock->trx_locks list doesn't need to guarantee that positions of granted locks in the list must be ahead of waiting locks, changing the lock status judgement in function `Innodb_data_lock_iterator::scan_trx` is enough.

Current:
size_t Innodb_data_lock_iterator::scan_trx(const trx_t *trx) {
  const char *lock_status_str = "GRANTED";
  const lock_t *lock;
  lock_t *wait_lock = trx->lock.wait_lock;
  
  for (lock = lock_get_first_trx_locks(&trx->lock); lock != nullptr;
       lock = lock_get_next_trx_locks(lock)) {
    
    if (lock == wait_lock) {
      lock_status_str = "WAITING";
    }
    
  }

}

Fix:
size_t Innodb_data_lock_iterator::scan_trx(const trx_t *trx) {
  const lock_t *lock;
  
  for (lock = lock_get_first_trx_locks(&trx->lock); lock != nullptr;
       lock = lock_get_next_trx_locks(lock)) {
    
    if (lock->type_mode & LOCK_WAIT) {
      lock_status_str = "WAITING";
    } else {
      lock_status_str = "GRANTED";
    }
    
  }

}
[27 Jun 2023 2:39] Wen He
Any update?
[27 Jun 2023 5:55] MySQL Verification Team
Hello Wen He,

Thank you for the report and feedback.
Verified as described.

regards,
Umesh