Bug #101305 trx->lock.n_rec_locks inconsistent with real num of trx rec locks
Submitted: 24 Oct 2020 8:55 Modified: 17 Feb 10:52
Reporter: Fungo Wang (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: inconsistent, lock, n_rec_locks

[24 Oct 2020 8:55] Fungo Wang
Description:
`trx->lock.n_rec_locks` is used to count the num of rec locks, it's increased by `lock_rec_set_nth_bit()`, and decreased by `lock_rec_reset_nth_bit()`.

When the bit for a heap no is already set in the rec lock bitmap, we should not call `lock_rec_set_nth_bit()`, otherwise the counter `trx->lock.n_rec_locks` will be incosistent with the real rec lock number.

How to repeat:
Read the code in `lock0lock.cc`.

1. In `lock_rec_lock_fast()`, there is a precheck before set
1538      /* If the nth bit of the record lock is already set
1539      then we do not set a new lock bit, otherwise we do
1540      set */
1541      if (!lock_rec_get_nth_bit(lock, heap_no)) {
1542        lock_rec_set_nth_bit(lock, heap_no);
1543        status = LOCK_REC_SUCCESS_CREATED;
1544      }

2. But in `lock_rec_add_to_queue()`, we do unconditional set

451        ut_ad(!found_waiter_before_lock ||
452              (ULINT_UNDEFINED == lock_rec_find_set_bit(lock)));
453
454        lock_rec_set_nth_bit(lock, heap_no);

Suggested fix:
Fix the unconditional set in `lock_rec_add_to_queue()`:

diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 978d757f211..51b7edd9dd6 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -1451,7 +1451,13 @@ static void lock_rec_add_to_queue(ulint type_mode, const buf_block_t *block,
         ut_ad(!found_waiter_before_lock ||
               (ULINT_UNDEFINED == lock_rec_find_set_bit(lock)));

-        lock_rec_set_nth_bit(lock, heap_no);
+        /* If the nth bit of the record lock is already set
+        then we do not set a new lock bit, otherwise we do
+        set */
+        if (!lock_rec_get_nth_bit(lock, heap_no)) {
+          lock_rec_set_nth_bit(lock, heap_no);
+        }
+
         if (found_waiter_before_lock) {
           lock_rec_move_granted_to_front(lock, RecID{lock, heap_no});
         }
[24 Oct 2020 8:57] Fungo Wang
Update bug title
[26 Oct 2020 12:48] MySQL Verification Team
Hi Mr. Wang,

Thank you for your bug report.

We have analysed the code and we agree with your analysis.

Thank you for all and especially for the patch.

Verified as reported.
[26 Oct 2020 14:39] Jakub Lopuszanski
Thanks, I confirm there is a bug here.
First of all, this bug report makes it apparent we should introduce following assert:

--- a/storage/innobase/include/lock0priv.ic
+++ b/storage/innobase/include/lock0priv.ic
@@ -86,8 +86,9 @@ void lock_rec_set_nth_bit(lock_t *lock, /*!< in: record lock */

   byte_index = i / 8;
   bit_index = i % 8;
-
-  ((byte *)&lock[1])[byte_index] |= 1 << bit_index;
+  byte *const the_byte = ((byte *)&lock[1]) + byte_index;
+  ut_ad(!(*the_byte & (1 << bit_index)));
+  *the_byte |= 1 << bit_index;

   lock->trx->lock.n_rec_locks.fetch_add(1, std::memory_order_relaxed);
 }

It looks like the following scenario can deterministically (well, modulo optimizer choice of full scan vs point select?) trigger above assert:

CREATE TABLE t1 (id INT PRIMARY KEY);
INSERT INTO t1 (id) VALUES (1),(2),  (4),(5);
# Initially purge_sys->done.trx is 0, which srv_export_innodb_status() handles
# by reporting INNODB_PURGE_TRX_ID_AGE 0. A workaround is to force at least one
# round of purge to update this value. See wait_innodb_all_purged below.
SET GLOBAL innodb_purge_stop_now = ON;
SET GLOBAL innodb_purge_run_now = ON;

--connect (c1, localhost, root,,)
	BEGIN;
	DELETE FROM t1 WHERE id=2;

--connect (c2, localhost, root,,)
	BEGIN;
	# This doesn't have to wait, but creates an S,GAP lock on gap before 4
	SELECT * FROM t1 WHERE id=3 FOR SHARE;
	SET DEBUG_SYNC = 'lock_wait_will_wait SIGNAL c2_will_wait WAIT_FOR c2_can_go';
	# This one has to wait for the DELETE of id=2 to COMMIT or ROLLBACK
	--send
	SELECT * FROM t1 WHERE id=2 FOR SHARE;

--connection c1
	SET DEBUG_SYNC = 'now WAIT_FOR c2_will_wait';
	# This will first release the X,REC_NOT_GAP lock from id=2 and grant it to c2
	COMMIT;
# Purge will remove the no longer needed row with id=2 and move all
# locks from it to the gap before 4.
# As c2 already has S,GAP lock before 4 we will double count it.
--source include/wait_innodb_all_purged.inc

# Clean up the test
--connection default
	SET DEBUG_SYNC = 'now SIGNAL c2_can_go';
--disconnect c2
--disconnect c1

DROP TABLE t1;

While the contributed patch is correct, I think it's also correct to move more code inside the new "if" because if we do not add a new lock, then there's no need to move anything.

@@ -1449,9 +1449,11 @@ static void lock_rec_add_to_queue(ulint type_mode, const buf_block_t *block,
         ut_ad(!found_waiter_before_lock ||
               (ULINT_UNDEFINED == lock_rec_find_set_bit(lock)));

-        lock_rec_set_nth_bit(lock, heap_no);
-        if (found_waiter_before_lock) {
-          lock_rec_move_granted_to_front(lock, RecID{lock, heap_no});
+        if (!lock_rec_get_nth_bit(lock, heap_no)) {
+          lock_rec_set_nth_bit(lock, heap_no);
+          if (found_waiter_before_lock) {
+            lock_rec_move_granted_to_front(lock, RecID{lock, heap_no});
+          }
         }
         return;
       }

This is a cosmetic difference, because according to the comment and asserts, `found_waiter_before_lock` would be false anyway.
[27 Oct 2020 12:02] MySQL Verification Team
Thank you, Jakub.
[8 Dec 2020 21:58] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.23  release, and here's the proposed changelog entry from the documentation team:

In some circumstances, such as when an existing gap lock inherits a lock
from a deleted record, the number of locks that appear in the
INFORMATION_SCHEMA.INNODB_TRX table could diverge from the actual number of
record locks. 

Thanks to Fungo Wang from Alibaba for the patch.
[9 Dec 2020 13:37] MySQL Verification Team
Thank you, Daniel.