Bug #119460 InnoDB transaction may get stuck in an infinite loop when releasing Read Locks
Submitted: 25 Nov 10:08
Reporter: Xizhe Zhang (OCA) Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:8.0.44 OS:Any
Assigned to: CPU Architecture:Any

[25 Nov 10:08] Xizhe Zhang
Description:
Whether it's an XA transaction or a commit with Binlog enabled, it will trigger the InnoDB transaction to complete the commit using the 2PC protocol. When an InnoDB transaction needs to `prepare`, under the `READ_UNCOMMITTED` and `READ_COMMITTED` isolation levels, the InnoDB transaction will release the READ GAP LOCK it holds in advance.

Within the `lock_trx_release_read_locks` function, locks are released under two scenarios: optimistic and pessimistic. Optimistic release allows five attempts; if all attempts fail, it enters a pessimistic release loop, which continues indefinitely until all locks held by the current transaction have been processed:

```cpp
void lock_trx_release_read_locks(trx_t *trx, bool only_gap) {
  const size_t MAX_FAILURES = 5;

  // optimistic
  for (size_t failures = 0; failures < MAX_FAILURES; ++failures) {
    if (locksys::try_release_read_locks_in_s_mode(trx, only_gap)) {
      return;
    }
    std::this_thread::yield();
  }

  // pessimistic
  while (!locksys::try_release_read_locks_in_x_mode(trx, only_gap)) {
    std::this_thread::yield();
  }
}
```

However, a 1s timeout is set during pessimistic release. If this transaction holds many locks (and they are not gap locks) and does not complete processing within 1 second, it will start processing from the beginning again, over and over. This will cause the transaction to get stuck here indefinitely!

```cpp
static bool try_release_read_locks_in_x_mode(trx_t *trx,
                                                           bool only_gap) {
  Global_exclusive_latch_guard guard{UT_LOCATION_HERE};
  const auto started_at = std::chrono::steady_clock::now();
  trx_mutex_enter_first_of_two(trx);

  for (auto lock : trx->lock.trx_locks.removable()) {
    // If we don't process all the Locks within 1 second,
    // we'll have to start all over again!
    if (MAX_CS_DURATION < std::chrono::steady_clock::now() - started_at) {
      trx_mutex_exit(trx);
      return false;
    }

    lock_release_read_lock(lock, only_gap);
  }

  trx_mutex_exit(trx);
  return true;
}
```

How to repeat:
I wrote a test case modeled after `lock_trx_release_read_locks_in_x_mode.test` to simulate this problem. I also slightly modified the `try_release_read_locks_in_x_mode` function to simulate the slow processing speed during iteration. All modifications are as follows:

```
diff --git a/mysql-test/t/bug_release_locks.test b/mysql-test/t/bug_release_locks.test
new file mode 100644
index 00000000000..60195cc2b60
--- /dev/null
+++ b/mysql-test/t/bug_release_locks.test
@@ -0,0 +1,49 @@
+--source include/have_debug_sync.inc
+--source include/count_sessions.inc
+
+# keep in sync with MAX_FAILURES defined in lock_trx_release_read_locks()
+--let MAX_FAILURES=5
+--let i=0
+while($i<=$MAX_FAILURES)
+{
+  --eval CREATE TABLE t$i (id INT PRIMARY KEY) ENGINE=InnoDB
+  --inc $i
+}
+INSERT INTO t0 (id) VALUES (1);
+
+--connect (c0, localhost, root,,)
+  SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
+  XA START 'x';
+  # create at least MAX_FAILURES implicit locks
+  --let i=1
+  while($i<=$MAX_FAILURES)
+  {
+    --eval INSERT INTO t$i (id) VALUES (1);
+    --inc $i
+  }
+  # create at least 1 explicit lock
+  SELECT * FROM t0 WHERE id=1 FOR UPDATE;
+  XA END 'x';
+  SET DEBUG='+d,simulate_release_many_locks';
+  SET DEBUG_SYNC='try_relatch_trx_and_shard_and_do_noted_expected_version
+    SIGNAL c0_noted_expected_version
+    WAIT_FOR c0_can_go
+    EXECUTE 5';
+  --send XA PREPARE 'x'
+
+--let i=1
+while($i<=$MAX_FAILURES)
+{
+  --connect (c$i, localhost, root,,)
+    BEGIN;
+    SET DEBUG_SYNC = 'now WAIT_FOR c0_noted_expected_version';
+    --eval SET DEBUG_SYNC='lock_wait_will_wait SIGNAL c0_can_go'
+    --send_eval SELECT * FROM t$i FOR SHARE
+
+  --inc $i
+}
+
+--connection c0
+  --reap
+  XA COMMIT 'x';
+
diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 0efa1f4e139..bb3fe9bc29d 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -4098,6 +4098,9 @@ It is called during XA prepare to release locks early.
     DEBUG_SYNC_C("lock_trx_release_read_locks_in_x_mode_will_release");
 
     lock_release_read_lock(lock, only_gap);
+
+    fprintf(stderr, "trx_locks.count = %ld\n", trx->lock.trx_locks.get_length());
+    DBUG_EXECUTE_IF("simulate_release_many_locks", sleep(1););
   }
 
   trx_mutex_exit(trx);
```

Compiling and running the test cases reveals that connection `c0` is stuck in an endless loop of `lock_trx_release_read_locks`. Furthermore, `tailf mysqld.1.err` shows that the number of locks to be processed remains unchanged:

```
$tailf mysqld.1.err
trx_locks.count = 12
trx_locks.count = 12
trx_locks.count = 12
trx_locks.count = 12
trx_locks.count = 12
trx_locks.count = 12
...
```

Suggested fix:
I think pessimistic release also needs a limit on the number of failures. Once the maximum number of failures is exceeded, the timeout limit should be ignored.
[25 Nov 10:26] Xizhe Zhang
Hello, Verification Team, this is my patch.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bugfix_release_locks.diff (application/octet-stream, text), 5.76 KiB.