MySQL Bugs: #103596: lock_trx_release_read_locks may get invoked even there's no gap lock

Bug #103596	lock_trx_release_read_locks may get invoked even there's no gap lock
Submitted:	6 May 2021 2:44	Modified:	12 May 2021 13:19
Reporter:	zhai weixiang (OCA)	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	8.0.24	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
During the prepare stage of transaction (in trx_prepare), it'll always invoke lock_trx_release_read_locks function if using RC isolation. But with RC isolation, gap lock is a rare case. I think we should only invoke lock_trx_release_read_locks when there's gap lock hold by transaction, otherwise it may lead to mutex contention of lock system

How to repeat:
read the code

Suggested fix:
a simple patch for verifing the optimization

diff --git a/storage/innobase/include/trx0trx.h b/storage/innobase/include/trx0trx.h
index f22a217..a4645e0 100644
--- a/storage/innobase/include/trx0trx.h
+++ b/storage/innobase/include/trx0trx.h
@@ -1241,6 +1241,7 @@ struct trx_t {
                                  error, or empty. */
   FlushObserver *flush_observer; /*!< flush observer */
 
+  bool has_gap_locks;          /*!< if trx holds GAP lock */
 #ifdef UNIV_DEBUG
   bool is_dd_trx; /*!< True if the transaction is used for
                   doing Non-locking Read-only Read
diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 695d825..ad3e139 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -1197,6 +1197,15 @@ lock_t *RecLock::create(trx_t *trx, const lock_prdt_t *prdt) {
 
   lock_add(lock);
 
+  if (lock->is_gap() || lock->is_next_key_lock()) {
+    trx->has_gap_locks = true;
+  }
+
+#ifdef UNIV_DEBUG
+  /* In order to pass case, always set it to true in debug */
+  trx->has_gap_locks = true;
+#endif
+
   return (lock);
 }
 diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
index 074de0f..dc8fa78 100644
--- a/storage/innobase/trx/trx0trx.cc
+++ b/storage/innobase/trx/trx0trx.cc
@@ -1239,6 +1239,8 @@ static void trx_start_low(
   }
   trx->persists_gtid = false;
 
+  trx->has_gap_locks = false;
+
 #ifdef UNIV_DEBUG
   /* If the transaction is DD attachable trx, it should be AC-NL-RO
   (AutoCommit-NonLocking-ReadOnly) trx */
@@ -2834,8 +2836,19 @@ static void trx_prepare(trx_t *trx) /*!< in/out: transaction */
     /* Stop inheriting GAP locks. */
     trx->skip_lock_inheritance = true;
 
-    /* Release only GAP locks for now. */
-    lock_trx_release_read_locks(trx, true);
+    /* Release only GAP locks for now. Due to description
+    of bug#27189701, this is only for solving conflicts on
+    slave, so we can only let slave thread  or the one who
+    already holds gap lock to release gap locks.
+    For other cases, locks are released at commit stage. This
+    will avoid one more acquirement of global lock_sys mutex for
+    most cases while RC isolation is used. */
+    if (!trx->mysql_thd ||
+        thd_is_replication_slave_thread(trx->mysql_thd) ||
+        trx->has_gap_locks) {
+        /* Release only GAP locks for now. */
+       lock_trx_release_read_locks(trx, true);
+    }
   }
 
   switch (thd_requested_durability(trx->mysql_thd)) {

Hello Zhai,

Thank you for the report and feedback.

regards,
Umesh

Posted by developer:
 
Hello Zhai,

thanks for the patch. 

Could you please share:
- the test scenario you've used 
- the exact MySQL's version you've used
- the gains you've observed from applying the patch
?

I'm having trouble reproducing any gain at all when applying your patch on top of trunk, and running sysbench OLTP RW with pareto or uniform distribution with 64 or 1024 clients, with binlog enabled or disabled. Actually, for some of these combinations the patched performance is even worse, but I guess this is just noise attributable to assembly layout shifts (as for example performance drop for cases with binlog disabled doesn't make much sense, given that the modified portion of the code is not even executed when binlog is disabled, because transactions don't get "prepared" then).

As the patch trades size (adding boolean to trx_t) and complexity (yet another flag, if, custom flow) for speed, I need to know how much we really gain and when - "read the code" is not a sufficient reproduction recipe for that.

Hi, 

The performance gain is not that obvious. I noticed the mutex contention from perf record/report while running tpcc. With the patch, the contention is gone. You can observe the gain by checking the locking time from performance schema. It should have less locking time of lock_sys with the patch.  (Remember testing with RC isolation )

Could you please share the exact performance_schema query and results you've got with and without the patch?

Hi, I finally got some time to test again with lastest version, and  you are right, I didn't gain better performance by running  sysbench lua. I wrote the patch while we are using 8.0.18 when lock system is not sharded. In 8.0.24, lock_trx_release_read_locks didn't acquire global exclusive lock but using shared lock and trx_t::mutex, so the gain is not that obvious. But I still believe we can save cpu time by not iterating lock list of the transaction holds,especially when the transaction holds many many lock objects.

Posted by developer:
 
Hi Zhai, 
thank you for double-checking! 
We'll keep your suggestion in mind as something we might use in future if we observe performance bottleneck in this area. 
For now, we close this issue as "no repro". Feel free to reopen if you find a repeatable test demonstrating the gain.