Bug #99643 innobase_commit_by_xid/innobase_rollback_by_xid is not thread safe
Submitted: 20 May 2020 7:24 Modified: 26 Jul 12:14
Reporter: zhai weixiang (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0, 5.7 OS:Any
Assigned to: CPU Architecture:Any

[20 May 2020 7:24] zhai weixiang
Description:
In function innobase_commit_by_xid/innobase_rollback_by_xid, it contructs TrxInInnoDB firstly which enters innodb by default, and exit innodb after freeing trx_t(put back trx_t to pool), This means another thread may get the trx_t object while it exits innodb by modifying same trx_t object. It's not thread safe. We should explicitly invoke end_stmt before trx_free_for_background

How to repeat:
read the code

Suggested fix:
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 146f440..c8d81ed 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -19041,13 +19041,15 @@ static xa_status_code innobase_commit_by_xid(
   trx_t *trx = trx_get_trx_by_xid(xid);

   if (trx != nullptr) {
-    TrxInInnoDB trx_in_innodb(trx);
+    TrxInInnoDB::begin_stmt(trx);

     innobase_commit_low(trx);
     ut_ad(trx->mysql_thd == nullptr);
     /* use cases are: disconnected xa, slave xa, recovery */
     trx_deregister_from_2pc(trx);
     ut_ad(!trx->will_lock); /* trx cache requirement */
+    TrxInInnoDB::end_stmt(trx);
+
     trx_free_for_background(trx);

     return (XA_OK);
@@ -19069,12 +19071,13 @@ static xa_status_code innobase_rollback_by_xid(
   trx_t *trx = trx_get_trx_by_xid(xid);

   if (trx != nullptr) {
-    TrxInInnoDB trx_in_innodb(trx);
-
+    TrxInInnoDB::begin_stmt(trx);
     int ret = innobase_rollback_trx(trx);

     trx_deregister_from_2pc(trx);
     ut_ad(!trx->will_lock);
+    TrxInInnoDB::end_stmt(trx);
+
     trx_free_for_background(trx);

     return (ret != 0 ? XAER_RMERR : XA_OK);
[20 May 2020 13:37] MySQL Verification Team
Hi Mr. weixiang,

Thank you for your bug report and patch.

However, there are some details that are not clear.

First of all, there is a check on whether a pointer is NULL or not.

Next, why would we need to add begin and end statement in that loop, when we are dealing with those elsewhere ???

Last, but not least, this would remove all benefits for the existing pool, wouldn't it ?????
[21 May 2020 2:44] zhai weixiang
Hi, 

TrxInInnoDB trx_in_innodb will implicitly enter innodb  while constructing it(same behavior as begin_stmt(trx)), and exit innodb while destructing the object(same as end_stmt(trx)), but hte destructing happens after invoking trx_free_in_background which means:
- current thread (thread 1)put back trx_t to pool, and then exit innodb by modifying the trx_t's value (trx_t::in_depth/in_innodb)
- meanwhile, the trx_t is already in pool and can be reused by another thread(thread2).
So there maybe two thread accessing same trx_t pointer.

Let me give you an example:

thread1:
- invoke trx_free_in_background, now trx_t is in pool, trx_t::in_depth = 1,trx_t::in_innodb++

thread2: get trx_t from pool, and enter innodb: trx_t::in_depth++ (=2)

thread1: destructing TrxInInnoDB object: decrease trx_t::in_depth-- (= 1)
thread2: exit innodb, decrease trx_t::in_depth-- (=0)

thread1: check in_depth = 0, and continue to decrease trx_t::in_innodb (=0)
thread2: check in_depth = 0, and continue to decrease trx_t::in_innodb ((unsigned)-1)

thread2: enter innodb again, and start waiting (TrxInInnoDB::wait() in include/trx0trx.h) forever...because all flag of trx_t::in_innodb is true
[21 May 2020 12:31] MySQL Verification Team
Hi Mr. weixiang,

Thanks for the feedback.

You have convinced me, so I will verify this bug and will let our colleagues analyse it further.

I have also concluded that this is 8.0-only bug.

Verified as reported.
[26 Sep 2021 13:14] MySQL Verification Team
Bug #105036 marked as duplicate of this one.
[13 Apr 2023 2:15] Shaohua Wang
Hi Weixiang, there must be some other bugs here.

See the bug: https://bugs.mysql.com/bug.php?id=110652

there are many different values of trx->innodb.
[13 Apr 2023 11:44] MySQL Verification Team
Thank you, Mr. Wang ......
[22 Sep 2023 11:30] MySQL Verification Team
Hi,

We do not think that this report is the original one for the following bug:

https://bugs.mysql.com/bug.php?id=112425
[26 Jul 10:42] zhai weixiang
This bug is fixed by commit  bc9b5f0327fb0104542f9edbeb456e93a5121df5
Can be closed now
[26 Jul 12:14] MySQL Verification Team
Thank you, very much, Zhai !!!!