Bug #92975 SQL thread stuck in init state waiting for asynchronous rollback to complete
Submitted: 27 Oct 2018 1:04 Modified: 20 Jan 2020 3:47
Reporter: Jaime Lichauco Email Updates:
Status: No Feedback Impact on me:
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.19, 5.7.21 OS:Any
Assigned to: MySQL Verification Team CPU Architecture:Any
Tags: asynchronous, init, replication, rollback

[27 Oct 2018 1:04] Jaime Lichauco
We have seen recent occurrences of replication hanging indefinitely on slaves running MySQL 5.7.19 and 5.7.21. The process list shows the system user thread with command Connect and state init.

In this case we have automation that attempts to run STOP SLAVE if it detects a possible replication stall. Unfortunately attempts at running STOP SLAVE are blocked leading to a pile-up of STOP SLAVE requests. Running pt-pmp against the database shows the following stack trace for the SQL thread.


From reviewing the InnoDB code it appears the SQL thread is in waiting for an asynchronous rollback to complete.

        @return true if it is a forced rollback, asynchronously */
        static bool is_forced_rollback(const trx_t* trx)

                return((trx->in_innodb & TRX_FORCE_ROLLBACK)) > 0;

        Wait for the asynchronous rollback to complete, if it is in progress */
        static void wait(trx_t* trx)

                ulint   loop_count = 0;
                /* start with optimistic sleep time - 20 micro seconds. */
                ulint   sleep_time = 20;

                while (is_forced_rollback(trx)) {

                        /* Wait for the async rollback to complete */


                        /* If the wait is long, don't hog the cpu. */
                        if (loop_count < 100) {
                                /* 20 microseconds */
                                sleep_time = 20;
                        } else if (loop_count < 1000) {
                                /* 1 millisecond */
                                sleep_time = 1000;
                        } else {
                                /* 100 milliseconds */
                                sleep_time = 100000;



However checking other MySQL and InnoDB metrics (e.g. SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX) show no other transaction in a rollback state.

Is it possible that some other thread incorrectly modified trx->in_innodb for the SQL thread?

How to repeat:
To be determined.

Suggested fix:
Restart mysqld
[7 Nov 2018 12:25] sudheer gadipathi
We are facing this issue. Is there any update from the MySQL team?
[22 Nov 2018 13:22] MySQL Verification Team

I cannot reproduce this, not with 5.7.24 nor with 8.0.13!

Can you upload - from the affected system, next time it happens:
 - both config files
 - both log files
 - binlog
 - relay log
 - show master status (from master)
 - show slave status (from slave)

[23 Dec 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[28 Feb 2019 8:26] Zhenghu Wen
We are facing this issue.in 5.7.20

Linux rdsrds-4-mobi-check-online1524624309467 3.2.73 #1 SMP Tue Feb 16 16:06:26 CST 2016 x86_64 GNU/Linux

Thread 15 (Thread 0x7f205b3a0700 (LWP 23175)):
#0  0x00007f22857dc91d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000001158090 in os_thread_sleep (tm=<optimized out>) at /home/hzwenzhh/mysql-5.7.20-v1a/storage/innobase/os/os0thread.cc:279
#2  0x00000000010ec430 in TrxInInnoDB::wait (trx=trx@entry=0x7f227a999b10) at /home/hzwenzhh/mysql-5.7.20-v1a/storage/innobase/include/trx0trx.h:1557
#3  0x00000000010dfac3 in TrxInInnoDB::enter (trx=0x7f227a999b10, disable=<optimized out>) at /home/hzwenzhh/mysql-5.7.20-v1a/storage/innobase/include/trx0trx.h:1477
#4  0x00000000010e37d0 in innobase_xa_prepare (hton=0x3a8bf40, thd=0x1188ba10, prepare_trx=true) at /home/hzwenzhh/mysql-5.7.20-v1a/storage/innobase/handler/ha_innodb.cc:17032
#5  0x000000000098f7e2 in ha_prepare (thd=0x1188ba10) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/handler.cc:1468
#6  0x0000000000e8f79c in Sql_cmd_xa_prepare::trans_xa_prepare (this=<optimized out>, thd=thd@entry=0x1188ba10) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/xa.cc:794
#7  0x0000000000e90448 in Sql_cmd_xa_prepare::execute (this=<optimized out>, thd=0x1188ba10) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/xa.cc:835
#8  0x0000000000f6c47e in XA_prepare_log_event::do_commit (this=0x7f20449be090, thd=0x1188ba10) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/log_event.cc:7773
#9  0x0000000000f68f73 in Xid_apply_log_event::do_apply_event_worker (this=0x7f20449be178, w=0x14b04820) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/log_event.cc:7451
#10 0x0000000000fc772b in slave_worker_exec_job_group (worker=0x14b04820, rli=0x118924d0) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/rpl_rli_pdb.cc:2594
#11 0x0000000000faccdb in handle_slave_worker (arg=arg@entry=0x14b04820) at /home/hzwenzhh/mysql-5.7.20-v1a/sql/rpl_slave.cc:6143
#12 0x00000000010591b1 in pfs_spawn_thread (arg=0x11df68b0) at /home/hzwenzhh/mysql-5.7.20-v1a/storage/perfschema/pfs.cc:2190
#13 0x00007f22857d4b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#14 0x00007f228453afbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#15 0x0000000000000000 in ?? ()
[19 Dec 2019 20:29] Maciej Dobrzanski
I don't have any new details as the best I can offer at this point are backtraces from gdb, but we've seen this happen once on 5.7.26.

           ID: 6820
         USER: system user
           DB: <..>
      COMMAND: Connect
         TIME: 7585
        STATE: init
         INFO: INSERT INTO <..> (<..>) VALUES (<..>)
      TIME_MS: 7584892
    ROWS_SENT: 0

[20 Dec 2019 3:47] MySQL Verification Team

Thanks but still can't reproduce it with available data.
[21 Jan 2020 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[8 Dec 2020 13:39] Jean-François Gagné
Maybe related: Bug#101907.
[10 Dec 2020 11:47] MySQL Verification Team

With regards to XA, a number of XA issues were fixed in 5.7.21, 5.7.22, and 5.7.23 so you should really upgrade to the latest 5.7