Bug #85560 SQL thread has hung and the state is "Waiting for dependent transaction to commi
Submitted: 21 Mar 2017 9:08 Modified: 14 Nov 2018 9:55
Reporter: kfpanda kf Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6.40 OS:Any
Assigned to: Bogdan Kecman CPU Architecture:Any

[21 Mar 2017 9:08] kfpanda kf
Description:
Description:
When running mysql with MTS(slave_parallel_workers = 4, slave_parallel_type=LOGICAL_CLOCK), 
SQL thread has hung and the state is "Waiting for dependent transaction to commit".
Seconds_Behind_Master has increased continually.

mysql> show processlist;
+------+-------------+-----------+------+---------+------+---------------------------------------------+------------------+
| Id   | User        | Host      | db   | Command | Time | State                                       | Info             |
+------+-------------+-----------+------+---------+------+---------------------------------------------+------------------+
|    6 | root        | localhost | NULL | Query   |    0 | starting                                    | show processlist |
| 1043 | system user |           | NULL | Connect |   28 | Waiting for master to send event            | NULL             |
| 1044 | system user |           | NULL | Connect | 3603 | Waiting for dependent transaction to commit | NULL             |
| 1045 | system user |           | NULL | Connect | 3571 | Executing event                             | NULL             |
| 1046 | system user |           | NULL | Connect | 3571 | Waiting for an event from Coordinator       | NULL             |
| 1047 | system user |           | NULL | Connect | 3571 | Waiting for an event from Coordinator       | NULL             |
| 1048 | system user |           | NULL | Connect | 3571 | Waiting for an event from Coordinator       | NULL             |
+------+-------------+-----------+------+---------+------+---------------------------------------------+------------------+
7 rows in set (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.5.61
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 610273715
               Relay_Log_File: relay-bin.000005
                Relay_Log_Pos: 956992713
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 956992500
              Relay_Log_Space: 1684032199
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 2157
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 611643
                  Master_UUID: 7bd12b57-0e07-11e7-ae14-fa163e57891d
             Master_Info_File: /var/lib/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for dependent transaction to commit
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 7bd12b57-0e07-11e7-ae14-fa163e57891d:1-499285  #IO thread is OK, increase continually
            Executed_Gtid_Set: 7bd12b57-0e07-11e7-ae14-fa163e57891d:1-75109  #it keep unchanged.
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

MySQL version: MySQL5.7.17 & MySQL5.7.16

OS version and kernel:
[root@data]# uname -r
3.10.0-123.el7.x86_64
[root@data]# cat /etc/redhat-release 
CentOS Linux release 7.0.1406 (Core) 

By the way,
using the same configure and method, I smoothly run MTS in the following environment.
[root@result]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core) 
[root@result]# uname -r
3.10.0-327.el7.x86_64
MySQL version: MySQL5.7.17 & MySQL5.7.16

How to repeat:
1) Start slave with slave_parallel_workers =4(>0) and slave_parallel_type=LOGICAL_CLOCK

2) run a sysbench on master
./sysbench --test=tests/db/oltp.lua --oltp-table-size=1000000 --oltp-tables-count=10 --num-threads=40 --mysql-db=dbtest --mysql-user=root --mysql-password=root --mysql-host=10.154.68.52 --mysql-port=3306 --oltp-test-mode=complex --oltp-read-only=off --report-interval=3 --max-time=300 --max-requests=0 --db-driver=mysql run

3) execute SHOW PROCESSLIST and SHOW SLAVE STATUS\G
[4 Apr 2017 1:49] Bogdan Kecman
Hi,

I'm not able to reproduce this issues in any of the environments I have.

You say that both MySQL5.7.17 & MySQL5.7.16 on centos 7.0.1406 with 3.10.0-123.el7.x86_64 kernel is making problem while on 7.2.1511 and 3.10.0-327.el7.x86_64 you don't have a problem?

I can't reproduce this neither on 7.0 nor 7.2 centos.

What binaries are you using? The ones from Oracle YUM repository or downloaded from oracle or from some centos repo? I don't think this is a MySQL bug but it have something to do with your os.

best regards
Bogdan
[14 Nov 2018 9:51] kfpanda kf
We encountered this problem several times.

It is found that there is a relationship between slave_preserve_commit_order and ON on the repository.
[14 Nov 2018 9:51] kfpanda kf
gdb stack

Attachment: gdb.stack (application/octet-stream, text), 91.23 KiB.

[14 Nov 2018 9:52] kfpanda kf
gdb stack

Attachment: gdb.stack (application/octet-stream, text), 91.23 KiB.

[14 Nov 2018 9:55] kfpanda kf
It is found that this problem is related to setting up slave_preserve_commit_order=ON on the slave.
[19 Nov 2018 9:20] Bogdan Kecman
Hi,

Still does not help us reproduce the problem :(

Are you in position to upload binary logs and relay logs along with mysql logs from the instance you reproduced the problem?