Bug #102538 Master dump thread stuck in 'Waiting to finalize termination' state
Submitted: 9 Feb 2021 4:03
Modified: 10 Feb 2021 17:49
Reporter: HULONG CUI
Status: Can't repeat
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.7.22
OS: Any
Assigned to: MySQL Verification Team
CPU Architecture: Any

[9 Feb 2021 4:03] HULONG CUI
Description:
1) I have one master and two slaves with semi-sync replication.
my.cnf settings:
# semi sync replication settings #
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 3000
rpl_semi_sync_slave_enabled = 1

# Multi-Threaded replication settings 
slave-parallel-type = LOGICAL_CLOCK
slave-parallel-workers = 16
slave_preserve_commit_order=1

2) After finding a session stuck in "Waiting to finalize termination" for 4 hours, I killed it,
but the kill had no effect:
mysql> show processlist;
+----+-----------------+------------+------+---------+-------+---------------------------------+------------------+
| Id | User            | Host       | db   | Command | Time  | State                           | Info             |
+----+-----------------+------------+------+---------+-------+---------------------------------+------------------+
|  5 | event_scheduler | localhost  | NULL | Daemon  |     7 | Waiting on empty queue          | NULL             |
|  8 | root            | localhost  | NULL | Query   |     0 | init                            | show processlist |
|  9 | repl            | ****:50796 | NULL | Killed  | 89158 | Waiting to finalize termination |                  |
+----+-----------------+------------+------+---------+-------+---------------------------------+------------------+

Other sessions:

+-----------+------+------------+------+----------------+-------+---------------------------------+------+
| Id        | User | Host       | db   | Command        | Time  | State                           | Info |
+-----------+------+------------+------+----------------+-------+---------------------------------+------+
| 177320898 | repl | ****:55601 | NULL | Register Slave | 57259 | Waiting to finalize termination |      |
| 177321407 | repl | ****:55618 | NULL | Register Slave | 57019 | Waiting to finalize termination |      |
| 177323458 | repl | ****:55638 | NULL | Register Slave | 56899 | Waiting to finalize termination |      |
+-----------+------+------------+------+----------------+-------+---------------------------------+------+

3) Error log:
2021-02-08T21:50:24.787405+08:00 176133478 [Note] Stop asynchronous binlog_dump to slave (server_id: 958840896)
2021-02-08T21:51:51.457171+08:00 177391627 [Warning] Timeout waiting for reply of binlog (file: binlog.000670, pos: 896622582), semi-sync up to file binlog.000670, position 506357371.
2021-02-08T21:51:51.457211+08:00 177391627 [Note] Semi-sync replication switched OFF.
2021-02-08T21:51:51.766137+08:00 0 [Note] Semi-sync replication switched ON at (binlog.000670, 896703231)
2021-02-08T21:53:27.184563+08:00 177391627 [Warning] Timeout waiting for reply of binlog (file: binlog.000670, pos: 3851319630), semi-sync up to file binlog.000670, position 913745638.
2021-02-08T21:53:27.184620+08:00 177391627 [Note] Semi-sync replication switched OFF.
2021-02-08T21:53:59.880237+08:00 0 [Note] Semi-sync replication switched ON at (binlog.000671, 24415)
2021-02-08T21:54:28.099794+08:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 11068ms. The settings might not be optimal. (flushed=3750 and evicted=0, during the time.)
2021-02-08T21:56:57.579557+08:00 177322705 [Note] Aborted connection 177322705 to db: 'ybwp' user: 'pro_ybwp' host: '10.5.79.158' (Got timeout reading communication packets)
2021-02-08T21:57:03.755578+08:00 177392956 [Warning] Timeout waiting for reply of binlog (file: binlog.000671, pos: 5221148259), semi-sync up to file binlog.000671, position 93696.
2021-02-08T21:57:03.755612+08:00 177392956 [Note] Semi-sync replication switched OFF.
2021-02-08T21:57:59.962680+08:00 0 [Note] Semi-sync replication switched ON at (binlog.000672, 26732)
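
To see how far behind the acknowledged position was at each timeout, the warnings above can be summarized with a small script. This is only a sketch: the regular expression is modeled on the warning lines in this excerpt, and the function name semi_sync_timeouts is made up for illustration.

```python
import re

# Pattern modeled on the "Timeout waiting for reply of binlog" warnings
# in the error log excerpt above (assumed format, not an official one).
TIMEOUT_RE = re.compile(
    r"Timeout waiting for reply of binlog "
    r"\(file: (?P<wait_file>[^,\s]+), pos: (?P<wait_pos>\d+)\), "
    r"semi-sync up to file (?P<ack_file>[^,\s]+), position (?P<ack_pos>\d+)\."
)


def semi_sync_timeouts(lines):
    """Yield (timestamp, wait_file, ack_file, lag_bytes) per timeout warning.

    lag_bytes is the distance between the position the master was waiting
    on and the last acknowledged position. It is None when the two
    positions are in different binlog files and cannot be subtracted.
    """
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue
        timestamp = line.split(" ", 1)[0]
        if match["wait_file"] == match["ack_file"]:
            lag = int(match["wait_pos"]) - int(match["ack_pos"])
        else:
            lag = None
        yield timestamp, match["wait_file"], match["ack_file"], lag
```

It can be fed an open error-log file, for example `for row in semi_sync_timeouts(open(path)): print(row)`, where path is wherever the server writes its error log.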

I searched the bug list for semi-sync replication issues, and this looks like:
https://bugs.mysql.com/bug.php?id=90857
https://bugs.mysql.com/bug.php?id=89370

Questions:
1) Is it the same issue?
2) Can it be resolved by restarting the service?

How to repeat:
Cannot repeat.
[9 Feb 2021 10:32] MySQL Verification Team
Hi,

This looks like a duplicate of #90857.

I was never able to reproduce #90857 using regular testing equipment. Are you using some cloud setup?

all best
Bogdan
[10 Feb 2021 4:08] HULONG CUI
The environment is VMware. This MySQL instance has been in use for 2 years. It's a recent problem.
[10 Feb 2021 17:49] MySQL Verification Team
Hi,

Well, for a start, you are using a rather old version of MySQL Server, so the first thing to do is, of course, to upgrade. As for the issue, you can read in the other bug report that we assume this is happening:

[quote]
(1) one of the dump threads begins to exit. 
(2) it gets stuck while exiting. 
(3) slave tries to reconnect, and the reconnect attempts pile up on master.

(1) is certainly happening because there are lots of messages about error reading communication packets, and eventually net_flush failed.
(3) happens because the new dump threads will wait for the old ones to finish.

So the issue is in (2) that might happen if you freeze all the IO (full snapshot in your cloud env?).
[/quote]

But we cannot reproduce this.
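
For anyone who wants to catch the pile-up described in steps (2) and (3) early, a minimal sketch of a check over SHOW PROCESSLIST rows follows. It assumes the rows have already been fetched as dicts keyed by the processlist column names; the function name stuck_sessions and the one-hour threshold are illustrative choices, not anything taken from the server.

```python
# State string as it appears in the processlist output in this report.
STUCK_STATE = "Waiting to finalize termination"


def stuck_sessions(rows, min_seconds=3600):
    """Return processlist rows stuck in STUCK_STATE for at least min_seconds.

    rows: iterable of dicts with 'Id', 'User', 'Time' and 'State' keys,
    as in SHOW PROCESSLIST (how they are fetched is left to the caller).
    The result is sorted with the longest-waiting session first.
    """
    stuck = []
    for row in rows:
        state = (row.get("State") or "").strip()
        elapsed = int(row.get("Time") or 0)
        if state == STUCK_STATE and elapsed >= min_seconds:
            stuck.append(row)
    return sorted(stuck, key=lambda r: int(r["Time"]), reverse=True)
```

If the assumption in the quote holds, the longest-waiting session would be the dump thread that got stuck while exiting, and the later ones would be the reconnect attempts queued behind it.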