| Bug #84358 | Incorrect error reported in master server's log about slave's UUID | ||
|---|---|---|---|
| Submitted: | 28 Dec 2016 12:47 | Modified: | 5 Nov 2024 14:35 |
| Reporter: | Desmond Chew | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
| Version: | 5.7.17 | OS: | Red Hat |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | UUID, zombie dump | ||
[28 Dec 2016 12:47]
Desmond Chew
[29 Dec 2016 9:47]
MySQL Verification Team
Hello Desmond Chew, Thank you for the report. Observed this with 5.7.17 build. Thanks, Umesh
[29 Dec 2016 9:49]
MySQL Verification Team
Post Bug #72578 fix, a NOTE is added on master's error log file with the following information when log_warnings is greater than 1. "While initializing dump thread for slave with UUID <slave_uuid>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread." I agree with Simon's suggestion, better to simplify the note to avoid confusion here.
[29 Dec 2016 9:53]
MySQL Verification Team
-- 5.7.17
-- How to repeat
- Setup master/slave replication
rm -rf master
bin/mysqld --initialize-insecure --basedir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17 --datadir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/master -v
bin/mysqld --no-defaults --basedir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17 --datadir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/master --core-file --socket=/tmp/mysql.sock --port=3306 --log-error=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/master/log.err --log-bin=master-bin --server_id=1 2>&1 &
rm -rf slave
bin/mysqld --initialize-insecure --basedir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17 --datadir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/slave -v
bin/mysqld --no-defaults --basedir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17 --datadir=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/slave --core-file --socket=/tmp/mysql_slave.sock --port=3307 --log-error=/export/umesh/server/binaries/GABuilds/mysql-5.7.17/slave/log.err --log-bin=slave-bin --server_id=2 --skip_slave_start 2>&1 &
- master
- create repl user, note down position etc
- slave
- instruct slave to connect master
root@localhost [(none)]> CHANGE MASTER TO
-> MASTER_HOST='localhost',
-> MASTER_PORT=3306,
-> MASTER_USER='repl',
-> MASTER_PASSWORD='slavepass',
-> MASTER_LOG_FILE='master-bin.000001',
-> MASTER_LOG_POS=1220;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
root@localhost [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)
root@localhost [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: localhost
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 1220
Relay_Log_File: hod03-relay-bin.000002
Relay_Log_Pos: 321
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.
Master_TLS_Version:
1 row in set (0.00 sec)
root@localhost [(none)]> stop slave; start slave;
Query OK, 0 rows affected (0.01 sec)
-- note master's error log
[umshastr@hod03]~: tail -f /export/umesh/server/binaries/GABuilds/mysql-5.7.17/master/log.err
2016-12-29T09:38:29.340374Z 0 [Note] IPv6 is available.
2016-12-29T09:38:29.340391Z 0 [Note] - '::' resolves to '::';
2016-12-29T09:38:29.340404Z 0 [Note] Server socket created on IP: '::'.
2016-12-29T09:38:29.351580Z 0 [Note] Event Scheduler: Loaded 0 events
2016-12-29T09:38:29.351682Z 0 [Note] Executing 'SELECT * FROM INFORMATION_SCHEMA.TABLES;' to get a list of tables using the deprecated partition engine. You may use the startup option '--disable-partition-engine-check' to skip this check.
2016-12-29T09:38:29.351690Z 0 [Note] Beginning of list of non-natively partitioned tables
2016-12-29T09:38:29.363315Z 0 [Note] End of list of non-natively partitioned tables
2016-12-29T09:38:29.363409Z 0 [Note] bin/mysqld: ready for connections.
Version: '5.7.17-log' socket: '/tmp/mysql.sock' port: 3306 MySQL Community Server (GPL)
2016-12-29T09:40:13.378235Z 4 [Note] Start binlog_dump to master_thread_id(4) slave_server(2), pos(master-bin.000001, 1220)
2016-12-29T09:40:57.221527Z 5 [Note] While initializing dump thread for slave with UUID <93071309-cdaa-11e6-bbbd-0010e05f3e06>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(4).
2016-12-29T09:40:57.221596Z 5 [Note] Start binlog_dump to master_thread_id(5) slave_server(2), pos(master-bin.000001, 1220)
[29 Dec 2016 9:58]
MySQL Verification Team
Bug #79771 marked as duplicate of this
[2 Jan 2017 12:12]
Sven Sandberg
Posted by developer: This is expected behavior. Asynchronous replication does not communicate from slave to master (other than the initial handshake). In particular, when you issue STOP SLAVE on the slave, the slave just stops listening to the master without notifying the master. The master's dump thread will keep sending events that are not be received by any slave. Internally, such threads are called 'zombie dump threads'. Zombie dump threads are automatically killed in two cases: 1. When net_write_timeout seconds have elapsed since the slave disconnected, and the zombie dump thread tries to send another event, the zombie thread will kill itself. 2. When a slave connects using the same server_uuid (which means the same slave connects, unless there is a misconfiguration), then that new connection will kill the existing zombie dump thread. So the 'Note' that you are seeing in the log informs that the second of these expected cases has happened. So there is no bug in mysqld or in your application and nothing to worry about. The 'Note' was introduced in BUG#18731211 (5.6.21). However, I think the message could be a bit more helpful: 1. "Zombie dump thread" is probably just an internal term and should be avoided. 2. It should be more clear from the text (and not just the severity level) that this is a normal, expected condition. 3. It should be clear from the text that this probably just means that the same slave reconnects. Also, maybe we should document this in the manual (maybe under START SLAVE unless there is a better place).
[12 Jan 2017 13:01]
Sven Sandberg
Posted by developer: See also comment from Aug 2016 at BUG#72578.
[13 Jun 2017 8:09]
MySQL Verification Team
Bug #86676 marked as duplicate of this one
[5 Nov 2024 14:35]
Jon Stephens
Documented fix as follows in the MySQL 8.0.41, 8.4.4, and 9.2.0 changelogs:
The log message written when a replica was stopped and restarted
(for example, by issuing STOP REPLICA followed by START REPLICA)
-While initializing dump thread for replica with UUID /uuid/,
found a zombie dump thread with the same UUID. Source is killing
the zombie dump thread(/thread_id/)- has been improved to -Upon
reconnection with the replica, while initializing the dump
thread for UUID /uuid/, an existing dump thread with the same
UUID was detected. The source is terminating the previous dump
thread (/thread_id/), which is normal and expected-.
Closed.
