Bug #16927 | Duplicate slave server-id gives no specific error message, poor relay log use | ||
---|---|---|---|
Submitted: | 31 Jan 2006 10:01 | Modified: | 3 Sep 2010 16:50 |
Reporter: | Bruce Dembecki | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S4 (Feature request) |
Version: | 5.0, 5.1 | OS: | Any |
Assigned to: | Libing Song | CPU Architecture: | Any |
[31 Jan 2006 10:01]
Bruce Dembecki
[1 Feb 2006 9:55]
Valeriy Kravchuk
Thank you for a problem report. Please, upload that 'binary-log.001012' file. Can you provide a small set of SQL statements to execute on master to repeat the behaviour you described each and every time?
[1 Feb 2006 19:30]
Bruce Dembecki
I can give you any combination of b inary and relay logs you may want... The relay logs tend to be small, a handfull of queries... here is a portion of the ls -l of where I stashed the relay logs to give you an idea. I can upload a binary log from before I rolled back to 5.0.15... how about one from the middle of the night, wee roll binaries every two hours, so the middle of the night ones are much smaller at about 10M each. This exceeds the file upload ability on the web interface, where would you like me to put it? FTP would be better, I'm offsite this week so would like to send it directly from the server, rather than routing it to my laptop first. Here is a small piece of that directory listing so you can see the typical file size of those relay logs: -rw-rw---- 1 mysql mysql 1580 Jan 30 23:31 data-admin-relay-bin.586501 -rw-rw---- 1 mysql mysql 2551 Jan 30 23:31 data-admin-relay-bin.586502 -rw-rw---- 1 mysql mysql 5154 Jan 30 23:31 data-admin-relay-bin.586503 -rw-rw---- 1 mysql mysql 1548 Jan 30 23:31 data-admin-relay-bin.586504 -rw-rw---- 1 mysql mysql 8699 Jan 30 23:31 data-admin-relay-bin.586505 -rw-rw---- 1 mysql mysql 710 Jan 30 23:31 data-admin-relay-bin.586506 -rw-rw---- 1 mysql mysql 3665 Jan 30 23:31 data-admin-relay-bin.586507 -rw-rw---- 1 mysql mysql 1846 Jan 30 23:31 data-admin-relay-bin.586508 -rw-rw---- 1 mysql mysql 7486 Jan 30 23:31 data-admin-relay-bin.586509 -rw-rw---- 1 mysql mysql 2624 Jan 30 23:31 data-admin-relay-bin.586510 -rw-rw---- 1 mysql mysql 1195 Jan 30 23:31 data-admin-relay-bin.586511 -rw-rw---- 1 mysql mysql 12683 Jan 30 23:31 data-admin-relay-bin.586512 -rw-rw---- 1 mysql mysql 6765 Jan 30 23:31 data-admin-relay-bin.586513 -rw-rw---- 1 mysql mysql 5477 Jan 30 23:32 data-admin-relay-bin.586514 -rw-rw---- 1 mysql mysql 1613 Jan 30 23:32 data-admin-relay-bin.586515 -rw-rw---- 1 mysql mysql 1373 Jan 30 23:32 data-admin-relay-bin.586516 -rw-rw---- 1 mysql mysql 10729 Jan 30 23:32 data-admin-relay-bin.586517 -rw-rw---- 1 mysql mysql 473 Jan 30 23:32 data-admin-relay-bin.586518 -rw-rw---- 1 mysql mysql 1159 Jan 30 23:32 data-admin-relay-bin.586519 -rw-rw---- 1 mysql mysql 5469 Jan 30 23:32 data-admin-relay-bin.586520 -rw-rw---- 1 mysql mysql 12258 Jan 30 23:32 data-admin-relay-bin.586521
[8 Feb 2006 8:22]
Bruce Dembecki
It appears when setting up an offsite slave I used a my.cnf file from another slave and there was multiple slaves with the same serverID... Apparently this behaviour is a known problem when there s a serverID conflict... I'll close this issue as a stupid user issue :-)
[9 Feb 2006 19:19]
James Day
Bruce, I've reopened this with the original 'Slave receives "end packet" from Master' title updated to reflect the underlying problem. This is a fairly common problem, usually for new setups. Three feature requests: 1. A more clear error message in the slave. Might require the master telling the slave being disconnected the server IP and/or name of the slave it is accepting the connection from just before closing the connection. Then the slave can see that those differ from its own and can report both that information from the master and that it differs from what the slave thinks it is called. 2. The master error log should be logging the disconnect/reconnect with the same server ID. But no more than once a minute per server-id/slave name pair, so the error log doesn't grow to a huge size. 3. Perhaps only create a new relay log file if it has been more than five seconds since the last disconnect from the master (ideally also only if it looks like a duplicate server-id issue)? I'm also adding a check for this to the desired feature list for a possible server monitoring tool.
[9 Feb 2006 23:59]
Lachlan Mulcahy
In addition/essentially: - We shouldn't allow the new connections since there's already an active server connection with that server id. - The error message and error handling should be better
[10 Feb 2006 0:35]
Bruce Dembecki
I do believe not allowing a new connection if there is already a slave with that serverID prevents both the problem and the Stupid User Error, assuming the error message covers the reason. I like that solution a lot (as the Stupid User in question I'm allowed to call it a Stupid User Error). :-)
[24 Feb 2010 2:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/101279 3352 Luis Soares 2010-02-24 BUG#16927: Duplicate slave server-id gives no specific error message, poor relay log use Sketch of a test case.
[24 Feb 2010 2:19]
Luis Soares
Setting back to verified. Previous patch was just a sample of a test case.
[22 Mar 2010 10:07]
Libing Song
This bug happens when killing a zombie thread. When a slave connects to its master, the master always kills the other slave thread which has the same server-id. Let us demonstrate it. Slave1 and Slave2 have the same server-id. 1. Slave1 connects to master. 2. Slave2 connects to master. Slave1 has the same server-id, so it is killed. 3. Slave1 tries to connect to master automatically. Slave1 connects successful. Slave2 is killed. 4. Slave2 tries to connect to master again. Slave2 connects successful. Slave1 is killed. 5. ...... ...... This bug is fixed by WL#4677. WL#4677 gives each server one UUID. it is generated automatically and can not be same between different severs. In the worklog, the rule of killing zombie slave thread is changed. When a slave connects to its master, the master only kills the slave thread which has the same UUID instead of the thread which has the same server-id.
[23 Mar 2010 15:59]
Paul DuBois
Resetting report to Need Merge. It cannot be set to Documenting until it specifies the version(s) into which the fix was pushed.
[29 Mar 2010 8:22]
Libing Song
WL#4677 is not pushed yet, so go back to verified.
[23 Jun 2010 2:42]
Libing Song
Fixed by the patch for WL#4677. The patch's revid: li-bing.song@sun.com-20100622014153-9jsvfdejgw78r4qe It has arrived mysql-next-mr-rpl-merge now and will arrive mysql-next-mr-bugfixing soon.
[3 Sep 2010 16:50]
Jon Stephens
This is being documented as part of docs work for WL#4766, so closing.