Bug #33602 | Relay log corruption when relay_log_space_limit reached | ||
---|---|---|---|
Submitted: | 31 Dec 2007 18:03 | Modified: | 7 Oct 2008 11:43 |
Reporter: | Baron Schwartz (Basic Quality Contributor) | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.0.40 | OS: | Linux (gentoo) |
Assigned to: | | CPU Architecture: | Any |
Tags: | qc |
[31 Dec 2007 18:03]
Baron Schwartz
[2 Jan 2008 16:26]
MySQL Verification Team
Baron, is the corruption always at the end of the relay log? If yes, I suspect bug #26489 is related.
[2 Jan 2008 17:16]
Baron Schwartz
I saw that report too, but I thought this was probably a different bug: the corruption is in the middle of the log, and it's not binary-looking garbage data that looks like an uninitialized buffer. Instead, it always looks like two log events got mixed together.

My current thought is that this might be a related problem, though. I can't understand the logging code well enough to make anything but a wild guess, but would the following be possible?

* I/O thread reads an event, tries to log it, and finds there's not enough space
* Perhaps it writes part of an event before finding not enough space?

Or maybe something like this:

* I/O thread stops, sleeps and wakes, then tries to write the event again
* Perhaps it has a buffer initialized to the wrong size, and filled with the last event's data? If last event is bigger than the current event, and it copies the current event into the buffer, then part of last event's SQL would appear to be mixed with the current event's SQL.

I tested my theory about relay_log_space yesterday by setting up a system with a very small relay_log_space and binlog size, so it rotated relay logs constantly, and then just threw a bunch of queries at it. I couldn't get it to corrupt :-( but I was also using unrealistic test queries.
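
The stale-buffer guess above can be illustrated with a small Python sketch. It is only a toy model of the hypothesis, not MySQL's relay log code: the function names `write_event_stale` and `write_event_correct` and the two sample statements are made up for illustration. It shows how writing out a reused buffer at its old length would leave the tail of the previous event attached to the current one.

```python
def write_event_stale(buffer: bytearray, event: bytes) -> bytes:
    """Hypothetical buggy writer: copies the event into a reused buffer
    but emits the whole buffer, whose size still matches the last event."""
    buffer[:len(event)] = event
    return bytes(buffer)


def write_event_correct(buffer: bytearray, event: bytes) -> bytes:
    """Same copy, but only the bytes belonging to the current event are emitted."""
    buffer[:len(event)] = event
    return bytes(buffer[:len(event)])


# Made-up sample events; the previous one is longer than the current one.
previous = b"UPDATE accounts SET balance = 10 WHERE id = 7"
current = b"DELETE FROM t WHERE id = 1"

# The buffer is still sized for, and filled with, the previous event.
mixed = write_event_stale(bytearray(previous), current)
clean = write_event_correct(bytearray(previous), current)

print(mixed)  # current event followed by the leftover tail of the previous one
print(clean)  # just the current event
```

If something like this were happening, the corrupted entry would start as a valid event and end with a fragment of an earlier, longer statement, which is consistent with the "two log events got mixed together" symptom described above.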
[18 Feb 2008 4:49]
Valeriy Kravchuk
Please check whether a problem like this still happens on a newer version, 5.0.54a.
[18 Feb 2008 12:37]
Baron Schwartz
I have removed relay_log_space_limit from my configuration files.
[4 Apr 2008 12:04]
Robert Schmidt
I think this is a problem with net_write_timeout on the master. I had this problem and the corruption would always happen when relay_log_space_limit was reached. I increased net_write_timeout on the master to 3600 and the problem went away. It took me a while to track this down. I haven't tried it on newer versions of the server, but the client is 5.0.45.
[7 Oct 2008 11:43]
Susanne Ebrecht
Many thanks for the feedback. I will close the bug report now. Please feel free to re-open it if you still have this problem after increasing net_write_timeout and while using a current version of MySQL 5.0 or 5.1 (at the moment, the current versions are 5.0.67 and 5.1.28-rc).