| Bug #64503 | mysql frequently ignores --relay-log-space-limit | ||
|---|---|---|---|
| Submitted: | 1 Mar 2012 5:27 | Modified: | 14 Mar 2012 8:44 |
| Reporter: | Mark Callaghan | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
| Version: | 5.1 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[1 Mar 2012 5:29]
Mark Callaghan
repro case
Attachment: rpl_trx_limit.test (application/octet-stream, text), 2.04 KiB.
[1 Mar 2012 5:29]
Mark Callaghan
repro case
Attachment: rpl_trx_limit-master.opt (application/octet-stream, text), 26 bytes.
[1 Mar 2012 5:29]
Mark Callaghan
repro case
Attachment: rpl_trx_limit-master.opt (application/octet-stream, text), 26 bytes.
[1 Mar 2012 5:29]
Mark Callaghan
repro case
Attachment: rpl_trx_limit-slave.opt (application/octet-stream, text), 104 bytes.
[1 Mar 2012 5:31]
Mark Callaghan
An example of the output in var/log/mysqld.2.err from the repro case. "total" should not exceed 1M next_event sleep total 4038688, ignore 1 next_event sleep total 4044657, ignore 1 next_event sleep total 4056897, ignore 1 next_event sleep total 4060569, ignore 1 next_event sleep total 4064649, ignore 1 next_event sleep total 4069545, ignore 1 next_event sleep total 4073625, ignore 1 next_event sleep total 4078113, ignore 1 next_event sleep total 4081785, ignore 1 next_event sleep total 4085865, ignore 1
[2 Mar 2012 2:09]
Mark Callaghan
Wow, who would have thought that a variable named "ignore_log_space_limit" causes the problem? I tested a simple hack: * change appendv() to return a variable indicating whether a new log file is opened * change queue_event() to set rli->ignore_log_space_limit=0 when appendv() created a new file With this change relay-log-space-limit is respected.
[6 Mar 2012 11:34]
MySQL Verification Team
http://mysqlha.blogspot.com/2012/03/mysql-can-ignore-relay-log-space-limit.html
[14 Mar 2012 8:44]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.
If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at
http://dev.mysql.com/doc/en/installing-source.html
[14 Mar 2012 8:45]
Jon Stephens
Fixed in 5.1+. Documented in the 5.1.63, 5.523, and 5.6.6 changelogs as follows:
The --relay-log-space-limit option was sometimes ignored.
More specifically, when the SQL thread went to sleep, it allowed
the I/O thread to queue additional events in such a way that the
relay log space limit was bypassed, and the number of events in
the queue could grow well past the point where the relay logs
needed to be rotated. Now in such cases, the SQL thread checks to
see whether the I/O thread should rotate and provide the SQL
thread a chance to purge the logs (thus freeing space).
Note that, when the SQL thread is in the middle of a
transaction, it cannot purge the logs; it can only ask for more
events until the transaction is complete. Once the transaction
is finished, the SQL thread can immediately instruct the I/O
thread to rotate.
Closed.

Description: There is a race that allows MySQL to frequently ignore --relay-log-space-limit. To reproduce this: 1) let the SQL slave thread catch up to the IO thread so that it sets rli->ignore_log_space_limit to 1 in next_event() 2) give the SQL thread a workload that it is slow to execute so that it doesn't set ignore_log_space_limit to zero. That is done when the SQL thread exits and when the SQL thread reaches EOF of a log file that is not hot (in use by the IO thread). When it reaches EOF in that case it would call purge_first_log which sets ignore_log_space_limit to 0. This reproduces the problem because the IO thread ignores --relay-log-space-limit when ignore_log_space_limit is set How to repeat: I will attach an MTR case. It is more likely to reproduce the problem when this is added to the start of next_event() if (rli->ignore_log_space_limit == 1) { fprintf(stderr, "next_event sleep total %d, ignore %d\n", (int) rli->log_space_total, (int) rli->ignore_log_space_limit); usleep(200000); } Then tail var/log/mysqld.2.err when running the attached mtr case. I reproduced this for unmodified 5.1.60 and with --relay-log-space-limit=1000000 there was more than 4MB of relay logs. Suggested fix: Don't know yet but the protocol for using/reading ignore_log_space_limit doesn't work.