Bug #60776 | InnoDB Max Purge Lag setting is wrongly designed | ||
---|---|---|---|
Submitted: | 6 Apr 2011 11:42 | Modified: | 1 May 2013 20:09 |
Reporter: | Dimitri KRAVTCHUK | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB Plugin storage engine | Severity: | S2 (Serious) |
Version: | any | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | innodb, innodb_max_purge_lag |
[6 Apr 2011 11:42]
Dimitri KRAVTCHUK
[12 Apr 2011 2:16]
James Day
It's worth remembering why we have this setting: it's to prevent the server from filling the hard drive, not to throttle normal use. The setting should normally be at a high anti-disaster value, not one that is ever reached during normal activities, including normal backups. For normal work it's the regular purging that should handle things, not this. The rest of the discussion covers normal use, not anti-disaster use. That is, it covers use of this setting for tasks it wasn't intended for.

I agree that the maximum delay is too high. From a support perspective, what we get is someone with a large lag, and we can't tell them to set the max purge lag to the final target value immediately or it will effectively kill their server. Instead we have to tell them to reduce it gradually. It would be more helpful if the delay started at a low value and increased gradually over time whenever the initial delay doesn't produce a decrease in the lag.

It's also worth considering what happens during backups: the lag can increase to a value in the range of a few billion during the backup because of disk I/O contention lasting for many hours. It's OK to change the maximum purge lag to a higher value for a backup, but then you again need to be very careful when reducing it, lest you freeze your server. And this has to be done while nobody is watching the server, because backups are usually unattended. There's no problem with having a high purge lag during a backup.

However, consider a server that does all of its work in large batch transactions that run for many hours. It may really be necessary to insert large delays between those because of the large amount of work that each does.

This causes me to think that we need two different things:
1. The existing disaster prevention setting, innodb_max_purge_lag.
2. Purging that is able to use smaller throttling, but only if it has been behind for a long time and isn't already catching up, lest it throttle due to normal daily variations like backup.
Both purging and flushing need to be able to throttle foreground threads because it's possible to submit more work to a cached data set than the underlying durable storage can keep up with. But that throttling needs to ramp up gradually, not start with a cliff.
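The "ramp up gradually, not start with a cliff" idea can be sketched roughly as follows. This is only an illustration of the proposal, not InnoDB code: the class name, the parameters (`threshold`, `step_us`, `max_delay_us`, `patience`) and the sampling model are all hypothetical. The delay grows by one small step per sample, but only after the lag has stayed above the threshold without shrinking for `patience` consecutive samples, and it is held steady while the lag is already falling.

```python
class RampingThrottle:
    """Illustrative throttle that raises a DML delay in small steps.

    All names and defaults are hypothetical; nothing here is an InnoDB setting.
    """

    def __init__(self, threshold, step_us=100, max_delay_us=10_000, patience=3):
        self.threshold = threshold        # lag above this counts as "behind"
        self.step_us = step_us            # delay increment per escalation
        self.max_delay_us = max_delay_us  # hard cap on the delay
        self.patience = patience          # samples to wait before escalating
        self.delay_us = 0
        self._prev_lag = None
        self._stalled = 0                 # consecutive samples behind and not improving

    def observe(self, lag):
        """Feed one periodic lag sample and return the delay to apply."""
        if lag <= self.threshold:
            # Back under the threshold: drop the throttle entirely.
            self.delay_us = 0
            self._stalled = 0
        elif self._prev_lag is not None and lag < self._prev_lag:
            # Still behind, but catching up: hold the current delay.
            self._stalled = 0
        else:
            # Behind and not improving: escalate once patience runs out.
            self._stalled += 1
            if self._stalled >= self.patience:
                self.delay_us = min(self.delay_us + self.step_us,
                                    self.max_delay_us)
        self._prev_lag = lag
        return self.delay_us
```

With `patience` set high enough to outlast a backup window, a temporary spike in lag would never trigger throttling, which is the distinction drawn in point 2 above.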
[1 May 2013 20:09]
Bugs System
Added a changelog entry for 5.6.5: "The "innodb_max_purge_lag" variable controls how to delay DML operations when purge operations are lagging. Previously, if an old consistent read view was detected, DML operations would not be delayed even though the purge lag exceeded the "innodb_max_purge_lag" setting. Additionally, if the "innodb_max_purge_lag" setting was used, situations could arise in which the DML delay time would continue to increase but not be applied right away due to the presence of an old consistent read view. This could result in a lengthy DML delay when the accumulated DML delay time is eventually applied. This fix caps the DML delay at a maximum value, removes the consistent read check, and revises the DML delay calculation."
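A capped delay calculation of the kind the changelog entry describes might look like the sketch below. The formula and its constants (the 0.5 offset and the 10000 multiplier, giving microseconds) are assumptions modeled on how the innodb_max_purge_lag delay is commonly documented, not text quoted from the fix, and `max_delay_us` is a hypothetical stand-in for the cap. Note that there is no check for an old consistent read view: the delay depends only on the current purge lag.

```python
def dml_delay_us(history_len, max_purge_lag, max_delay_us):
    """Return an illustrative per-statement DML delay in microseconds.

    history_len   -- current purge lag (length of the history list)
    max_purge_lag -- threshold above which DML is delayed; 0 disables it
    max_delay_us  -- hypothetical upper bound on the delay
    """
    if max_purge_lag <= 0:
        return 0  # throttling disabled

    ratio = history_len / max_purge_lag
    if ratio <= 1.0:
        return 0  # purge is keeping up, no delay

    # Assumed formula: at least 5000 us once the threshold is crossed,
    # growing linearly with the lag ratio.
    delay = int((ratio - 0.5) * 10000)

    # The cap keeps a huge lag from turning into an effectively frozen server.
    return min(delay, max_delay_us)
```

For example, with a threshold of 1000 and a cap of 50000 microseconds, a history length of 2000 yields a 15000 microsecond delay, while a history length in the billions still yields only 50000.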