Bug #94403 | Question about the lwm_lsn and dpa_lsn in log_get_available_for_checkpoint_lsn | ||
---|---|---|---|
Submitted: | 20 Feb 2019 6:30 | Modified: | 26 Feb 2019 14:18 |
Reporter: | Zongzhi Chen (OCA) | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S3 (Non-critical) |
Version: | 8.0.14 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[20 Feb 2019 6:30]
Zongzhi Chen
[25 Feb 2019 11:52]
Pawel Olchawa
Hello, first of all, I would like to say that it is very good analysis! You are right in principle - we could simply assign lsn to the lwm_lsn in that branch and it would be correct. However there is a reason why it's written that way (doing extra computation which *almost* always evaluates to the lwm_lsn). Let me try to explain that below. As you noticed, the lwm_lsn promises itself that it is smaller or equal to the recent_closed's tail (that's guaranteed because before we add pages to flush lists, we wait for space in the recent_closed). However, the assertion you added there (for a test), could fail (!). That's because load of the dpa_lsn and load of the lwm_lsn are not a single atomic operation. After the dpa_lsn is loaded, new pages could be added and the recent_closed could become advanced before the lwm_lsn is loaded. The dpa_lsn would be then outdated. If we swapped order of these two loads, the assertion would then always hold. However, for the case of empty flush lists (lwm_lsn == 0), we need to have the dpa_lsn from *before* the load of the lwm_lsn. This is extremely important, because otherwise following could happen (if we swapped order of the loads): 1. After we detected the lwm_lsn is zero, but before we loaded the dpa_lsn: - new dirty pages could be added, - recent_closed could become advanced up to the current lsn. 2. Then we would read the dpa_lsn (equal to the current_lsn) and write checkpoint at such lsn - bypassing the new dirty pages which were added to flush lists. This explains why we must not swap order of these two loads. However, still following options are correct: 1. The new assertion could be based on yet another load of the recent_closed's tail (after load of the lwm_lsn). 2. We could skip the new assertion and skip the computation there - just assign the lwm_lsn. 3. Current solution. Note that the dpa_lsn could be smaller than the lwm_lsn if it is outdated, because of the order of these two loads. Now, if we decided for solution 2 (which you suggest AFAIU), then we would introduce kind of "optimization" for unlikely scenario with the outdated dpa_lsn. It would be correct, but the result of function would no longer be a consistent decision made according to the two values we loaded there: the dpa_lsn and the lwm_lsn. In case lwm_lsn == 0 - it would stay consistent. In case lwn_lsn != 0, it would ignore the fact that possibly the dpa_lsn is much smaller (very unlikely as the thread would have to be scheduled out for longer between these two loads, and the loads happen one after another in the code). The consistency I mentioned here, is no longer important as soon as we exit the function. However I believe it provides some safety for any future changes to that function. I believe it's better to stay with the current approach. However one could consider the difference as a matter of taste. I think the comment could be enhanced there to cover what I described here (stressing the fact that the two loads are not atomic operation). I am open to suggestions and discussion here. Please let me know what you think. Thanks for the very good and deep analysis of the code!
[25 Feb 2019 18:00]
Zongzhi Chen
Thank you for you detailed explain. It make sense now. We had discussed that the load of two lsn won't be atomic. However, we hadn't find the corner case..
[26 Feb 2019 14:18]
MySQL Verification Team
Hi Chen, I understood that you are happy with the explanation provided. Hence, I have changed the status of this report. If you have any other related questions, or find other problems stemming from those watermarks (which is how I call them), please do let us know. Thank you for your report.