Description:
We found that in some situations the purge thread suspends indefinitely, until a new update arrives.
srv_purge_coordinator_suspend:
    if (ret == OS_SYNC_TIME_EXCEEDED) {
        /* No new records added since wait started then simply
        wait for new records. The magic number 5000 is an
        approximation for the case where we have cached UNDO
        log records which prevent truncate of the UNDO
        segments. */
        if (rseg_history_len == trx_sys->rseg_history_len
            && trx_sys->rseg_history_len < 5000) {
            stop = true;
        }
    }
According to the comment above, cached UNDO log records are not removed from the rseg history list, and 5000 is an approximation of the number of cached UNDO records in that list.
This was true before MySQL 5.7.17, but the fix for Bug #24450908 (UNDO LOG EXISTS AFTER SLOW SHUTDOWN) removed cached UNDO from the rseg history list in MySQL 5.7.17, so this code and its comment are out of date.
If this branch is taken and stop is assigned true, the purge thread stays suspended until a new update arrives, even if the history list still contains records to purge.
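The condition described above can be modelled outside the server. The following is a minimal stand-alone sketch, not InnoDB code: the function name old_should_stop and the constant kCachedUndoApprox are hypothetical, and only the comparison itself is taken from the snippet quoted in this report.

```cpp
#include <cstdint>

// Hypothetical constant standing in for the "magic number" 5000
// mentioned in the original comment.
constexpr std::uint64_t kCachedUndoApprox = 5000;

// Hypothetical model of the pre-patch decision taken after the wait
// times out. Returns true when the coordinator would set stop = true.
bool old_should_stop(std::uint64_t len_at_wait_start,
                     std::uint64_t len_now) {
    // Stop if no records were added during the wait and the history
    // list is shorter than the cached-UNDO approximation.
    return len_at_wait_start == len_now && len_now < kCachedUndoApprox;
}
```

Under this model, an unchanged history length of, say, 100 makes old_should_stop return true, so the coordinator would stop even though 100 records remain in the history list.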
How to repeat:
Run the mtr test bugfix_purge_suspend_forever.test from repeat.patch.
Suggested fix:
Only set stop to true when trx_sys->rseg_history_len == 0.
--- a/storage/innobase/srv/srv0srv.cc
+++ b/storage/innobase/srv/srv0srv.cc
@@ -2733,17 +2733,11 @@ srv_purge_coordinator_suspend(
rw_lock_x_unlock(&purge_sys->latch);
if (ret == OS_SYNC_TIME_EXCEEDED) {
-
- /* No new records added since wait started then simply
- wait for new records. The magic number 5000 is an
- approximation for the case where we have cached UNDO
- log records which prevent truncate of the UNDO
- segments. */
-
- if (rseg_history_len == trx_sys->rseg_history_len
- && trx_sys->rseg_history_len < 5000) {
-
+ if (trx_sys->rseg_history_len == 0) {
stop = true;
+ } else {
+ os_event_wait_time_low(
+ slot->event, 100 * SRV_PURGE_MAX_TIMEOUT, sig_count);
}
}
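As a rough illustration of the decision the patch makes after a timed-out wait, here is a minimal stand-alone sketch. It is not InnoDB code: TimeoutAction and patched_on_timeout are hypothetical names, and the real patch calls os_event_wait_time_low in the non-empty case rather than returning a value.

```cpp
#include <cstdint>

// Hypothetical result type: what the coordinator does after the
// wait times out.
enum class TimeoutAction { Stop, WaitAgain };

// Hypothetical model of the patched decision: stop only when the
// history list is empty, otherwise do another timed wait.
TimeoutAction patched_on_timeout(std::uint64_t history_len) {
    return history_len == 0 ? TimeoutAction::Stop
                            : TimeoutAction::WaitAgain;
}
```

With this rule, a non-empty history list never leads to an indefinite suspend in the model; the coordinator waits for a bounded time and is then able to re-check.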