Description:
The innodb_log_checkpoint_now implementation does not account for the carried-over-checkpoint MLOG_FILE_NAME / MLOG_FILE_RENAME2 records introduced by WL#7142. As a result, it can be put into an infinite loop, for example by stopping an online ALTER thread while log_sys->append_on_checkpoint != NULL.
How to repeat:
diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc
index 12abfa5..01b91e9 100644
--- a/storage/innobase/handler/handler0alter.cc
+++ b/storage/innobase/handler/handler0alter.cc
@@ -6829,6 +6829,8 @@ commit_cache_rebuild(
ctx->old_table, ctx->tmp_name, FALSE);
ut_a(error == DB_SUCCESS);
+ DEBUG_SYNC_C("commit_cache_rebuild_middle");
+
error = dict_table_rename_in_cache(
ctx->new_table, old_name, FALSE);
ut_a(error == DB_SUCCESS);
--source include/have_debug.inc
--source include/have_debug_sync.inc
--source include/have_innodb.inc
CREATE TABLE t1 (x INT NOT NULL UNIQUE KEY) ENGINE=InnoDB;
INSERT INTO t1 VALUES(5);
SET @@GLOBAL.innodb_log_checkpoint_now=TRUE;
# Start an ALTER TABLE and stop it after the table -> temp table rename
--connect (con2,localhost,root,,)
--connection default
SET DEBUG_SYNC="commit_cache_rebuild_middle SIGNAL alter_table_ready WAIT_FOR finish_alter_table";
send ALTER TABLE t1 ADD PRIMARY KEY(x);
--connection con2
SET DEBUG_SYNC="now WAIT_FOR alter_table_ready";
SET @@GLOBAL.innodb_log_checkpoint_now=TRUE;
SET DEBUG_SYNC="now SIGNAL finish_alter_table";
Apply the patch above and run the test case. Wait until the server gets stuck on the second SET @@GLOBAL.innodb_log_checkpoint_now=TRUE statement, then attach a debugger and go to the checkpoint_now_set frame in the thread processing that statement. Both log_sys->lsn and log_sys->last_checkpoint_lsn keep increasing, and the loop termination condition is never satisfied.
Suggested fix:
Not sure. Perhaps save the current LSN at the start of checkpoint_now_set and loop only until last_checkpoint_lsn reaches that saved value?
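To illustrate the idea, here is a toy model of the two loop conditions. This is not InnoDB code: ToyLog, toy_checkpoint, the 16-byte record size, and the max_iters guard are all made up for the sketch, and the real checkpoint path is more involved. It assumes only what is observed above, namely that every checkpoint made while append_on_checkpoint is set also advances the LSN.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the redo log state; NOT InnoDB's log_t.
struct ToyLog {
    std::uint64_t lsn = 1000;
    std::uint64_t last_checkpoint_lsn = 900;
    // Models log_sys->append_on_checkpoint != NULL.
    bool append_on_checkpoint = false;
};

// Toy checkpoint: the checkpoint catches up with the LSN seen on entry,
// but carried-over file name records (assumed 16 bytes here) are then
// appended, so the live LSN moves ahead again.
void toy_checkpoint(ToyLog& log) {
    assert(log.last_checkpoint_lsn <= log.lsn);
    log.last_checkpoint_lsn = log.lsn;
    if (log.append_on_checkpoint) {
        log.lsn += 16;
    }
}

// Loop that chases the live LSN, as in the behavior described in this
// report. max_iters exists only so the sketch terminates.
// Returns the number of checkpoints made, or -1 if it gave up.
int checkpoint_until_live_lsn(ToyLog& log, int max_iters) {
    int n = 0;
    while (log.last_checkpoint_lsn < log.lsn) {
        if (n == max_iters) {
            return -1;
        }
        toy_checkpoint(log);
        ++n;
    }
    return n;
}

// Suggested variant: remember the LSN on entry and stop once the
// checkpoint has reached that saved value.
int checkpoint_until_saved_lsn(ToyLog& log, int max_iters) {
    const std::uint64_t target_lsn = log.lsn;
    int n = 0;
    while (log.last_checkpoint_lsn < target_lsn) {
        if (n == max_iters) {
            return -1;
        }
        toy_checkpoint(log);
        ++n;
    }
    return n;
}
```

In this model, with append_on_checkpoint set, checkpoint_until_live_lsn gives up at the iteration cap, while checkpoint_until_saved_lsn finishes after one checkpoint. Whether a saved target is sufficient in the real code (for example, with respect to what must be flushed before returning) would still need to be checked.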