Description:
ALTER TABLE ADD FOREIGN KEY on InnoDB table which is executed using INPLACE algorithm (which implies SET FOREIGN_KEY_CHECK = 0 mode) might sometimes fail with unwarranted error or even crash, when there is heavy concurrent DDL load.
Investigation of core files has shown that this happens when InnoDB's dict_table_t object for the parent table of Foreign Key added is evicted from its internal cache between prepare and commit phases of such in-place ALTER TABLE.
The below test case for MTR framework demonstrates the problem by crashing 1-2 times per each 10 runs on my machine. In the majority of remaining cases ALTER TABLE in question fails with unwarranted 1825 error (which is incorrect behavior and can be considered a bug as well) and succeeds in a few cases.
The stack trace printed in case of crash looks like:
...mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x59) [0x611b2b601397]
...mysqld(print_fatal_signal(int)+0x3a6) [0x611b29ff51f3]
...mysqld(handle_fatal_signal+0x79) [0x611b29ff5414]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7da315042520]
...mysqld(cmp_cols_are_equal(dict_col_t const*, dict_col_t const*, bool)+0x119) [0x611b2b8c7318]
...mysqld(dict_foreign_qualify_index(dict_table_t const*, char const**, char const**, unsigned long, dict_index_t const*, dict_index_t const*, bool, unsigned long)+0x198) [0x611b2bc075be]
...mysqld(dict_foreign_find_index(dict_table_t const*, char const**, char const**, unsigned long, dict_index_t const*, bool, unsigned long)+0x11e) [0x611b2bc02aed]
...mysqld(+0x51dc00b) [0x611b2b6ff00b]
...mysqld(commit_try_norebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, trx_t*, char const*)+0x20d) [0x611b2b71b985]
...mysqld(bool ha_innobase::commit_inplace_alter_table_impl<dd::Table>(TABLE*, Alter_inplace_info*, bool, dd::Table*)+0xc9e) [0x611b2b722c1e]
...mysqld(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool, dd::Table const*, dd::Table*)+0x1d0) [0x611b2b6f7058]
...mysqld(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool, dd::Table const*, dd::Table*)+0xdd) [0x611b2a1c2507]
...mysqld(+0x3965700) [0x611b29e88700]
...mysqld(mysql_alter_table(THD*, char const*, char const*, HA_CREATE_INFO*, Table_ref*, Alter_info*)+0x3cc2) [0x611b29e9509a]
...mysqld(Sql_cmd_alter_table::execute(THD*)+0x614) [0x611b2a54c440]
...mysqld(mysql_execute_command(THD*, bool)+0x5a34) [0x611b29d9a411]
...mysqld(dispatch_sql_command(THD*, Parser_state*)+0x7c0) [0x611b29d9c97a]
...mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1749) [0x611b29d91ca2]
...mysqld(do_command(THD*)+0x60f) [0x611b29d8fb0b]
...mysqld(+0x3abb78f) [0x611b29fde78f]
...mysqld(+0x5dc46be) [0x611b2c2e76be]
How to repeat:
# Test for MTR framework which shows the problem when run against server
# with debug sync support enabled (e.g. debug version of server).
--source include/have_debug_sync.inc
SET GLOBAL table_definition_cache=400;
SET GLOBAL table_open_cache= 1024;
CREATE TABLE parent (pk INT PRIMARY KEY);
CREATE TABLE child (fk INT);
SET FOREIGN_KEY_CHECKS = 0;
SET DEBUG_SYNC="alter_table_inplace_after_lock_downgrade SIGNAL reached WAIT_FOR go";
--send ALTER TABLE child ADD FOREIGN KEY (fk) REFERENCES parent (pk), ALGORITHM=INPLACE;
connect(con1,localhost,root,,);
SET DEBUG_SYNC="now WAIT_FOR reached";
select sleep(10);
--echo # Flush table, so that the table can be evicted by master thread.
FLUSH TABLE p;
--echo # Create 1000 tables, try the best to evict the table.
--disable_query_log
let $loop=1500;
while ($loop > 1) {
eval CREATE TABLE t_$loop (id int) ENGINE=innodb;
dec $loop;
}
--enable_query_log
SELECT SLEEP(60);
SET DEBUG_SYNC="now SIGNAL go";
--connection default
# In most cases we get error 1825 and not crash.
# The former is not so severe as crash but still is not a valid behavior.
# Occasionally, ALTER TABLE succeeds.
--error 0,1825
--reap
--connection con1
--disable_query_log
let $loop=1500;
while($loop > 1)
{
eval DROP TABLE t_$loop;
dec $loop;
}
--enable_query_log
--disconnect con1
--connection default
DROP TABLES parent, child;
SET GLOBAL table_definition_cache=default;
SET GLOBAL table_open_cache= default;
Suggested fix:
Fix InnoDB In-place ALTER TABLE code either pin dict_table_t objects to the internal cache until the commit phase of operation or ensure that commit phase gets fresh reference to parent table dict_table_t object instead of possible stale one.