Description:
Consider the tablespace import procedure for two cases:
1) with cfg file:
Let's look at row_import_for_mysql():
...
err = row_import_read_cfg(table, trx->mysql_thd, cfg);

/* Check if the table column definitions match the contents
of the config file. */
if (err == DB_SUCCESS) {
    /* We have a schema file, try and match it with our
    data dictionary. */
    err = cfg.match_schema(trx->mysql_thd);

    /* Update index->page and SYS_INDEXES.PAGE_NO to match the
    B-tree root page numbers in the tablespace. Use the index
    name from the .cfg file to find match. */
    if (err == DB_SUCCESS) {
        cfg.set_root_by_name();
        autoinc = cfg.m_autoinc;
    }

    rw_lock_s_unlock_gen(dict_operation_lock, 0);

    DBUG_EXECUTE_IF("ib_import_set_index_root_failure",
                    err = DB_TOO_MANY_CONCURRENT_TRXS;);
} else if (cfg.m_missing) {
...
So row_import_read_cfg() reads the metadata from the .cfg file into the cfg object, while row_import::match_schema() checks that the loaded metadata is consistent with the in-memory table object. row_import::set_root_by_name() then sets the page and space numbers of the corresponding index objects in the table->indexes container. To map a loaded .cfg index descriptor to an internal index descriptor in table->indexes, the index names are used (see row_import::set_root_by_name()).
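To make the name-based mapping concrete, here is a minimal standalone sketch (the DictIndex/CfgIndex structures and the page numbers are illustrative stand-ins, not the real InnoDB types): each index descriptor loaded from the .cfg file is matched to the dictionary index with the same name, so the physical order of the root pages in the .ibd file does not matter.

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

/* Simplified stand-ins for the dictionary index and the .cfg index descriptor. */
struct DictIndex { std::string name; uint32_t page; };  /* entry of table->indexes */
struct CfgIndex  { std::string name; uint32_t page; };  /* loaded from the .cfg file */

/* Models the idea behind row_import::set_root_by_name(): match by index name,
   so the order of the .cfg entries is irrelevant. */
void set_root_by_name(std::vector<DictIndex>& dict, const std::vector<CfgIndex>& cfg)
{
    for (const CfgIndex& c : cfg) {
        for (DictIndex& d : dict) {
            if (d.name == c.name) {
                d.page = c.page;  /* the real code also updates SYS_INDEXES.PAGE_NO */
                break;
            }
        }
    }
}

int main()
{
    /* Dictionary order: PRIMARY, then UNIQUE, then non-unique. */
    std::vector<DictIndex> dict = {{"PRIMARY", 0}, {"uk_a_b", 0}, {"k_cde", 0}};
    /* .cfg order follows the root page order in the .ibd (illustrative page numbers). */
    std::vector<CfgIndex> cfg = {{"PRIMARY", 3}, {"k_cde", 4}, {"uk_a_b", 5}};

    set_root_by_name(dict, cfg);
    for (const DictIndex& d : dict)
        std::cout << d.name << " -> root page " << d.page << '\n';
    /* PRIMARY -> 3, uk_a_b -> 5, k_cde -> 4: correct despite the different order. */
}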
2) without cfg file:
row_import_for_mysql()
...
} else if (cfg.m_missing) {
    rw_lock_s_unlock_gen(dict_operation_lock, 0);

    /* We don't have a schema file, we will have to discover
    the index root pages from the .ibd file and skip the schema
    matching step. */
    ut_a(err == DB_FAIL);

    cfg.m_page_size.copy_from(univ_page_size);

    FetchIndexRootPages fetchIndexRootPages(table, trx);

    err = fil_tablespace_iterate(
        table, IO_BUFFER_SIZE(
            cfg.m_page_size.physical(),
            cfg.m_page_size.physical()),
        fetchIndexRootPages);

    if (err == DB_SUCCESS) {
        err = fetchIndexRootPages.build_row_import(&cfg);

        /* Update index->page and SYS_INDEXES.PAGE_NO
        to match the B-tree root page numbers in the
        tablespace. */
        if (err == DB_SUCCESS) {
            err = cfg.set_root_by_heuristic();
        }
    }

    space_flags = fetchIndexRootPages.get_space_flags();
} else {
...
fetchIndexRootPages gathers (space id, page number) pairs for the indexes in the attached tablespace, in the order in which the index root pages appear in the tablespace. The same order is preserved in the cfg object, since fetchIndexRootPages.build_row_import() simply fills the index descriptor array of the cfg object. Then row_import::set_root_by_heuristic() does the same thing as row_import::set_root_by_name() in (1), i.e. it sets the page and space numbers of the corresponding index objects in the table->indexes container.
The crucial difference is that row_import::set_root_by_heuristic() does not use index names to map the internal index descriptors in table->indexes to the cfg index descriptors; the order of the indexes is used instead.
So if, for example, the order of the indexes in the data dictionary is not the same as the order of the index root pages in the attached tablespace, the correspondence is wrong and the index objects in table->indexes end up with wrong root page numbers.
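Here is a minimal standalone sketch of that failure mode (again with illustrative stand-in structures and page numbers, not the real InnoDB code): the root pages are recorded in the order in which they appear in the .ibd file and then assigned to the dictionary indexes purely by position, so a different dictionary order silently produces a wrong mapping.

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct DictIndex { std::string name; uint32_t page; };  /* table->indexes order */

/* Models the idea behind row_import::set_root_by_heuristic(): the i-th root page
   found in the tablespace is assigned to the i-th dictionary index; names are ignored. */
void set_root_by_heuristic(std::vector<DictIndex>& dict,
                           const std::vector<uint32_t>& roots_in_ibd_order)
{
    for (size_t i = 0; i < dict.size() && i < roots_in_ibd_order.size(); ++i)
        dict[i].page = roots_in_ibd_order[i];
}

int main()
{
    /* Dictionary order of the destination table: PRIMARY, UNIQUE, non-unique
       (see sort_keys() below). */
    std::vector<DictIndex> dict = {{"PRIMARY", 0}, {"uk_a_b", 0}, {"k_cde", 0}};

    /* Root page order in the exported .ibd, where uk_a_b was added last by
       CREATE UNIQUE INDEX: PRIMARY(3), k_cde(4), uk_a_b(5) (illustrative numbers). */
    std::vector<uint32_t> roots = {3, 4, 5};

    set_root_by_heuristic(dict, roots);
    for (const DictIndex& d : dict)
        std::cout << d.name << " -> root page " << d.page << '\n';
    /* uk_a_b -> 4 and k_cde -> 5: both secondary indexes get the wrong root page. */
}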
3) How are dictionary indexes ordered during table creation?
Their initial order is the same as in the table definition, see mysql_execute_command():
case SQLCOM_CREATE_TABLE:
{
    ...
    Alter_info alter_info(lex->alter_info, thd->mem_root);
    ...
}
Then the index descriptors are sorted in mysql_prepare_create_table():
...
if (auto_increment > 0)
{
    my_message(ER_WRONG_AUTO_KEY, ER(ER_WRONG_AUTO_KEY), MYF(0));
    DBUG_RETURN(TRUE);
}

/* Sort keys in optimized order */
my_qsort((uchar*) *key_info_buffer, *key_count, sizeof(KEY),
         (qsort_cmp) sort_keys);

create_info->null_bits= null_fields;

/* Check fields. */
it.rewind();
...
If we look at the sort_keys() function comment, we see the following:
Sort keys according to the following properties, in decreasing order of
importance:
- PRIMARY KEY
- UNIQUE with all columns NOT NULL
- UNIQUE without partial segments
- UNIQUE
- without fulltext columns
- without virtual generated columns
So if a table has just been created and its indexes have never been altered, the order-based heuristic described in (2) works fine. But if we, for example, ALTER the table and add new indexes, say UNIQUE indexes after non-unique ones, then the order of the index descriptors in the table definition differs from the order of the index root pages in the attached tablespace, the heuristic maps them incorrectly, and the result is an error during tablespace import or wrong behavior when the indexes are used.
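A rough standalone model of that reordering follows (only the PRIMARY/UNIQUE/non-unique priority is modelled; the NOT NULL, partial-segment, fulltext and virtual-column refinements from the comment above are omitted, and the Key structure is a stand-in for the real KEY), using the key names from test case 1:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

/* Simplified stand-in for a key descriptor. */
struct Key { std::string name; bool primary; bool unique; };

/* Rough model of the sort_keys() priority: PRIMARY first, then UNIQUE, then the rest. */
static int rank(const Key& k)
{
    if (k.primary) return 0;
    if (k.unique)  return 1;
    return 2;
}

int main()
{
    /* Keys in "creation" order: primary key, a non-unique key, and a UNIQUE key
       that was created last (as in test case 1). */
    std::vector<Key> keys = {
        {"PRIMARY", true,  true},
        {"k_cde",   false, false},
        {"uk_a_b",  false, true},
    };

    std::stable_sort(keys.begin(), keys.end(),
                     [](const Key& a, const Key& b) { return rank(a) < rank(b); });

    for (const Key& k : keys)
        std::cout << k.name << '\n';
    /* Dictionary order after sorting: PRIMARY, uk_a_b, k_cde.
       Root page order in the exported .ibd (creation order): PRIMARY, k_cde, uk_a_b. */
}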
How to repeat:
There are two MTR test cases. They exercise the same root cause - a wrong root page number assigned to an index when the tablespace is imported without a .cfg file - but their results differ: the first test ends with a crash, while the second test returns wrong data when the corresponding index is used.
To run them, just copy them into "mysql-test/suite/innodb/t/".
===1.test===
--source include/have_innodb.inc
--source include/not_embedded.inc
let MYSQLD_DATADIR =`SELECT @@datadir`;
CREATE TABLE `t` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`a` bigint(20) NOT NULL,
`b` bigint(20) NOT NULL,
`c` bigint(20) NOT NULL,
`d` bigint(20) NOT NULL,
`e` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `k_cde` (`c`, `d`, `e`)
) ENGINE=InnoDB;
INSERT INTO `t` (`a`, `b`, `c`, `d`, `e`) VALUES (0, 0, 0, 0, 0);
CREATE UNIQUE INDEX `uk_a_b` ON `t` (`a`, `b`);
SHOW CREATE TABLE t;
FLUSH TABLES `t` FOR EXPORT;
perl;
do 'include/innodb-util.inc';
ib_backup_ibd_file("test", "t");
EOF
UNLOCK TABLES;
DROP TABLE t;
CREATE TABLE `t` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`a` bigint(20) NOT NULL,
`b` bigint(20) NOT NULL,
`c` bigint(20) NOT NULL,
`d` bigint(20) NOT NULL,
`e` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uk_a_b` (`a`,`b`),
KEY `k_cde` (`c`,`d`,`e`)
) ENGINE=InnoDB;
ALTER TABLE t DISCARD TABLESPACE;
perl;
do 'include/innodb-util.inc';
my $datadir = $ENV{'MYSQLD_DATADIR'};
my $tmpd = $ENV{'MYSQLTEST_VARDIR'} . "/tmp";
ib_restore_ibd_file($tmpd, $datadir, "test", "t");
EOF
ALTER TABLE t IMPORT TABLESPACE;
SELECT * FROM t WHERE a = 0 and b = 0;
DROP TABLE t;
======
===2.test===
--source include/have_innodb.inc
--source include/not_embedded.inc
let MYSQLD_DATADIR =`SELECT @@datadir`;
CREATE TABLE `t` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`a` bigint(20) NOT NULL,
`b` bigint(20) NOT NULL,
`c` bigint(20) NOT NULL,
`d` bigint(20) NOT NULL,
`e` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
KEY `k_de` (`d`, `e`)
) ENGINE=InnoDB;
INSERT INTO `t` (`a`, `b`, `c`, `d`, `e`) VALUES (0, 0, 0, 0, 0);
CREATE UNIQUE INDEX `uk_a_b_c` ON `t` (`a`, `b`, `c`);
SELECT * FROM `t`;
SHOW CREATE TABLE t;
FLUSH TABLES `t` FOR EXPORT;
perl;
do 'include/innodb-util.inc';
ib_backup_ibd_file("test", "t");
EOF
UNLOCK TABLES;
DROP TABLE t;
CREATE TABLE `t` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`a` bigint(20) NOT NULL,
`b` bigint(20) NOT NULL,
`c` bigint(20) NOT NULL,
`d` bigint(20) NOT NULL,
`e` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uk_a_b_c` (`a`, `b`, `c`),
KEY `k_de` (`d`,`e`)
) ENGINE=InnoDB;
ALTER TABLE t DISCARD TABLESPACE;
perl;
do 'include/innodb-util.inc';
my $datadir = $ENV{'MYSQLD_DATADIR'};
my $tmpd = $ENV{'MYSQLTEST_VARDIR'} . "/tmp";
ib_restore_ibd_file($tmpd, $datadir, "test", "t");
EOF
ALTER TABLE t IMPORT TABLESPACE;
SELECT * FROM t WHERE a = 0 and b = 0 and c = 0;
SELECT * FROM t;
DROP TABLE t;
======
Suggested fix:
Develop a new heuristic for mapping dictionary indexes to tablespace index root pages, or forbid importing a tablespace without a .cfg metadata file.