Description:
InnoDB partition table auto-increment behavior is inconsistent.
From Bug#105317, we can see that the partition table has a problem with the auto-increment processing. Here I will give another example:
CREATE TABLE t1 (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100)) ENGINE=InnoDB
PARTITION BY RANGE(id) PARTITIONS 3
( PARTITION p0 VALUES LESS THAN (100),
PARTITION p1 VALUES LESS THAN (200),
PARTITION p2 VALUES LESS THAN (300));
INSERT INTO t1 VALUES (99, "AAA"), (199, "BBB"), (299, "CCC");
ALTER TABLE t1 DROP PARTITION p2;
SHOW CREATE TABLE t1;
--error ER_NO_PARTITION_FOR_GIVEN_VALUE
INSERT INTO t1(name) VALUES ("DDD");
DROP TABLE t1;
SQL that does not explicitly specify an auto-increment value will fail to insert, because the auto-increment value is not cleared when the partition is dropped.
In fact, I think the setting of auto-increment in InnoDB partition table is very strange. There are several places in the partition table that store auto-increment:
1. DD::Table, set by dd_set_autoinc
2. dict_table_t, before DDL, each partition (dict_table_t) maintains its own auto-increment, set by ha_innopart::save_auto_increment, which is the cause of instability mentioned in bug#105317
3. Partitin_share->next_auto_inc_val, when initialized, this value traverses each dict_table_t and selects the maximum value. After DDL, this value is not corrected
OK, assuming DDL occurs, the auto-increment of the new DD::Table will use Partitin_share->next_auto_inc_val (ha_innopart::commit_inplace_alter_partition).
Then dict_table_t is closed and reopen. When we reopen dict_table_t, we find that the auto-increment values of all dict_table_t are the values saved in DD, which means that the auto-increment values of all dict_table_ts are consistent after the table is opened. However, when the data of each partition is modified, the auto-increment values of each partition(dict_table_t) will be inconsistent due to the call of ha_innopart::save_auto_increment.
This raises a question: Do we want the auto-increment values of dict_table_t to be consistent or inconsistent?
How to repeat:
This is a code-based analysis
Suggested fix:
case 1 - For partitioned tables, the auto-increment values of all dict_table_t are always consistent. Currently, ha_innopart::save_auto_increment only updates the dict_table_t involved.
The modification method is very simple. We traverse all dict_table_t and update them. In this way, the auto-increment values of all dict_table_t are consistent after the table is opened or written. This is the "consistent" solution.
case 2 - For partitioned tables, the auto-increment value is maintained at the DD::partition level. The auto-increment value of each dict_table_t is associated with DD::partition. In this way, each dict_table_t is more isolated and the exchange operation is more optimized. Of course, this depends on the modification of DD and the set/get of the auto-increment value. This is the "separation" solution.
In any case, I think it is important to have consistent behavior. Sometimes it is consistent and sometimes it is inconsistent, which will be confusing.