Bug #116791 Contribution by Tencent: add partition rollback crash
Submitted: 27 Nov 3:39
Modified: 27 Nov 9:04
Reporter: Xiaodong Huang (OCA)
Status: Verified
Category: MySQL Server: Partitions
Severity: S3 (Non-critical)
Version: 8.0.40
OS: Any
Assigned to:
CPU Architecture: Any

[27 Nov 3:39] Xiaodong Huang
Description:
Currently, if a DDL on a partitioned table fails and is rolled back after the Online DDL commit phase, the rollback evicts all of that table's partitions from the dictionary cache. If another partitioned table in the same database has a name that begins with the current table's name, and it is also in the cache, its partitions are evicted as well, which may cause a crash.

Root Cause:

The issue can be seen in the following code:

// Evict from the cache every partition of the table called "name"
void dict_partitioned_table_remove_from_cache(const char *name) {
  ut_ad(mutex_own(&dict_sys->mutex));

  size_t name_len = strlen(name);

  // Walk every table currently held in the dictionary cache
  for (uint32_t i = 0; i < hash_get_n_cells(dict_sys->table_id_hash); ++i) {
    dict_table_t *table;

    table =
        static_cast<dict_table_t *>(HASH_GET_FIRST(dict_sys->table_hash, i));

    while (table != nullptr) {
      dict_table_t *prev_table = table;

      table = static_cast<dict_table_t *>(HASH_GET_NEXT(name_hash, prev_table));
      ut_ad(prev_table->magic_n == DICT_TABLE_MAGIC_N);

      if (prev_table->is_dd_table) {
        continue;
      }
      // This check is too loose because it is a plain prefix match. When name
      // is test/sbtest187, test/sbtest187_bak#p#p1#sp#sp2 also passes it and
      // is evicted, because the former is a prefix of the latter.
      if ((strncmp(name, prev_table->name.m_name, name_len) == 0) &&
          dict_table_is_partition(prev_table)) {
        btr_drop_ahi_for_table(prev_table);
        dict_table_remove_from_cache(prev_table);
      }
    }
  }
}
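
The difference between the plain prefix match and a stricter match can be shown outside the server. The following is only a minimal standalone sketch, not the proposed patch and not InnoDB code: the helper names are hypothetical, and the "#p#" separator is assumed from the partition names quoted in this report (it may differ in case on some platforms).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: partition names look like "db/table#p#part[#sp#subpart]",
// as in the names quoted in this report.
static const char kPartSeparator[] = "#p#";

// Hypothetical helper mirroring the current check: a plain prefix comparison.
bool prefix_only_match(const char *name, const char *cached_name) {
  return std::strncmp(name, cached_name, std::strlen(name)) == 0;
}

// Hypothetical stricter check: the prefix must be followed immediately by
// the partition separator, so "t1" no longer matches "t1xx#p#p0".
bool prefix_and_separator_match(const char *name, const char *cached_name) {
  const std::size_t name_len = std::strlen(name);
  if (std::strncmp(name, cached_name, name_len) != 0) {
    return false;
  }
  // Safe: the comparison above succeeded, so cached_name has at least
  // name_len characters before its terminating NUL.
  return std::strncmp(cached_name + name_len, kPartSeparator,
                      sizeof(kPartSeparator) - 1) == 0;
}

// Walks through the two names from this report.
void check_report_examples() {
  const char *name = "test/sbtest187";
  const char *own_part = "test/sbtest187#p#p1#sp#sp2";
  const char *other_part = "test/sbtest187_bak#p#p1#sp#sp2";

  assert(prefix_only_match(name, own_part));
  assert(prefix_only_match(name, other_part));            // unrelated table matches
  assert(prefix_and_separator_match(name, own_part));
  assert(!prefix_and_separator_match(name, other_part));  // rejected
}
```

With the names from the test case below, "test/t1" matches "test/t1xx#p#p0" under the prefix-only check but not under the stricter one.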

How to repeat:
Apply the following patch to help reproduce the problem:

diff --git a/sql/sql_table.cc b/sql/sql_table.cc
index 6adba7d15f0..83ce9f47f87 100644
--- a/sql/sql_table.cc
+++ b/sql/sql_table.cc
@@ -13630,6 +13630,11 @@ static bool mysql_inplace_alter_table(
     altered_table_def->set_name(alter_ctx->alias);
     altered_table_def->set_hidden(dd::Abstract_table::HT_VISIBLE);
 
+    DBUG_EXECUTE_IF("partition_ddl_rollback", {
+      my_error(ER_SECONDARY_ENGINE_DDL, MYF(0));
+      goto cleanup2;
+    });
+
     /*
       Copy pre-existing triggers to the new table definition.
       Since trigger names have to be unique per schema, we cannot

MTR:

--echo #
--echo # Prepare
--echo #

CREATE TABLE t1 (
    id INT,
    data INT,
    PRIMARY KEY(id),
    KEY idx1(data)
)
PARTITION BY LIST(id) (
    PARTITION p0 VALUES IN (5, 10, 15),
    PARTITION p1 VALUES IN (6, 12, 18)
);

CREATE TABLE t1xx LIKE t1;

--echo #
--echo # Run 
--echo #

connect(con1, localhost, root,,);
--send select sleep(3), data from t1xx

connection default;
set session debug = "+d,partition_ddl_rollback";
--error ER_SECONDARY_ENGINE_DDL
ALTER TABLE t1 ADD PARTITION (PARTITION p2 VALUES IN (7, 14, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43));

connection con1;
--reap

--echo #
--echo # Cleanup 
--echo #
drop table t1, t1xx;
disconnect con1;

[27 Nov 9:04] MySQL Verification Team
Hello Xiaodong,

Thank you for the report and feedback.

regards,
Umesh