Bug #106884 Data are not repartitioned when the partition key's charset is modified
Submitted: 1 Apr 2022 8:53 Modified: 4 Apr 2022 9:55
Reporter: Cheng Zhou Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:8.0, 8.0.28 OS:Any
Assigned to: CPU Architecture:Any

[1 Apr 2022 8:53] Cheng Zhou
Description:
When the partition key's charset is modified, records of the partition table are not repartitioned.

How to repeat:
1、create table tab1(c1 varchar(10)) partition by key(c1) partitions 3;

2、mysql> show create table tab1\G
*************************** 1. row ***************************
       Table: tab1
Create Table: CREATE TABLE `tab1` (
  `c1` varchar(10) COLLATE utf8mb4_bin DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
/*!50100 PARTITION BY KEY (c1)
PARTITIONS 3 */
1 row in set (0.02 sec)

3、insert into tab1 values('a');

4、mysql> select table_name, partition_name, partition_method, partition_expression, table_rows, avg_row_length, data_length from information_schema.partitions where table_name = 'tab1';
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | PARTITION_METHOD | PARTITION_EXPRESSION | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| tab1       | p0             | KEY              | `c1`                 |          0 |              0 |       16384 |
| tab1       | p1             | KEY              | `c1`                 |          0 |              0 |       16384 |
| tab1       | p2             | KEY              | `c1`                 |          1 |          16384 |       16384 |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+

5、alter table tab1 modify c1 varchar(10) charset utf8mb4;

6、mysql> show create table tab1\G
*************************** 1. row ***************************
       Table: tab1
Create Table: CREATE TABLE `tab1` (
  `c1` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
/*!50100 PARTITION BY KEY (c1)
PARTITIONS 3 */
1 row in set (0.01 sec)

7、insert into tab1 values('a');
8、analyze table tab1;

9、mysql> select table_name, partition_name, partition_method, partition_expression, table_rows, avg_row_length, data_length from information_schema.partitions where table_name = 'tab1';
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | PARTITION_METHOD | PARTITION_EXPRESSION | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| tab1       | p0             | KEY              | `c1`                 |          0 |              0 |       16384 |
| tab1       | p1             | KEY              | `c1`                 |          1 |          16384 |       16384 |
| tab1       | p2             | KEY              | `c1`                 |          1 |          16384 |       16384 |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+

10、
mysql> select * from tab1 partition(p2);
+------+
| c1   |
+------+
| a    |
+------+
1 row in set (0.00 sec)

mysql> select * from tab1 partition(p1);
+------+
| c1   |
+------+
| a    |
+------+
1 row in set (0.00 sec)

Two same records are in different partitions.
[1 Apr 2022 11:43] MySQL Verification Team
Hello Cheng Zhou,

Thank you for the report and test case.
I'm not seeing any issues with 8.0.28 build(same is with 5.7.37, where DEFAULT CHARSET=latin1). Could you please check with GA build and inform us if you are still seeing? Thank you!

- 8.0.28
 bin/mysql -uroot -S /tmp/mysql.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.28-commercial MySQL Enterprise Server - Commercial

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database test;
Query OK, 1 row affected (0.01 sec)

mysql> use test
Database changed
mysql> set global information_schema_stats_expiry =0;
Query OK, 0 rows affected (0.00 sec)

mysql> create database if not exists test;
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> use test;
Database changed
mysql> create table tab1(c1 varchar(10)) partition by key(c1) partitions 3;
Query OK, 0 rows affected (0.05 sec)

mysql> show create table tab1\G
*************************** 1. row ***************************
       Table: tab1
Create Table: CREATE TABLE `tab1` (
  `c1` varchar(10) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50100 PARTITION BY KEY (c1)
PARTITIONS 3 */
1 row in set (0.00 sec)

mysql> insert into tab1 values('a');
Query OK, 1 row affected (0.00 sec)

mysql> select table_name, partition_name, partition_method, partition_expression, table_rows, avg_row_length, data_length from information_schema.partitions where table_name = 'tab1';
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | PARTITION_METHOD | PARTITION_EXPRESSION | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| tab1       | p0             | KEY              | `c1`                 |          0 |              0 |       16384 |
| tab1       | p1             | KEY              | `c1`                 |          1 |          16384 |       16384 |
| tab1       | p2             | KEY              | `c1`                 |          0 |              0 |       16384 |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
3 rows in set (0.00 sec)

mysql> alter table tab1 modify c1 varchar(10) charset utf8mb4;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into tab1 values('a');
Query OK, 1 row affected (0.00 sec)

mysql> analyze table tab1;
+-----------+---------+----------+----------+
| Table     | Op      | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| test.tab1 | analyze | status   | OK       |
+-----------+---------+----------+----------+
1 row in set (0.02 sec)

mysql> select table_name, partition_name, partition_method, partition_expression, table_rows, avg_row_length, data_length from information_schema.partitions where table_name = 'tab1';
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| TABLE_NAME | PARTITION_NAME | PARTITION_METHOD | PARTITION_EXPRESSION | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
| tab1       | p0             | KEY              | `c1`                 |          0 |              0 |       16384 |
| tab1       | p1             | KEY              | `c1`                 |          2 |           8192 |       16384 |
| tab1       | p2             | KEY              | `c1`                 |          0 |              0 |       16384 |
+------------+----------------+------------------+----------------------+------------+----------------+-------------+
3 rows in set (0.00 sec)

mysql> select * from tab1 partition(p2);
Empty set (0.00 sec)

mysql> select * from tab1 partition(p1);
+------+
| c1   |
+------+
| a    |
| a    |
+------+
2 rows in set (0.00 sec)

regards,
Umesh
[1 Apr 2022 14:51] Cheng Zhou
In my case, the original table tab1 is:
*************************** 1. row ***************************
       Table: tab1
Create Table: CREATE TABLE `tab1` (
  `c1` varchar(10) COLLATE utf8mb4_bin DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
/*!50100 PARTITION BY KEY (c1)
PARTITIONS 3 */

The COLLATE of column `c1` is utf8mb4_bin;
After executing "alter table tab1 modify c1 varchar(10) charset utf8mb4;", The COLLATE of column `c1` is also modified to utf8mb4_0900_ai_ci.
In your case, after created, The COLLATE of column `c1` is already utf8mb4_0900_ai_ci.
I think the different COLLATES before and after executing the alter statement causes this problem.
[2 Apr 2022 1:22] Cheng Zhou
Reported by GoldenDB Team.
[4 Apr 2022 9:55] MySQL Verification Team
Hello Cheng Zhou,

Thank you for the report and feedback.

regards,
Umesh
[9 Apr 2022 11:16] Xiaodi Z
Maybe should fix the issue here:

sql/sql_table.cc

...
  case IS_EQUAL_PACK_LENGTH:
    /*
      New column type differs from the old one, but has compatible packed
      data representation. Depending on storage engine, such a change can
      be carried out by simply updating data dictionary without changing
      actual data (for example, VARCHAR(300) is changed to VARCHAR(400)).

      If the collation has changed, and there is an index on the column,
      we must mark this as a change in stored column type, which is
      usually rejected as inplace operation by the SE.
    */
    if (is_collation_change_for_indexed_field(*field, *new_field,
                                              ha_alter_info)) {
      ha_alter_info->handler_flags |=
          Alter_inplace_info::ALTER_STORED_COLUMN_TYPE;
    } else {
      ha_alter_info->handler_flags |=
          Alter_inplace_info::ALTER_COLUMN_EQUAL_PACK_LENGTH;
    }
    break;
...
[24 May 2022 20:59] Jon Stephens
Documented fix as follows in the MySQL 8.0.31 changelog:

    A partitioned table was not repartitioned as expected when ALTER
    TABLE acted to change to change the character set of any of its
    columns. The root cause of this was that, when determining
    whether a change in character set could be done as an in-place
    ALTER, we did not consider whether any columns affected by the
    character set change were part of the table's partitioning
    expression. Since such a change might lead to a repartitioning
    of the table, it could not be done in place. Now when a change
    in character set is planned, we also check whether this affects
    any columns that are part of a partitioning expression; if not,
    then we can perform the operation in place.

Closed.