Bug #111328 utf8mb4_0900_ai_ci be added to Column definition when convert table to utf8mb4
Submitted: 8 Jun 2023 8:08 Modified: 9 Jun 2023 3:18
Reporter: Chuan Jin Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DDL Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: collate, DDL, utf8mb4_unicode_ci

[8 Jun 2023 8:08] Chuan Jin
Description:
When user converts the Character set for a table from utf8mb3 to utf8mb4, and indicates the collate as "utf8mb4_general_ci" at the same time, the char columns' collate in that table will be set to  utf8mb4_0900_ai_ci (but not the utf8mb4_general_ci issued by user). This is not as expected.

How to repeat:
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.33    |
+-----------+
1 row in set (0.00 sec)
mysql> show create table sbtest40;
+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                                                                                                                                                                                                                                                   |
+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest40 | CREATE TABLE `sbtest40` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_40` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=25001 DEFAULT CHARSET=utf8mb3 |
+----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> alter table sbtest40 convert to CHARacter set utf8mb4, collate utf8mb4_general_ci;
Query OK, 0 rows affected (0.97 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table sbtest40;
+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                                                                                                                                                                                                                                                                                                                                                                                |
+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest40 | CREATE TABLE `sbtest40` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '',
  `pad` char(60) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_40` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=25001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
mysql> alter table sbtest40 add d char(120) default 'test';
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table sbtest40;
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sbtest40 | CREATE TABLE `sbtest40` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '',
  `pad` char(60) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '',
  `d` char(120) COLLATE utf8mb4_general_ci DEFAULT 'test',
  PRIMARY KEY (`id`),
  KEY `k_40` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=25001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |

Suggested fix:
Collate of columns should be added as user's command. In this case ,it should be same with the collate of table definition, utf8mb4_general_ci. Actually, when we add a new char column for the test table "sttest40", it's collate has definitely been set to the "utf8mb4_general_ci", this is as expected.
[8 Jun 2023 14:30] MySQL Verification Team
HI Mr. 川 金,

Thank you for your bug report.

We have repeated the behaviour that you are reported.

This bug is verified.
[8 Jun 2023 14:32] MySQL Verification Team
HI,

We must notify that collation is changed for the table, not for each column separately.

Hence, this could result only in the change of our documentation. 

This is just for your info.
[9 Jun 2023 3:18] Chuan Jin
Hi,
  Thank you for your help. I am a litter confused that it is only incorrect document specification. Could you please help to tell that if user converted collate of a table(not each column) in Mysql8.0 to "utf8mb4_general_ci", which collate of the data in table would have been converted to? 
  From my point of view, in MySQL8.0, before converting DDL on the table `sbtest40`, the collate of the columns `c` and `pad` is "utf8_general_ci", which was inherited from the table definition when be created. After converting, an new added column also obeys the rule as shown above. 
  Why only converting process  breaks the rule?  User maybe get unexpected ordered results after converting, especially when user adds a new column `d` without specifying collate separately and copies data from column `c`. 
  Besides, in MySQL5.7, if user doesn't specify the char set or collation of the char columns, they should be inherited from the table no matter creating tables, adding columns or converting char set of tables.
[9 Jun 2023 12:08] MySQL Verification Team
HI,

This bug is verified.

It yet remains to be seen whether it will require any changes in the code or not. It will take time for the team in charge to determine that.

Regarding your questions, they will be answered when the bug is fixed. Even if the fix will be done in the code, it will be documented in both our Ref. Manual and a ChangeLog. Then, you can make your own conclusions.