Bug #110291 SHOW CREATE TABLE breaks DEFAULT for 4 byte Unicode characters
Submitted: 7 Mar 2023 10:57 Modified: 7 Mar 2023 13:17
Reporter: Dirkjan Bussink (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:8.0, 5.7 OS:Any
Assigned to: CPU Architecture:Any

[7 Mar 2023 10:57] Dirkjan Bussink
Description:
When creating a table with a DEFAULT for a utf8mb4 column which uses a 4 byte encoded unicode character, the output is corrupted and the character replaced with a ? in SHOW CREATE TABLE.

How to repeat:
CREATE TABLE `encodings` (
  `id` bigint NOT NULL AUTO_INCREMENT ,
  `a` varchar(100) CHARACTER SET ascii DEFAULT 'foo',
  `l` varchar(100) CHARACTER SET latin1 DEFAULT 'føö',
  `u` varchar(100) CHARACTER SET utf8mb4 DEFAULT 'føö√🙈',
  PRIMARY KEY (`id`)
);

SHOW CREATE TABLE encodings\G
*************************** 1. row ***************************
       Table: encodings
Create Table: CREATE TABLE `encodings` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `a` varchar(100) CHARACTER SET ascii COLLATE ascii_general_ci DEFAULT 'foo',
  `l` varchar(100) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'føö',
  `u` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT 'føö√?',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Suggested fix:
It looks like MySQL internally still uses the utf8mb3 encoding to print the DEFAULT value. Unicode characters which encode to 3 bytes work correctly but 4 byte characters like emojis break.
[7 Mar 2023 13:17] MySQL Verification Team
Hello Dirkjan,

Thank you for the report and feedback.

regards,
Umesh