Bug #105077 As of 8.0.26, mysqldump substitutes utf8mb3 for utf8
Submitted: 29 Sep 2021 15:31 Modified: 30 Sep 2021 17:50
Reporter: Glen Peterson
Status: Not a Bug
Category: MySQL Server: Charsets Severity: S3 (Non-critical)
Version: 8.0.26 OS: Any
Assigned to: CPU Architecture: Any
Tags: utf8, utf8mb3, utf8mb4

[29 Sep 2021 15:31] Glen Peterson
Description:
I just upgraded my MySQL Docker container to 8.0.26.  I check my schema into git after every change, and noticed a difference in the output when I run:

```
docker run \
    -e LANG=C.UTF-8 \
    -it \
    --link mjl-dev-db:mysql \
    --rm \
    -v /home/mysql/mjl/dev/conf:/etc/mysql/conf.d:ro \
    -v $MJL_HOME/:/etc/mysql/backups/ \
    mysql:8 sh -c 'exec mysqldump -h"$MYSQL_PORT_3306_TCP_ADDR" -P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD" --single-transaction --tables --no-data mj_online > /etc/mysql/backups/tablesBackup.sql'
```

I used to see CHARSET=utf8 on the tables and nothing on the varchar columns.  Now I see CHARSET=utf8mb3 on the tables, which is deprecated.  Why isn't it utf8mb4?  MariaDB uses utf8mb4 as the synonym for utf8.

How to repeat:
Use mysqldump to dump the tables of any database in MySQL 8.0.25.  Then do the same in 8.0.26.  Notice that the tables that were utf8 are now utf8mb3, which is deprecated.
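
One way to compare the two dumps is to scan each for the old charset names. The Python sketch below is only an illustration: the function name `find_utf8mb3_lines` is hypothetical, and it assumes the dump is plain-text SQL that spells charsets as `CHARSET=...` or `CHARACTER SET ...`, as in the output described in this report.

```python
import re

# Matches "CHARSET=utf8", "CHARSET=utf8mb3", "CHARACTER SET utf8", and
# "CHARACTER SET utf8mb3", but not utf8mb4 (the trailing \b rejects it).
OLD_CHARSET = re.compile(r"(?:CHARSET=|CHARACTER SET )(utf8(?:mb3)?)\b")


def find_utf8mb3_lines(dump_text):
    """Return (line_number, charset_name) pairs for each old-charset mention."""
    hits = []
    for number, line in enumerate(dump_text.splitlines(), start=1):
        for match in OLD_CHARSET.finditer(line):
            hits.append((number, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical dump fragment, not taken from a real schema.
    sample = (
        "CREATE TABLE `a` (`x` varchar(255) DEFAULT NULL) "
        "ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;\n"
        "CREATE TABLE `b` (`y` varchar(255) DEFAULT NULL) "
        "ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;\n"
    )
    print(find_utf8mb3_lines(sample))
```

Running it against the 8.0.25 and 8.0.26 dumps in turn would show whether the spelling changed from `utf8` to `utf8mb3` between versions.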

Suggested fix:
Make mysqldump show utf8mb4 instead of utf8.  If there was only ever utf8mb3 data in those tables, utf8mb4 should preserve all that and allow new characters as well.
[29 Sep 2021 15:49] Glen Peterson
I don't know if the root cause is with mysqldump or with the table or the storage engine.

When I tried to switch back to 8.0.25 I get:

```
2021-09-29 15:46:08+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.0.25-1debian10 started.
2021-09-29T15:46:08.902602Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.25) starting as process 1
2021-09-29T15:46:08.908907Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2021-09-29T15:46:09.065990Z 1 [ERROR] [MY-013171] [InnoDB] Cannot boot server version 80025 on data directory built by version 80026. Downgrade is not supported
mysqld: Can't open file: 'mysql.ibd' (errno: 0 - )
2021-09-29T15:46:14.067765Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine
2021-09-29T15:46:14.068055Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2021-09-29T15:46:14.068318Z 0 [ERROR] [MY-010119] [Server] Aborting
2021-09-29T15:46:14.068843Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.25)  MySQL Community Server - GPL.
```
I'll continue trying to verify that 8.0.25 doesn't have this problem.
[30 Sep 2021 8:57] MySQL Verification Team
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html

10.9.3 The utf8 Character Set (Alias for utf8mb3)
[30 Sep 2021 12:35] MySQL Verification Team
Hi Mr. Peterson,

Thank you for your bug report.

However, it is not a bug.

We have already posted that UTF8 is synonymous with UTF8mb3.

That is the first thing. Second, UTF8mb3 is not deprecated; it is simply no longer the default character set.

Last, but not least, we do not support downgrading our 8.0 releases to the previous release.

If you wish to use UTF8mb4, you have to ALTER your tables accordingly.

Not a bug.
[30 Sep 2021 15:34] Steinar Gunderson
utf8mb3 is indeed deprecated, and nobody should use it anymore.

utf8 is (currently) an alias for utf8mb3, and by extension, nobody should use it anymore.

The recommendation is to use utf8mb4, which is also the default if you do not specify a collation/character set.
[30 Sep 2021 17:50] Glen Peterson
If utf8mb3 is not deprecated, you might want to update the documentation to reflect that here:
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html

After changing the charset on the table, all the columns suddenly show the old/bad charset.

Original:

`cover_png` varchar(255) DEFAULT NULL,

After changing table charset to utf8mb4:

`cover_png` varchar(255) CHARACTER SET utf8 DEFAULT NULL,

After changing column to match the table:

`cover_png` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,

I understand that this is not a bug because I have the option to change every table to utf8mb4, then find every varchar field in every table and also change it to utf8mb4.  This is necessary because the old utf8 now defaults to utf8mb3 instead of utf8mb4.
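
For what it's worth, MySQL's `ALTER TABLE ... CONVERT TO CHARACTER SET` changes the table default and the existing character columns in one statement, which may avoid the column-by-column step. The Python sketch below only builds the statements as strings. The table names and the helper `convert_statements` are hypothetical, and the collation is the one shown in the column definition above. A conversion can run into index key length limits and may change some TEXT column types, so it would need testing on a copy first.

```python
def convert_statements(tables, charset="utf8mb4", collation="utf8mb4_0900_ai_ci"):
    """Build one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for table in tables:
        # Double any backtick inside the name so the quoted identifier stays valid.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for statement in convert_statements(["books", "authors"]):
        print(statement)
```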

mb3 data is valid mb4 data, so it would be just as easy for utf8 to default to mb4 the way another popular open-source database does.  If it did, I wouldn't have to do anything and I'd get increased character support.
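
The compatibility claim can be illustrated with byte lengths: utf8mb3 stores UTF-8 sequences of at most three bytes per character, and utf8mb4 stores the same sequences plus the four-byte ones. A small Python sketch (the helper name `fits_utf8mb3` is made up for this example):

```python
def fits_utf8mb3(text):
    """True if every character of text encodes to at most 3 bytes of UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    bmp_text = "naïve café €"      # every character needs 1 to 3 bytes
    emoji_text = "thumbs up 👍"     # U+1F44D needs 4 bytes
    print(fits_utf8mb3(bmp_text))    # True
    print(fits_utf8mb3(emoji_text))  # False
    # Text that fits in three bytes per character has the same bytes
    # either way, since both character sets use the UTF-8 encoding.
    print(bmp_text.encode("utf-8"))
```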
[7 Oct 2021 13:41] Tor Didriksen
https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html

says

 Note

The utf8mb3 character set is deprecated and you should expect it to be removed in a future MySQL release. Please use utf8mb4 instead.