Bug #86233 NVARCHAR should default to utf8mb4
Submitted: 9 May 2017 3:22
Modified: 27 Apr 2020 16:24
Reporter: Rick James
Status: Verified
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: 8.0
OS: Any
Assigned to:
CPU Architecture: Any

[9 May 2017 3:22] Rick James
See: https://dev.mysql.com/doc/refman/8.0/en/string-type-overview.html

That page implies (without clearly stating) that "MySQL uses utf8" for NVARCHAR, but it does not specifically say that "CHARACTER SET utf8" is defaulted (or forced) on the column definition.  Nor does it say "utf8mb4", which seems like the 'right thing to do' in 8.0.

(This is probably both a Code and a Documentation issue.  And it may impact pre-8.0 versions; see "Suggested fix".)

How to repeat:
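No steps were supplied with the report. As a minimal sketch of how the behavior might be checked, assuming a MySQL 8.0 server with a default database selected (the table name t_nvarchar and both column names are hypothetical, chosen only for illustration):

```sql
-- Hypothetical table, used only to compare the two column definitions.
DROP TABLE IF EXISTS t_nvarchar;
CREATE TABLE t_nvarchar (
    implicit_cs NVARCHAR(10),                      -- no explicit CHARACTER SET
    explicit_cs VARCHAR(10) CHARACTER SET utf8mb4  -- character set spelled out
);

-- Show the definition the server actually stored.
SHOW CREATE TABLE t_nvarchar;

-- Same information, one row per column.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't_nvarchar';
```

If the report is accurate, implicit_cs should come back with a utf8 character set rather than utf8mb4, while explicit_cs keeps utf8mb4. Declaring the column as VARCHAR with an explicit CHARACTER SET, as in the second column, sidesteps the NVARCHAR default entirely.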

Suggested fix:
Change the behavior and the documentation (see link above).

Consider whether to retrofit documentation into prior versions.  Something like:

"An NVARCHAR column that does not have an explicit CHARACTER SET or COLLATION is assigned CHARACTER SET utf8mb4"  (or utf8 for pre-8.0 versions).

[9 May 2017 8:57] MySQL Verification Team
Hello Rick,

Thank you for the report and feedback.

[4 Sep 2017 10:15] Manyi Lu
Beware: changing the definition of NCHAR has an impact on restoring from a mysqldump image.

[27 Apr 2020 16:24] Rick James
It is getting quite late in the 8.0 cycle to make a semi-incompatible change.  Perhaps this should be changed to a "Documentation" error?