Bug #95205 Support of U+32FF, a ligature of Japanese new regnal year "Reiwa"
Submitted: 30 Apr 2019 15:01
Modified: 1 May 2019 6:42
Reporter: Ryusuke Kajiyama
Status: Verified
Category: MySQL Server: Charsets
Severity: S4 (Feature request)
Version: 8.0
OS: Any
Assigned to:
CPU Architecture: Any

[30 Apr 2019 15:01] Ryusuke Kajiyama
Description:
MySQL collations do not treat U+32FF (a ligature of the new Japanese regnal year "Reiwa") as equal to its Decomposition Mapping "U+4EE4 U+548C", as would be expected.
U+32FF is defined as of Unicode 12.1:
 http://unicode.org/versions/Unicode12.1.0/

The new Japanese regnal name was officially effective as of May 1, 2019.

Code points of ligatures of previous eras U+337E (㍾), U+337D (㍽), U+337C (㍼), and U+337B (㍻) are handled properly.

How to repeat:
MySQL > SELECT _utf8mb4 0xe5b9b3e68890 = _utf8mb4 0xE38DBB AS "平成 = U+337B";
+---------------+
| 平成 = U+337B |
+---------------+
|             1 |
+---------------+
1 row in set (0.0005 sec)

MySQL > SELECT _utf8mb4 0xE4BBA4E5928C = _utf8mb4 0xE38BBF AS "令和 = ligature U+32FF";
+------------------------+
| 令和 = ligature U+32FF |
+------------------------+
|                      0 |
+------------------------+
1 row in set (0.0090 sec)

Please refer to the fifth code example in the following blog entry by Masahiro Tomita:
 https://tmtms.hatenablog.com/entry/201904/mysql-reiwa
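
To see where the two comparisons above diverge, the collation weights of each side can be inspected directly. This is only a minimal sketch: it assumes the default utf8mb4_0900_ai_ci collation (the report does not state which collation the session used), and the column aliases are illustrative. If a ligature is expanded by the collation, its weight string should match that of the two-character spelling.

```sql
-- Assumption: utf8mb4_0900_ai_ci, the default collation for utf8mb4 in MySQL 8.0.
-- Heisei: U+337B versus the two-character spelling (the comparison that returns 1).
SELECT HEX(WEIGHT_STRING(_utf8mb4 0xE38DBB COLLATE utf8mb4_0900_ai_ci)) AS heisei_ligature,
       HEX(WEIGHT_STRING(_utf8mb4 0xE5B9B3E68890 COLLATE utf8mb4_0900_ai_ci)) AS heisei_spelled_out;

-- Reiwa: U+32FF versus the two-character spelling (the comparison that returns 0).
SELECT HEX(WEIGHT_STRING(_utf8mb4 0xE38BBF COLLATE utf8mb4_0900_ai_ci)) AS reiwa_ligature,
       HEX(WEIGHT_STRING(_utf8mb4 0xE4BBA4E5928C COLLATE utf8mb4_0900_ai_ci)) AS reiwa_spelled_out;
```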

Suggested fix:
Support Unicode 12.1 in collations or update existing utf8mb4_is_0900_* collations.
The second query above should then also return 1.
[1 May 2019 6:42] Erlend Dahl
Thank you for the feature request.
[30 Mar 2020 22:16] Sveta Smirnova
May 1, 2019 is in the past. Maybe it is time to update the ICU code shipped with the server?
[31 Mar 2020 5:28] Erlend Dahl
ICU is only used for regular expressions in MySQL 8.0.

In order to support the Reiwa ligature, we will have to add new collations based on Unicode 12.1.0 or later. We will need new collations in any case, since old collations are never modified. I don't have a timeline for this work.

(The ICU version is scheduled for upgrade in 8.0.21.)