Bug #114469 utf8mb4_general_ci can't convert to utf8mb4_0900_ai_ci
Submitted: 25 Mar 2:55 Modified: 26 Mar 11:15
Reporter: yuexiao liu Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[25 Mar 2:55] yuexiao liu
Description:
  In my case,I have a table using utf8mb4_general_ci which restored from mysql 5.7.43.When I want to switch from utf8mb4_general_ci to utf8mb4_0900_ai_ci beacuse I want to upgrade this server to 8.0,after upgrade ,I used `ALTER TABLE cfca_me.cert CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci; `,and then I got an error with  'ERROR 1062 (23000) at line 275: Duplicate entry 'CN=051@丁一@0**************l@1,OU=Individual-1,OU=Qiyuesuo,' for key 'cert.IDX_CERT_1''.
  I used `mysqldump --set-gtid-purged=off -p -i -c -C -E -e -F --max-allowed-packet=32M -R --single-transaction --triggers -uroot -p -h****** cfca_me >cfca_me.sql` to backup my database ,and used `mysql --default-character-set=utf8mb4 --binary_mode -uroot -p cfca_me <cfca_me.sql` to restore my database.
  How can I switch from utf8mb4_general_ci to utf8mb4_0900_ai_ci? The utf8mb4_0900_ai_ci is not a superset of utf8mb4_general_ci ?

How to repeat:
Please use my sql file restore to your database.
[26 Mar 11:15] MySQL Verification Team
Hi Mr. liu,

Thank you for your bug report.

However, we are unable to repeat it.

Do note, first, that you can do that conversion only after you have upgraded to 8.0.

Second, collations differ a lot between each other. Hence, one collation can have all values unique, while the other will hit some that are duplicate.

In that case, you have to find those that are duplicate.  You can do it by a query that would use introducers to each side in the equality expression. This is all explained in our Reference Manual.

This is, actually, expected behaviour.