Bug #85989 | Danish collations latin1 and utf8 treats 'å' and 'aa' differently | ||
---|---|---|---|
Submitted: | 18 Apr 2017 18:57 | Modified: | 20 Apr 2017 11:37 |
Reporter: | Martin Larsson | Email Updates: | |
Status: | Won't fix | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S2 (Serious) |
Version: | 5.5/5.6/5.7/8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[18 Apr 2017 18:57]
Martin Larsson
[18 Apr 2017 22:33]
MySQL Verification Team
Thank you for the bug report. Looks like duplicate\related to bug: https://bugs.mysql.com/bug.php?id=37571 mysql 5.7 > SHOW VARIABLES LIKE "%VERSION%"; +-------------------------+---------------------------------------+ | Variable_name | Value | +-------------------------+---------------------------------------+ | innodb_version | 5.7.19 | | protocol_version | 10 | | slave_type_conversions | | | tls_version | TLSv1,TLSv1.1 | | version | 5.7.19 | | version_comment | Source distribution pull: 2017-apr-05 | | version_compile_machine | x86_64 | | version_compile_os | Linux | +-------------------------+---------------------------------------+ 8 rows in set (0,02 sec) mysql 5.7 > create database db CHARACTER SET latin1 COLLATE latin1_danish_ci; Query OK, 1 row affected (0,00 sec) mysql 5.7 > use db Database changed mysql 5.7 > create table n( -> t varchar(5), -> primary key(t) -> ); Query OK, 0 rows affected (0,31 sec) mysql 5.7 > insert into n values ('å'); Query OK, 1 row affected (0,05 sec) mysql 5.7 > insert into n values ('aa'); Query OK, 1 row affected (0,07 sec) mysql 5.7 > select * from n; +----+ | t | +----+ | aa | | å | +----+ 2 rows in set (0,00 sec) mysql 5.7 > drop database db; Query OK, 1 row affected (0,39 sec) mysql 5.7 > create database db CHARACTER SET utf8 COLLATE utf8_danish_ci; Query OK, 1 row affected (0,01 sec) mysql 5.7 > use db Database changed mysql 5.7 > create table n( -> t varchar(5), -> primary key(t) -> ); Query OK, 0 rows affected (0,37 sec) mysql 5.7 > insert into n values ('å'); Query OK, 1 row affected (0,06 sec) mysql 5.7 > insert into n values ('aa'); ERROR 1062 (23000): Duplicate entry 'aa' for key 'PRIMARY' mysql 5.6 > SHOW VARIABLES LIKE "%VERSION%"; +-------------------------+---------------------------------------+ | Variable_name | Value | +-------------------------+---------------------------------------+ | innodb_version | 5.6.37 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.6.37 | | version_comment | Source distribution pull: 2017-apr-05 | | version_compile_machine | x86_64 | | version_compile_os | Linux | +-------------------------+---------------------------------------+ 7 rows in set (0,01 sec) mysql 5.6 > create database db CHARACTER SET latin1 COLLATE latin1_danish_ci; Query OK, 1 row affected (0,02 sec) mysql 5.6 > use db Database changed mysql 5.6 > create table n( t varchar(5), primary key(t) ); Query OK, 0 rows affected (0,35 sec) mysql 5.6 > insert into n values ('å'); Query OK, 1 row affected (0,05 sec) mysql 5.5 > SHOW VARIABLES LIKE "%VERSION%"; +-------------------------+---------------------------------------+ | Variable_name | Value | +-------------------------+---------------------------------------+ | innodb_version | 5.5.56 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.5.56 | | version_comment | Source distribution pull: 2017-apr-05 | | version_compile_machine | x86_64 | | version_compile_os | Linux | +-------------------------+---------------------------------------+ 7 rows in set (0,00 sec) mysql 5.5 > create database db CHARACTER SET latin1 COLLATE latin1_danish_ci; Query OK, 1 row affected (0,00 sec) mysql 5.5 > use db Database changed mysql 5.5 > create table n( t varchar(5), primary key(t) ); Query OK, 0 rows affected (0,09 sec) mysql 5.5 > insert into n values ('å'); Query OK, 1 row affected (0,05 sec) mysql 5.5 > insert into n values ('aa'); Query OK, 1 row affected (0,05 sec) mysql 5.5 > drop database db; Query OK, 1 row affected (0,05 sec) mysql 5.5 > create database db CHARACTER SET utf8 COLLATE utf8_danish_ci; Query OK, 1 row affected (0,00 sec) mysql 5.5 > use db Database changed mysql 5.5 > create table n( t varchar(5), primary key(t) ); Query OK, 0 rows affected (0,07 sec) mysql 5.5 > insert into n values ('å'); Query OK, 1 row affected (0,06 sec) mysql 5.5 > insert into n values ('aa'); ERROR 1062 (23000): Duplicate entry 'aa' for key 'PRIMARY' mysql 5.5 > mysql 5.6 > insert into n values ('aa'); Query OK, 1 row affected (0,04 sec) mysql 5.6 > select * from n; +----+ | t | +----+ | aa | | å | +----+ 2 rows in set (0,00 sec) mysql 5.6 > drop database db; Query OK, 1 row affected (0,21 sec) mysql 5.6 > create database db CHARACTER SET utf8 COLLATE utf8_danish_ci; Query OK, 1 row affected (0,00 sec) mysql 5.6 > use db Database changed mysql 5.6 > create table n( t varchar(5), primary key(t) ); Query OK, 0 rows affected (0,31 sec) mysql 5.6 > insert into n values ('å'); Query OK, 1 row affected (0,04 sec) mysql 5.6 > insert into n values ('aa'); ERROR 1062 (23000): Duplicate entry 'aa' for key 'PRIMARY'
[19 Apr 2017 5:16]
Martin Larsson
While this is similar to https://bugs.mysql.com/bug.php?id=37571, 37571 concerns the inconvenience that latin1_danish_ci sorts 'aa' wrong. This report is about inserts failing with a 'duplicate key' exception when using utf8_danish_ci. Having the sort order wrong is terrible UX, but the solution is to use utf8 throughout. This would be the most sensible option for new development anyway. However, having that fail with duplicate key errors for unique indexes and primary keys is rather worse.
[19 Apr 2017 6:44]
Martin Larsson
Done criteria for both this bug and https://bugs.mysql.com/bug.php?id=37571 would be that, given my table definition above, the following inserts: insert into n values ('å'); insert into n values ('aa'); insert into n values ('ø'); insert into n values ('aab'); should succeed. And then this select: select * from n order by t; should return +-----+ | t | +-----+ | ø | | aa | | å | | aab | +-----+ 4 rows in set (0.00 sec) The consequence is that the city of Aachen will be sorted wrong. This is unavoidable.
[19 Apr 2017 13:35]
MySQL Verification Team
Thank you for the feedback. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 Server version: 8.0.2-dmr Source distribution - pull: 2017-apr-05 Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 8.0 > create database db CHARACTER SET latin1 COLLATE latin1_danish_ci; Query OK, 1 row affected (0.08 sec) mysql 8.0 > use db Database changed mysql 8.0 > create table n( -> t varchar(5), -> primary key(t) -> ); Query OK, 0 rows affected (0.39 sec) mysql 8.0 > insert into n values ('å'); Query OK, 1 row affected (0.06 sec) mysql 8.0 > insert into n values ('aa'); Query OK, 1 row affected (0.02 sec) mysql 8.0 > select * from n; +----+ | t | +----+ | aa | | å | +----+ 2 rows in set (0.02 sec) mysql 8.0 > drop database db; Query OK, 1 row affected (0.42 sec) mysql 8.0 > create database db CHARACTER SET utf8 COLLATE utf8_danish_ci; Query OK, 1 row affected (0.03 sec) mysql 8.0 > create table n( -> t varchar(5), -> primary key(t) -> ); Query OK, 0 rows affected (0.24 sec) mysql 8.0 > insert into n values ('å'); Query OK, 1 row affected (0.02 sec) mysql 8.0 > insert into n values ('aa'); ERROR 1062 (23000): Duplicate entry 'aa' for key 'PRIMARY' mysql 8.0 >
[20 Apr 2017 9:11]
Steinar Gunderson
Posted by developer: This is, quite simply, not a bug. latin1_danish_ci and utf8_danish_ci are both accent- and case-insensitive (comparing first level only), and as such, aa and a-with-ring compare equal (they differ only on second level). In MySQL 8.0, you have utf8mb4_da_0900_as_cs, which is accent- and case-sensitive, making them compare different for the purposes of UNIQUE.
[20 Apr 2017 11:37]
Martin Larsson
Good to hear you have fixed this non-bug i 8.0. When can we expect 8.0 to be generally available?
[20 Apr 2017 12:17]
Steinar Gunderson
There is no published release date for 8.0 GA, but there are milestone releases available.