Bug #97103 Altering column collation doesn't seem to affect unique indexes until restart
Submitted: 4 Oct 2019 8:55 Modified: 17 Dec 2019 19:53
Reporter: lalit Choudhary Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:8.0.17 OS:Any
Assigned to: CPU Architecture:Any

[4 Oct 2019 8:55] lalit Choudhary
Description:
There is some issue with column collation and unique indexes/primary key.

Altering column collation doesn't seem to affect unique indexes until restart

How to repeat:
CREATE TABLE `aaa` ( `letter` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL , PRIMARY KEY (`letter`)) ENGINE = InnoDB CHARSET=utf8mb4 COLLATE utf8mb4_0900_ai_ci;
Query OK, 0 rows affected (0.05 sec)

show create table aaa \G
*************************** 1. row ***************************
       Table: aaa
Create Table: CREATE TABLE `aaa` (
  `letter` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL,
  PRIMARY KEY (`letter`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

insert into aaa (letter) values ('Đ'),('D');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

select * from aaa;
+--------+
| letter |
+--------+
| D      |
| Đ      |
+--------+
2 rows in set (0.00 sec)
	
creating the table "correctly" from start works fine, as seen above, now the same with wrong starting column collation

drop table aaa;
Query OK, 0 rows affected (0.06 sec)

CREATE TABLE `aaa` ( `letter` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL , PRIMARY KEY (`letter`)) ENGINE = InnoDB CHARSET=utf8mb4;
Query OK, 0 rows affected (0.06 sec)

ALTER TABLE `aaa` CHANGE `letter` `letter` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL;
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

show create table aaa \G
*************************** 1. row ***************************
       Table: aaa
Create Table: CREATE TABLE `aaa` (
  `letter` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL,
  PRIMARY KEY (`letter`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

insert into aaa (letter) values ('Đ'),('D');
ERROR 1062 (23000): Duplicate entry 'D' for key 'PRIMARY'
insert into aaa (letter) values ('Đ'),('D');
ERROR 1062 (23000): Duplicate entry 'D' for key 'PRIMARY'

restart mysqld service.  reconnect and insert works

show create table aaa \G
*************************** 1. row ***************************
       Table: aaa
Create Table: CREATE TABLE `aaa` (
  `letter` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL,
  PRIMARY KEY (`letter`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

insert into aaa (letter) values ('Đ'),('D');
Query OK, 2 rows affected (0.02 sec)
Records: 2  Duplicates: 0  Warnings: 0
[4 Oct 2019 9:17] Hrvoje Novosel
Just a further note, doing table rebuild with
alter table aaa engine=innodb;
fixes it without the need to restart mysql
[4 Oct 2019 14:11] MySQL Verification Team
Hello Mr. Choudhary,

Thank you for your bug report.

I have run your test case and got identical results, which means that this is a bug.

Verified as reported.
[8 Oct 2019 2:25] Tsubasa Tanaka
This problem has workaround but very dangerous.
No one who think UNIQUE constraint doesn't work until restart or re-construct table.

I think this severity should be S2(Serious)

Regards,
[8 Oct 2019 12:09] MySQL Verification Team
I do not think that this fact should change a severity.

It is expected behaviour that our server can use new resources from ALTERing the table, only when ALTER has finished it work. That behaviour of DDLs is defined in SQL standards ......
[8 Oct 2019 12:31] Tsubasa Tanaka
Shinisa,

The issue occurs after ALTER TABLE statement finished until restarting mysqld (or OPTIMIZE TABLE and so)

1. Starting ALTER TABLE
2. Executing ALTER TABLE
3. Finished ALTER TABLE
4. After finishied ALTER TABLE
5. Table definition is new collation but unique index still uses old collation
6. Restart mysqld or execute ALTER TABLE Engine = None

You told about between 1. and 3., yes, I agree, I should not expect new collation.
But this problem occurs 4. and 5. and that's S2(serious), this behavior breaks expected UNIQUE constraint but have workaround.

Isn't it?
[8 Oct 2019 13:47] MySQL Verification Team
You wrote it yourself .....

It has a workaround ..... an easy workaround .....
[8 Oct 2019 14:45] Tsubasa Tanaka
Both S2 and S3 have "workaround is available" in their definitions.

I think 'Ignoreing new-collation is "significant functionality is missing; but a workaround is available.", so S2'.

See https://bugs.mysql.com/report.php and click "see definitions" near by "severity"

```
> S1 (Critical): Represents a complete loss of service, a significant functionality is missing, a system that hangs indefinitely; and there is no available workaround.

> S2 (Serious): Represents a severe loss of service, significant functionality is missing; but a workaround is available.

> S3 (Non-critical): Represents a minor loss of service, inconvenient usage; but a workaround is available.

<snip>
```
[9 Oct 2019 11:50] MySQL Verification Team
This report will be reviewed by a separate body which will set a correct priority and severity.
[10 Oct 2019 9:54] Hrvoje Novosel
Mr. Tanaka may be right regarding the severity,
this example has an easy fix/workaround, but reverse the example:
Start with an AS collation, then alter to AI and you will be able to insert invalid data into the table. That table will become kinda corrupted (you won't be able to alter it any more for example because any rebuild alter will trigger a duplicate key error) unless records are manually cleaned... and now imagine this table being a part of a FK to other tables...... cleaning this up could be quite an undertaking. And also the errors aren't immediately evident, you could realize you have a problem N months later.
[10 Oct 2019 12:31] MySQL Verification Team
All your comments are forwarded to our internal bugs database.
[17 Dec 2019 19:53] Paul DuBois
Posted by developer:
 
Fixed in 8.0.20.

Altering column collations did not affect unique indexes until a
server restart.
[24 Aug 2023 3:37] linkang zhang
// I think this bug should be critical, because it can cause wrong query result and can make unique index invalid.

// You can use this testcase:

CREATE TABLE `t1` ( `c1` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL , UNIQUE KEY (`c1`)) ENGINE = InnoDB;

SHOW CREATE TABLE t1;

INSERT INTO t1 (c1) VALUES ('Đ');

INSERT INTO t1 (c1) VALUES ('D');

ALTER TABLE `t1` CHANGE `c1` `c1` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL;

SELECT * FROM t1 WHERE c1='d';

DROP TABLE t1;

// And the .result is :

CREATE TABLE `t1` ( `c1` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL , UNIQUE KEY (`c1`)) ENGINE = InnoDB;
SHOW CREATE TABLE t1;
Table	Create Table
t1	CREATE TABLE `t1` (
  `c1` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci NOT NULL,
  UNIQUE KEY `c1` (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
INSERT INTO t1 (c1) VALUES ('Đ');
INSERT INTO t1 (c1) VALUES ('D');
ALTER TABLE `t1` CHANGE `c1` `c1` VARCHAR(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL;
SELECT * FROM t1 WHERE c1='d';
c1
D
DROP TABLE t1;

// We just have one result, and we can't find 'Đ', which is critical
[24 Aug 2023 12:06] MySQL Verification Team
Hi,

This bug was fixed 4 years ago.