Description:
I created a table
CREATE TABLE chartest.jpcharcollja_test (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`charcol1` VARCHAR(100) DEFAULT NULL,
`charcol2` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_0900_AI_CI DEFAULT NULL,
`charcol3` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_BIN DEFAULT NULL,
`charcol4` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_JA_0900_AS_CS DEFAULT NULL,
`charcol5` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_JA_0900_AS_CS_KS DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=UTF8MB4
The data of charcol1, charcol2 ... charcol5 are same, but collation are different.(In fact, charcol1 as same as charcol2)。
I execute a "select" clause.
mysql> select * from chartest.jpcharcollja_test order by charcol1;
...
300000 rows in set (0.60 sec)
Then, I explicit COLLATE the defaults collation, executed the "select" clause again
mysql> select * from chartest.jpcharcollja_test order by charcol1 collate UTF8MB4_0900_AI_CI;
...
300000 rows in set (1.68 sec)
Then, I changed a new collation, executed the "select" clause again
mysql> select * from chartest.jpcharcollja_test order by charcol1 collate UTF8MB4_JA_0900_AS_CS_KS;
...
300000 rows in set (53.79 sec)
Another pattern:
mysql> select * from chartest.jpcharcollja_test order by charcol5;
...
300000 rows in set (0.77 sec)
Then, I explicit COLLATE the new collation executed the "select" clause again
mysql> select * from chartest.jpcharcollja_test order by charcol5 collate UTF8MB4_JA_0900_AS_CS_KS;
...
300000 rows in set (54.05 sec)
------------------------------------------------------------------------------------------------------------------------------------------------------
For above results, I think maybe optimizer doesn't work for the explicit COLLATE. Even the collation as same as the column's collation when it's created by defaults. Execution time is twice the default.
For pattern 2, execution time is very poor.
This should be a very common application scenario, when the application wants to display the results in a different collation, usually do not change the table definition, but in the generated SQL to specify the collation.
How to repeat:
Step1 Create a table
CREATE TABLE chartest.jpcharcollja_test (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`charcol1` VARCHAR(100) DEFAULT NULL,
`charcol2` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_0900_AI_CI DEFAULT NULL,
`charcol3` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_BIN DEFAULT NULL,
`charcol4` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_JA_0900_AS_CS DEFAULT NULL,
`charcol5` VARCHAR(100)CHARACTER SET UTF8MB4 COLLATE UTF8MB4_JA_0900_AS_CS_KS DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=UTF8MB4
Step2
insert data
Data are same for every column.
sample data:
# id, charcol1, charcol2, charcol3, charcol4, charcol5
10737, きたわら かずのり, きたわら かずのり, きたわら かずのり, きたわら かずのり, きたわら かずのり
163545, つのおか のぶや, つのおか のぶや, つのおか のぶや, つのおか のぶや, つのおか のぶや
264840, 井手, 井手, 井手, 井手, 井手
167580, もりかわ ひでや, もりかわ ひでや, もりかわ ひでや, もりかわ ひでや, もりかわ ひでや
193661, マスタ トシヒロ, マスタ トシヒロ, マスタ トシヒロ, マスタ トシヒロ, マスタ トシヒロ
48624, 増江 雄仁, 増江 雄仁, 増江 雄仁, 増江 雄仁, 増江 雄仁
182350, すえもり ひろただ, すえもり ひろただ, すえもり ひろただ, すえもり ひろただ, すえもり ひろただ
177443, 北里 隆嗣, 北里 隆嗣, 北里 隆嗣, 北里 隆嗣, 北里 隆嗣
149921, キタウラ ケンスケ, キタウラ ケンスケ, キタウラ ケンスケ, キタウラ ケンスケ, キタウラ ケンスケ
160313, ハコイ カツマサ, ハコイ カツマサ, ハコイ カツマサ, ハコイ カツマサ, ハコイ カツマサ
Step3 execute below SQL then confirm the execution time
select * from chartest.jpcharcollja_test order by charcol1;
select * from chartest.jpcharcollja_test order by charcol1 collate UTF8MB4_0900_AI_CI;
select * from chartest.jpcharcollja_test order by charcol1 collate UTF8MB4_JA_0900_AS_CS_KS;
select * from chartest.jpcharcollja_test order by charcol5;
select * from chartest.jpcharcollja_test order by charcol5 collate UTF8MB4_JA_0900_AS_CS_KS;
Suggested fix:
I think optimizer doesn't work for the explicit COLLATE.