Description:
Server version:
mysql [localhost] {msandbox} (coll_test) > \s
--------------
/work/binaries/mysql-server/5.5.30/bin/mysql Ver 14.14 Distrib 5.5.30, for linux2.6 (x86_64) using readline 5.1
Connection id: 1
Current database: coll_test
Current user: msandbox@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 5.5.30 MySQL Community Server (GPL)
Protocol version: 10
Connection: Localhost via UNIX socket
Server characterset: latin1
Db characterset: utf8
Client characterset: latin1
Conn. characterset: latin1
UNIX socket: /tmp/mysql_sandbox5530.sock
Uptime: 20 min 42 sec
Threads: 1 Questions: 72 Slow queries: 0 Opens: 40 Flush tables: 1 Open tables: 28 Queries per second avg: 0.057
When comparing datetime values to strings, the utf8_unicode_ci collation prevents correct comparisons.
Suppose we have the following table:
mysql [localhost] {msandbox} (coll_test) > show create table lang\G
*************************** 1. row ***************************
Table: lang
Create Table: CREATE TABLE `lang` (
`dt` datetime NOT NULL,
PRIMARY KEY (`dt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.00 sec)
And the table contains the following data:
mysql [localhost] {msandbox} (coll_test) > select * from lang;
+---------------------+
| dt |
+---------------------+
| 1979-12-09 00:00:00 |
| 1980-11-09 00:00:00 |
| 2012-12-11 12:15:00 |
+---------------------+
3 rows in set (0.00 sec)
Then the results of comparison between the column dt and a date string depends on the collation of the string.
When the collation is latin1_swedish_ci, the comparison is correctly performed:
mysql [localhost] {msandbox} (coll_test) > select dt, charset(dt), collation(dt) from lang where dt = convert('1979-12-09' using latin1) collate latin1_swedish_ci;
+---------------------+-------------+---------------+
| dt | charset(dt) | collation(dt) |
+---------------------+-------------+---------------+
| 1979-12-09 00:00:00 | binary | binary |
+---------------------+-------------+---------------+
1 row in set (0.00 sec)
When the collation is utf8_general_ci, the comparison is also correctly performed:
mysql [localhost] {msandbox} (coll_test) > select dt, charset(dt), collation(dt) from lang where dt = convert('1979-12-09' using utf8) collate utf8_general_ci;
+---------------------+-------------+---------------+
| dt | charset(dt) | collation(dt) |
+---------------------+-------------+---------------+
| 1979-12-09 00:00:00 | binary | binary |
+---------------------+-------------+---------------+
1 row in set (0.00 sec)
But when the collation is utf8_unicode_ci, then the comparison is incorrectly performed:
mysql [localhost] {msandbox} (coll_test) > select dt, charset(dt), collation(dt) from lang where dt = convert('1979-12-09' using utf8) collate utf8_unicode_ci;
Empty set (0.00 sec)
The comparison results should be consistent and should not change as above depending on the collation used.
How to repeat:
create database coll_test default character set = utf8 default collate = utf8_unicode_ci;
use coll_test;
create table lang(dt datetime not null primary key) engine=myisam;
insert into lang values('1980-11-09 00:00:00'), ('1979-12-09 00:00:00'), ('2012-12-11 12:15:00');
set names utf8 collate utf8_general_ci;
select * from lang where dt='1979-12-09'; -- returns 1 row correctly
set names utf8 collate utf8_unicode_ci;
select * from lang where dt='1979-12-09'; -- returns 0 rows
set names latin1 collate latin1_swedish_ci;
select * from lang where dt='1979-12-09'; -- returns 1 row correctly