| Bug #58686 | CREATE TABLE .. CHARSET=<...> without a COLLATE using case-insensitive collation | ||
|---|---|---|---|
| Submitted: | 3 Dec 2010 0:16 | Modified: | 15 Dec 2010 23:39 |
| Reporter: | Greg Kemnitz | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server | Severity: | S3 (Non-critical) |
| Version: | 5.1.43sp1-enterprise-gpl | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | case insensitive, case sensitive, charset, collate, CREATE TABLE | ||
[3 Dec 2010 0:18]
Greg Kemnitz
The "suggested fix" applies to CREATE TABLE with CHARSET defined but not COLLATE defined. Since this is ambiguous, it would also be reasonable to simply produce an error if CHARSET is defined without COLLATE - instead of producing a potentially problematic default.
[15 Dec 2010 23:39]
Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php If CHARACTER SET X was specified without COLLATE, then character set X and its default collation are used. Default collation for latin1 is latin1_swedish_ci. See also http://dev.mysql.com/doc/refman/5.1/en/charset-charsets.html and output of query show character set;

Description: If you do CREATE TABLE XX (A VARCHAR(20)) ENGINE=INNODB CHARSET=LATIN1; and the settings for (conn/server/db) collation all are latin1_general_cs, you will still get a case-insensitive collation. Note that if you create the table without an explicit character set declaration, you _will_ correctly get the latin1 character set with a case sensitive collation. Below is a log showing the behavior: mysql>show variables like '%collation%'; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | latin1_general_cs | | collation_database | latin1_general_cs | | collation_server | latin1_general_cs | +----------------------+-------------------+ 3 rows in set (0.44 sec) create table tt (a varchar(20)) charset=latin1 engine=innodb; Query OK, 0 rows affected (0.00 sec) mysql>insert into tt values ('aa'); Query OK, 1 row affected (0.00 sec) mysql>insert into tt values ('AA'); Query OK, 1 row affected (0.00 sec) mysql>select * from tt where a = 'aa'; +------+ | a | +------+ | aa | | AA | +------+ 2 rows in set (0.00 sec) mysql>drop table tt; Query OK, 0 rows affected (0.01 sec) mysql>create table tt (a varchar(20)) engine=innodb; Query OK, 0 rows affected (0.00 sec) mysql>insert into tt values ('AA'); Query OK, 1 row affected (0.00 sec) mysql>insert into tt values ('aa'); Query OK, 1 row affected (0.00 sec) mysql>select * from tt where a = 'aa'; +------+ | a | +------+ | aa | +------+ 1 row in set (0.00 sec) We were able to repair this and make our database case-sensitive by doing ALTER TABLE ... CHARSET=DEFAULT, but since are "table rebuild" ALTERs, it took several hours on our production database. How to repeat: See above log Suggested fix: The suggested fix is to have the CREATE TABLE use the default collation as long as it is compatible with the character set, or if not (ie, using UTF8), it should use a case-sensitive collation if that is the default. The default should NOT be to a case-insensitive collation; that is definitely not expected behavior.