Bug #8748 LIKE operator dishonours default collation in certain contexts
Submitted: 23 Feb 2005 20:42 Modified: 24 Feb 2005 10:12
Reporter: Axel Schwenke Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1.7 OS:Linux (Linux)
Assigned to: Alexander Barkov CPU Architecture:Any

[23 Feb 2005 20:42] Axel Schwenke
Description:
The LIKE operator sometimes does not honour the default collation defined by table and/or connection. However, LIKE and = operators behave as expected when used outside a table context. Example: 'SELECT "Ü" LIKE "Y"' honours the collation set in @@collation_connection. 'SELECT ... FROM ... WHERE ... LIKE "Ü"'  does not. See 'how to repeat' for more details.

How to repeat:
~ $mysql test
Your MySQL connection id is 2 to server version: 4.1.7-log

mysql> set @@collation_connection=latin1_german1_ci;

mysql> show variables like "collation%";
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_german1_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+

mysql> show create table foo\G
       Table: foo
Create Table: CREATE TABLE `foo` (
  `string` char(20) character set latin1 default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci

mysql> select * from foo where string like "Ü%";
+---------+
| string  |
+---------+
| Übel    |
| Yttrium |
+---------+

--> should not yield 'Yttrium'

The query works if a collation is specified:

mysql> select * from foo where string like "Ü%" collate latin1_german1_ci;
+--------+
| string |
+--------+
| Übel   |
+--------+

The query works again, if LIKE is used without(?) a context:

mysql> select "Ü" LIKE "Y";
+--------------+
| "Ü" LIKE "Y" |
+--------------+
|            0 |
+--------------+

Even more strange:

mysql> alter database test collate latin1_german1_ci;
mysql> set @@collation_server=latin1_german1_ci;
mysql> show variables like "collation%";
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_german1_ci |
| collation_database   | latin1_german1_ci |
| collation_server     | latin1_german1_ci |
+----------------------+-------------------+

mysql> select * from foo where string like "Ü%";
+---------+
| string  |
+---------+
| Übel    |
| Yttrium |
+---------+

but still:

mysql> select "Ü%" LIKE "Y%";
+----------------+
| "Ü%" LIKE "Y%" |
+----------------+
|              0 |
+----------------+

Suggested fix:
As a workaround one can specify the collation to be used in all queries. However this renders the default collation setting for server/database/table/column/connection useless.
[23 Feb 2005 21:54] Jorge del Conde
Verified in 4.1.11 - bk
[23 Feb 2005 22:15] Alexander Barkov
Axel, this is not a bug.
Your column was created with latin1_swedish_ci collation, which
is default for latin1. You can check it with "SHOW FULL FIELDS FROM foo".
This is the reason why your column is acting according to Swedish rules.
[24 Feb 2005 10:12] Hartmut Holzgraefe
See also Example #2 on

 http://dev.mysql.com/doc/mysql/en/charset-examples.html