Bug #32800 Umlaut in WHERE filters incorrectly
Submitted: 28 Nov 2007 7:43 Modified: 28 Nov 2007 12:20
Reporter: Martin Kirchner Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S2 (Serious)
Version:5.0.45 and 5.0.32 OS:Any
Assigned to: CPU Architecture:Any

[28 Nov 2007 7:43] Martin Kirchner
Description:
SELECT * FROM umlauttest u WHERE value LIKE 'Ä%';

returns all records with value starting with A and Ä (A umlaut).

How to repeat:
USE test;
DROP TABLE IF EXISTS `test`.`umlauttest`;
CREATE TABLE  `test`.`umlauttest` (
  `value` varchar(45) NOT NULL,
  PRIMARY KEY  (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO UmlautTest VALUES ('Ähren'); -- Ä - Umlaut
INSERT INTO UmlautTest VALUES ('Äpfel'); -- Ä - Umlaut
INSERT INTO UmlautTest VALUES ('Ahorn'); -- just A
INSERT INTO UmlautTest VALUES ('Birne');

SELECT * FROM umlauttest u WHERE value LIKE 'Ä%';

Suggested fix:
Handle A and Ä as different letters.
[28 Nov 2007 8:42] Sveta Smirnova
Thank you for the report.

Please provide output of SHOW VARIABLES LIKE 'collation%'
[28 Nov 2007 12:13] Martin Kirchner
'collation_connection', 'utf8_general_ci'
'collation_database', 'utf8_general_ci'
'collation_server', 'utf8_general_ci'
[28 Nov 2007 12:20] Sveta Smirnova
Thank you for the feedback.

But this is expected behaviour. See also http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html about utf8_general_ci collation.