Bug #14716 | 'oe' does not match œ in utf8_unicode_ci | ||
---|---|---|---|
Submitted: | 7 Nov 2005 19:11 | Modified: | 6 Dec 2005 21:08 |
Reporter: | M K | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 4.1.15 | OS: | Windows (Windows XP) |
Assigned to: | Vasily Kishkin | CPU Architecture: | Any |
[7 Nov 2005 19:11]
M K
[9 Nov 2005 11:43]
Vasily Kishkin
Could you please write here table definition for example?
[9 Nov 2005 15:47]
M K
# # Warning! Do not use MySQL Command Line Client (at least in Windows) for this. Use MySQL Query Browser to correctly INSERT the œ ligature. # CREATE TABLE egg ( `col1` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT '' ) ENGINE = MYISAM; INSERT egg SET col1='oeuf'; INSERT egg SET col1='œuf'; # In short, these two queries should return the same results: SELECT * FROM egg WHERE col1='oeuf'; SELECT * FROM egg WHERE col1 LIKE 'oeuf'; # More relevant queries: # Equals (=) works fine (return both 'oeuf' and ''œuf''): SELECT * FROM egg WHERE col1='oeuf'; SELECT * FROM egg WHERE col1='œuf'; # But LIKE and RLIKE only return one of the two: SELECT * FROM egg WHERE col1 LIKE 'œuf'; SELECT * FROM egg WHERE col1 LIKE 'oeuf'; SELECT * FROM egg WHERE col1 RLIKE 'œuf'; SELECT * FROM egg WHERE col1 RLIKE 'oeuf'; SELECT * FROM egg WHERE col1 LIKE '%œ%'; SELECT * FROM egg WHERE col1 LIKE '%oe%'; # Each of the above only return one of the two lines. They should all return two lines. # LIKE 'oeuf' should return the same as = 'oeuf' # There might be a second bug - the inability of INSERT egg SET col1='œuf'; to correctly insert binary C593 # for œ MySQL Command Line Client in Windows. This maybe a Windows System bug though.
[17 Nov 2005 8:08]
Vasily Kishkin
Thanks for bug reprot. I was able to reproduce the bug on Windows. "Like" returns only one row. SELECT * FROM egg WHERE col1='oeuf'; 'oeuf' 'œuf' SELECT * FROM egg WHERE col1 LIKE 'oeuf'; 'oeuf' It was tested in MySQL Query Browser.
[21 Nov 2005 19:32]
M K
Strangely, this bug seems similar to a (fixed) bug that I had reported previously: http://bugs.mysql.com/bug.php?id=7878 Maybe a test could be created to always test if '=' and LIKE produce the same results when accented or other special charaters are involved. Thanks.
[24 Nov 2005 10:07]
Alexander Barkov
Dear Michael, This is expected behaviour, conformant to the SQL standard, which requires one-to-one character comparison in LIKE operator. Expansions is the case when LIKE works differently comparing to equals (=) operator: - During equals, "OE ligature" is compared to "O+E", - while during LIKE, "OE ligature" is compared only to "O", and then the next character after "OE ligature" is compared to "E". Bug#7878 was really a bug. Accented letters are not expansions, they are single letters. Thus, they are processed the same way LIKE and =.
[6 Dec 2005 21:08]
M K
Thank you, but after thinking about it for a while, I still have a lot of trouble with your answer. You are basically saying that it is OK for '=' and LIKE to have different set of matching rules for certain characters. I really think that people are going to be very confused with functionality like this. You are the experts, but I also feel that this should be put in the category of "won't fix" instead of "not a bug".
[6 Dec 2005 21:31]
Paul DuBois
This behavior is according to standard SQL, and it is documented here: http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html