Bug #70767 | LOCATE() and INSTR() claim to be multibyte aware, but are not | ||
---|---|---|---|
Submitted: | 29 Oct 2013 19:09 | Modified: | 19 May 2014 23:29 |
Reporter: | Arthur O'Dwyer | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S2 (Serious) |
Version: | 5.5.31, 5.5.34 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[29 Oct 2013 19:09]
Arthur O'Dwyer
[30 Oct 2013 8:27]
MySQL Verification Team
Hello Arthur, Thank you for the bug report. Verified as described. Thanks, Umesh
[6 Jan 2014 11:09]
Matthijs Langenberg
Here is another test case for this bug: http://sqlfiddle.com/#!2/9bf21/6 It is also listed as a question on Stack Overflow: http://stackoverflow.com/questions/20923186
[19 May 2014 23:29]
Arthur O'Dwyer
http://bugs.mysql.com/bug.php?id=72685 is related but not identical. (Bug 70767 is about what consitutes a "match"; bug 72685 is about what integer value is returned after a match is found.)
[16 Jun 2015 21:36]
Rick James
Furthermore, "latin1" seems to be hard-coded: mysql> select instr('abc sakā def', 'saka') collate utf8_unicode_ci; ERROR 1253 (42000): COLLATION 'utf8_unicode_ci' is not valid for CHARACTER SET 'latin1' That error occurs in spite of mysql> show variables like 'char%'; +--------------------------+-----------------------------------------------+ | Variable_name | Value | +--------------------------+-----------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | ... +--------------------------+-----------------------------------------------+ and everything else I can think of being utf8.
[22 Feb 2018 18:52]
Mark Guinness
This issue still occurs when using MySQL 5.6 as demonstrated by the following SQL Fiddle. http://www.sqlfiddle.com/#!9/b4fbf/2 Also tried MySQL 8.0.4 with same results but MariaDB gave different results. Regardless the documentation below should be updated to explain how diacritical marks are handled to avoid any confusion. https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_instr
[14 Feb 2020 21:55]
Mark Guinness
This issue still occurs when using MySQL 8.0 and was raised over 6 years ago. If you aren't going to fix then at least the documentation below should be updated to explain collations aren't respected to avoid any confusion. https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_instr