Bug #46659 | Using Percent and Backslashes in LIKE Queries is buggy | ||
---|---|---|---|
Submitted: | 11 Aug 2009 16:23 | Modified: | 12 Aug 2009 12:07 |
Reporter: | Christian Hammers (Silver Quality Contributor) (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Charsets | Severity: | S3 (Non-critical) |
Version: | 5.0, 5.1 | OS: | Linux (Debian 5.0 lenny, Ubuntu 9.04 amd64) |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | qc |
[11 Aug 2009 16:23]
Christian Hammers
[11 Aug 2009 16:55]
MySQL Verification Team
I couldn't repeat on Windows with source tree server: c:\dbs>c:\dbs\5.1\bin\mysql -uroot --port=3510 --prompt="mysql 5.1 >" Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 6 Server version: 5.1.38-Win X64 Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.1 >use test Database changed mysql 5.1 >CREATE TABLE t (info varchar(255) DEFAULT NULL) ENGINE=MyISAM DEFAULT -> CHARSET=latin1; Query OK, 0 rows affected (0.17 sec) mysql 5.1 >INSERT INTO t VALUES ('März'), ('C:\\Windows'), ('foo'); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql 5.1 >SELECT * FROM t WHERE info like '%\\\\%'; +------------+ | info | +------------+ | C:\Windows | +------------+ 1 row in set (0.00 sec) mysql 5.1 >SHOW VARIABLES LIKE "%char%"; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | c:\dbs\5.1\share\charsets\ | +--------------------------+----------------------------+ 8 rows in set (0.00 sec) mysql 5.1 >
[12 Aug 2009 5:17]
Sveta Smirnova
Thank you for the report. I can not repeat described behavior too. Please provide your configuration file and indicate accurate name of package you are using (file name or configure options if you built MySQL yourself).
[12 Aug 2009 8:58]
Christian Hammers
Hello I can reproduce the bug by cut&paste the commands that Miguel tried, too! My result ist still: mysql> SELECT * FROM t WHERE info like '%\\\\%'; +------------+ | info | +------------+ | März | | C:\Windows | +------------+ 2 rows in set (0.00 sec) My locale settings: $ locale LANG=de_DE@euro LC_CTYPE="de_DE.ISO-8859-15@euro" LC_NUMERIC="de_DE.ISO-8859-15@euro" LC_TIME="de_DE.ISO-8859-15@euro" LC_COLLATE="de_DE.ISO-8859-15@euro" LC_MONETARY="de_DE.ISO-8859-15@euro" LC_MESSAGES="de_DE.ISO-8859-15@euro" LC_PAPER="de_DE.ISO-8859-15@euro" LC_NAME="de_DE.ISO-8859-15@euro" LC_ADDRESS="de_DE.ISO-8859-15@euro" LC_TELEPHONE="de_DE.ISO-8859-15@euro" LC_MEASUREMENT="de_DE.ISO-8859-15@euro" LC_IDENTIFICATION="de_DE.ISO-8859-15@euro" LC_ALL=de_DE.ISO-8859-15@euro I'm using mysql-5.1.36-linux-i686-glibc2 on a Debian 5.0 "lenny" system with the following my.cnf: [client] port = 3306 socket = /tmp/mysql.sock [mysqld] port = 3306 socket = /tmp/mysql.sock skip-locking key_buffer = 256M max_allowed_packet = 1M table_cache = 256 sort_buffer_size = 1M read_buffer_size = 1M myisam_sort_buffer_size = 64M thread_cache = 8 query_cache_size = 256M thread_concurrency = 4 tmp_table_size = 128M max_heap_table_size = 64M datadir = /usr/local/mysql3306/data log-bin = /usr/local/mysql3306/log/dbpriv2-bin relay-log = /usr/local/mysql3306/log/dbpriv2-relay-bin slow_query_log_file = /usr/local/mysql3306/log/dbpriv2-slow slow_query_log long_query_time = 1 max_connections = 1024 expire_logs_days = 31 server-id = 2 replicate-ignore-db = mysql log-slave-updates skip-innodb [mysqldump] quick max_allowed_packet = 32M [mysql] [isamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [myisamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [mysqlhotcopy] interactive-timeout
[12 Aug 2009 12:07]
Susanne Ebrecht
I am able to repeat it. Test: Use an utf8 KDE/Gnome Terminal Open MySQL CLI SET NAMES UTF8; CREATE DATABASE bug46659 CHARACTER SET latin1; USE bug46659; CREATE TABLE t(info VARCHAR(255) DEFAULT NULL) ENGINE=myisam CHARSET latin1; INSERT INTO t VALUES('März'),('C:\\Windows'),('foo'),('\\\\'); SELECT * FROM t WHERE info LIKE '%\\\\%'; +------------+ | info | +------------+ | März | | C:\Windows | | \\ | +------------+ 3 rows in set (0.00 sec) Why is 'März' and 'C:\Windows' listed here. LIKE '%\\\\%' should only match when I have two backslashes in a row. And not for for single backslash. Hex code is 5c5c select HEX(info) FROM t; +----------------------+ | hex(info) | +----------------------+ | 4DE4727A | | 433A5C57696E646F7773 | | 666F6F | | 5C5C | +----------------------+ 433A5C57 == C:\W and not C:\\W ... so LIKE '%\\\\%' not should find this. Same behaviour with ISO-8859-15 and LATIN1 environment. **** Same test by using default database and table charset utf8: SELECT * FROM t WHERE info LIKE '%\\\\%'; +------------+ | info | +------------+ | C:\Windows | | \\ | +------------+ März is not found anymore but still single backslash is found. I used utf8 database/table for further tests: INSERT INTO t VALUES('\\'); SELECT * FROM t WHERE info LIKE '%\\\\%'; +------------+ | info | +------------+ | C:\Windows | | \\ | | \ | +------------+ SELECT * FROM t WHERE info LIKE '\\\\%'; +------+ | info | +------+ | \\ | | \ | +------+ SELECT * FROM t WHERE info LIKE '\\\\'; +------+ | info | +------+ | \ | +------+ SELECT * FROM t WHERE info = '\\\\'; +------+ | info | +------+ | \\ | +------+ It seems LIKE is doing double escaping here. When I use latin1 database/table again .... With content März, C:\Windows and \\ (INSERT INTO t VALUES('März'),('C:\\Windows'),('foo'),('\\\\');) then: select * from t where info like '%\\\\\\\\%'; +------+ | info | +------+ | \\ | +------+ Why do I need 8 backslashes to find double backslash?
[12 Aug 2009 12:13]
Susanne Ebrecht
Same bug in 5.0
[12 Aug 2009 12:14]
Peter Laursen
This statement is wrong "LIKE '%\\\\%' should only match when I have two backslashes in a row. And not for for single backslash." So result here should be +------------+ | info | +------------+ | C:\Windows | | \\ | +------------+ refer to: http://dev.mysql.com/doc/refman/5.1/en/string-comparison-functions.html#operator_like "Note Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against. (Exception: At the end of the pattern string, backslash can be specified as “\\”. At the end of the string, backslash stands for itself because there is nothing following to escape.)" (this was added to manual around 2 years ago on my request actually. I confused this, my colleagues did and our users did too!)
[15 Aug 2009 10:09]
Sergei Golubchik
I suppose it's intentional. According to http://www.collation-charts.org/mysql60/mysql604.latin1_swedish_ci.html in latin1_swedish_ci '\' == 'ä'
[15 Aug 2009 11:33]
Peter Laursen
"in latin1_swedish_ci '\' == 'ä'" well .. sometimes collations have surprises included! How can '\' == 'ä' in Swedish? If this is specific for Swedish collation, I think the collation is buugy. And pretty serious then as Swedish collation is latin1 default.
[19 Oct 2009 8:04]
Bernt Marius Johnsen
In Latin1 swedish 'ä' and '\' have different values. In ISO-646-SE (which is a 7-bit charset) 'ä' has the same value as in 7-bit ASCII '{'. It is 'ö' that has the same value as '\'. But then again, don't confuse latin1 (aka ISO-8859-1) with ISO-646-SE.
[24 Dec 2009 7:35]
Alexander Barkov
http://forge.mysql.com/worklog/task.php?id=5170