Bug #14585 cannot find two literal backslashes using LIKE predicate
Submitted: 2 Nov 2005 18:46 Modified: 3 Nov 2005 11:29
Reporter: Raimund Jacob Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1.9-4.1.16-BK OS:Linux (Linux)
Assigned to: CPU Architecture:Any

[2 Nov 2005 18:46] Raimund Jacob
Description:
See the SQL fragment. The second select should return the other row.

How to repeat:
/* executed on debian mysql 4.1.9 connected via JDBC driver 3.0.16 */
drop table if exists foo;
create table foo (i int, str varchar(42));

insert into foo (i, str) values (1, '\\');
insert into foo (i, str) values (2, '\\\\');

/* ok, matches row 1 */
select * from foo where str like '\\';

/* wrong, matches row 1 again, should match row 2 */
select * from foo where str like '\\\\';

Suggested fix:
handle backslash-escapes in string literals correctly :)
[2 Nov 2005 19:00] C.J. Adams-Collier
10:57 < Raimi> it works on mssql, ora and maxdb like tis
10:57 < Raimi> had pretty hard time making it work everywhere
10:58 < Raimi> (like this == proper quoting on each db)
10:58 < Raimi> i played around a little - didnt find a different working
               statement
[3 Nov 2005 8:30] Valeriy Kravchuk
Thank you for a problem report. This '\' problem is documented, in fact. Read the manual carefully (http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html):

"Note: Because MySQL uses C escape syntax in strings (for example, ‘\n’ to represent a newline character), you must double any ‘\’ that you use in LIKE strings. For example, to search for ‘\n’, specify it as ‘\\n’. To search for ‘\’, specify it as ‘\\\\’; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against."

The real solution is to use ESCAPE to set different escape character:

mysql> drop table if exists foo;
Query OK, 0 rows affected, 1 warning (0,00 sec)

mysql> create table foo (i int, str varchar(42));
Query OK, 0 rows affected (0,07 sec)

mysql> insert into foo (i, str) values (1, '\\');
Query OK, 1 row affected (0,00 sec)

mysql> insert into foo (i, str) values (2, '\\\\');
Query OK, 1 row affected (0,00 sec)

mysql> select * from foo;
+------+------+
| i    | str  |
+------+------+
|    1 | \    |
|    2 | \\   |
+------+------+
2 rows in set (0,00 sec)

mysql> select * from foo where str like '\\';
+------+------+
| i    | str  |
+------+------+
|    1 | \    |
+------+------+
1 row in set (0,02 sec)

mysql> select * from foo where str like '\\\\';
+------+------+
| i    | str  |
+------+------+
|    1 | \    |
+------+------+
1 row in set (0,00 sec)

So, yes, the problem you described exitsts... Here is solution (for MySQL only)

mysql> select * from foo where str like '|\\' escape '|';
+------+------+
| i    | str  |
+------+------+
|    1 | \    |
+------+------+
1 row in set (0,00 sec)

mysql> select * from foo where str like '|\\\\' escape '|';
+------+------+
| i    | str  |
+------+------+
|    2 | \\   |
+------+------+
1 row in set (0,00 sec)

I am still not sure you'll be able to get the same results with exactly this SQL in any other database, though...

Is the workaround I proposed OK for you?
[3 Nov 2005 10:03] Raimund Jacob
Thanks for your comments. Well, using a different ESCAPE character is no solution for me. I found this bug while systematic testing my code which dynamically creates SQL statements (for MySQL, MaxDB, ora  and mssql) and the code uses only one escape character, properly quoting it when querying for it. As mentioned above: There is always a way to specify it correctly.

Now, After reading the note in the docs it is clear that:

 select * from foo where str like '\\\\';

is supposed to match row 1. After the same rationale:

 select * from foo where str like '\\\\\\\\';

is supposed to match row 2 and in fact it does.

However, there is still a bug: The initial statement

 select * from foo where str like '\\';

should match nothing since it is actually an illegal literal. This expression matching row 1 made me believe i knew how escaping worked.

IMHO, the double-stripping of backslashes is a bug. I can understand that it follows the implementation, but it is not what you would expect and what you are used to from other programs/dbs.
[3 Nov 2005 10:15] Valeriy Kravchuk
Changing a \ behaviour to be the same as in other RDBMSes is, at most, a feature request...

But that handling of \\ (see the example below):

mysql> select * from foo where str like '\\';
+------+------+
| i    | str  |
+------+------+
|    1 | \    |
+------+------+
1 row in set (0,02 sec)

is either a bug (it contradicts our documentation I had quoted - '\\' should be equivalent to ''), or a request to document it properly (if this weird behaviour is intended).

Verified on 4.1.16-BK (ChangeSet@1.2460, 2005-11-01 13:00:02+02:00, monty@mysql.com) on Linux Fedora Core 1.
[3 Nov 2005 11:29] Sergei Golubchik
Thank you for your bug report. This issue has already been fixed
in the latest released version of that product, which you can download at 
http://www.mysql.com/downloads/

Additional info:

This feature request is already implemented - search the manual for the sql-mode NO_BACKSLAHS_ESCAPES