Bug #16699 inconsistent sql escape sequence behaviour in where conditions
Submitted: 21 Jan 2006 13:04 Modified: 1 May 2006 16:42
Reporter: Jan S. S.R Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.0.21-BK, 4.1.18-BK OS:multiple
Assigned to: Paul DuBois CPU Architecture:Any

[21 Jan 2006 13:04] Jan S. S.R
Description:
varchar values ending in or containing escape characters are not consistently handled when searched for with = or like.
Both of the strange forum nicknames used in "how to repeat" should be returned with only one of the search strings each for = and like, but are sometimes returned for two of them, and not always for the one expected.

How to repeat:
tested versions:

MySQL Server:
Versions 4.1.12-max/linux and 5.0.18/nt

Clients:
php extension 20020429 mysql client version  3.23.49 for linux
command line clients for linux and nt

default character set used in all cases

create table pmcuecontacts(nick varchar(25));
insert into pmcuecontacts(nick)values('<test name/> \\');
select * from pmcuecontacts where nick like '<test name/> \\\\\\';
expected result: empty.
actual result: empty.
select * from pmcuecontacts where nick like '<test name/> \\\\';
expected result: empty.
actual result:: <test name/> \
select * from pmcuecontacts where nick like '<test name/> \\';
expected result: <test name/> \
actual result: <test name/> \
select * from pmcuecontacts where nick ='<test name/> \\\\';
expected result: empty.
actual result: empty.
select * from pmcuecontacts where nick ='<test name/> \\';
expected result: <test name/> \
actual result: <test name/> \
delete from pmcuecontacts;
insert into pmcuecontacts(nick)values('<test name/> \\[]');
select * from pmcuecontacts where nick like '<test name/> \\\\\\\\[]';
expected result: empty.
actual result: empty.
select * from pmcuecontacts where nick like '<test name/> \\\\\\[]';
expected result: empty.
actual result: <test name/> \[]
select * from pmcuecontacts where nick like '<test name/> \\\\[]';
expected result: empty.
actual result: <test name/> \[]
select * from pmcuecontacts where nick like '<test name/> \\[]';
expected result: <test name/> \[]
actual result: empty.
[21 Jan 2006 16:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/1439
[24 Jan 2006 9:34] Valeriy Kravchuk
Thank you for a problem report. Please, read the manual (http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html):

"Note: Because MySQL uses C escape syntax in strings (for example, ‘\n’ to represent a newline character), you must double any ‘\’ that you use in LIKE strings. For example, to search for ‘\n’, specify it as ‘\\n’. To search for ‘\’, specify it as ‘\\\\’; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against."

According to this, there is no bugs in you examples. In 4.1.18-BK I've got:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.1.18

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table pmcuecontacts(nick varchar(25));
inQuery OK, 0 rows affected (0.01 sec)

mysql> insert into pmcuecontacts(nick)values('<test name/> \\');
Query OK, 1 row affected (0.00 sec)

mysql> select * from pmcuecontacts;
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.00 sec)

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\\\';
Empty set (0.00 sec)

This is OK.

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\';
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.01 sec)

This is OK, according to the manual.

mysql> select * from pmcuecontacts where nick like '<test name/> \\';
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.00 sec)

This is also not a bug. Look:

mysql> insert into pmcuecontacts values (' |');
Query OK, 1 row affected (0.00 sec)

mysql> select * from pmcuecontacts where nick like '%||%' escape '|';
+------+
| nick |
+------+
|  |   |
+------+
1 row in set (0.00 sec)

mysql> select * from pmcuecontacts where nick like '% |' escape '|';
+------+
| nick |
+------+
|  |   |
+------+
1 row in set (0.00 sec)

mysql> select * from pmcuecontacts where nick ='<test name/> \\\\';
Empty set (0.00 sec)

So, as '\\' was first processed by parser, it is \ for like, and as was demonstrated with other escape character, escape character at the end of value just stands for themself, as they have nothing to escape.

Let's continue:

mysql> select * from pmcuecontacts where nick ='<test name/> \\';
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.00 sec)

This is OK.

mysql> delete from pmcuecontacts;
Query OK, 1 row affected (0.00 sec)

imysql> insert into pmcuecontacts(nick)values('<test name/> \\[]');
Query OK, 1 row affected (0.00 sec)

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\\\\\[]';
Empty set (0.01 sec)

This is OK.

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\\\[]';
+------------------+
| nick             |
+------------------+
| <test name/> \[] |
+------------------+
1 row in set (0.00 sec)

This is also OK, because '\a' is the same as 'a' for LIKE.

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\[]';
+------------------+
| nick             |
+------------------+
| <test name/> \[] |
+------------------+
1 row in set (0.00 sec)

This is OK.

mysql> select * from pmcuecontacts where nick like '<test name/> \\[]';
Empty set (0.00 sec)

And this is OK, too.

mysql> select version();
+-----------+
| version() |
+-----------+
| 4.1.18    |
+-----------+
1 row in set (0.00 sec)

The only thing I can suggest is to document that specific behaviour of escape character at the end of the value, where it stands for itself.
[26 Jan 2006 21:10] Jan S. S.R
OK, I see my misinterpretation by being used to strings being interpreted only once.
Nevertheless the extra rule for \ being at the very of a string end seems a bit inconsistent and confusing to me.
BTW, I would have another 'silly' \-example, here reduced to the max, for which I did not see an explanation in the doc yet, I don't know if it is related:

mysql> select '\\' like '\\';
+----------------+
| '\\' like '\\' |
+----------------+
|              1 | <- example from above
+----------------+

mysql> select '\\' like '\\\\';
+----------------+
| '\\' like '\\' |
+----------------+
|              1 | <- meanwhile explained example from above
+----------------+

mysql> select char(92) like '\\\\';
+----------------+
| '\\' like '\\' |
+----------------+
|              1 | <- this, I understand completely, too
+----------------+

mysql> select char(228) like '\\\\';
+-----------------------+
| char(228) like '\\\\' |
+-----------------------+
|                     1 |           <<--------------------- ä found searching for \ ???
+-----------------------+

mysql> select char(252) like '\\\\';
+-----------------------+
| char(252) like '\\\\' |    <<--------------------- ü not found searching for \
+-----------------------+
|                     0 |
+-----------------------+

mysql> select char(227) like '\\\\';
+-----------------------+
| char(227) like '\\\\' |
+-----------------------+
|                     0 |
+-----------------------+

mysql> select char(229) like '\\\\';
+-----------------------+
| char(229) like '\\\\' |
+-----------------------+
|                     0 |
+-----------------------+

char(228) is the (non-capital) german umlaut ä
You probably can explain why is this found when searching for a \, but the other umlauts (ö,  ü) do not exhibit this behaviour? ... because I might have many of these umlauts in my database (and therefor, possible false positive hits in searches for \ ). Is there something more special to ä than to the other umlauts?

btw, this example was done with iso-latin-1 in:
mysql> select version();
+------------+
| version()  |
+------------+
| 4.1.12-max |
+------------+
under linux
[26 Jan 2006 21:25] Jan S. S.R
mysql> select char(196) like '\\\\';
+-----------------------+
| char(196) like '\\\\' |
+-----------------------+
|                     1 | <----- forgot to test before...
 same hit for 'capital Ä' (almost expected since like is case-insensitive)
[17 Apr 2006 16:59] Paul DuBois
Re-assigning back to Valeriy for re-verification after
further discussion by email.
[17 Apr 2006 17:16] Valeriy Kravchuk
According to internal email discussion, the following is not a documentation request, but possible bug:

mysql> create table pmcuecontacts(nick varchar(25));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into pmcuecontacts(nick)values('<test name/> \\');
Query OK, 1 row affected (0.01 sec)

mysql> select * from pmcuecontacts;
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.00 sec)

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\\\';
Empty set (0.01 sec)

mysql> select * from pmcuecontacts where nick like '<test name/> \\\\';
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.00 sec)

mysql> select * from pmcuecontacts where nick like '<test name/> \\';
+----------------+
| nick           |
+----------------+
| <test name/> \ |
+----------------+
1 row in set (0.01 sec)

The last two results are inconsistent. That two selects should not give same results.

To put it even simpler:

select '\\' like '\\';
select '\\' like '\\\\';

The fact that two statements above both give 1 as a results is, likely, a bug.
[18 Apr 2006 13:23] Jan S. S.R
This bug is now concentrating on the searched strings ending or containing an escape (which was the opening issue) and I thank you for caring.
Is the "a-umlaut=char(228,196) in searched string compared to escape sequences" special treatment shown above on Jan 26 related to this or should one file a separate bug for that effect?
[19 Apr 2006 6:50] Valeriy Kravchuk
> Is the "a-umlaut=char(228,196) in searched string compared to escape
> sequences" special treatment shown above on Jan 26 related to this or 
> should one file a separate bug for that effect?

I think, a separate bug report should be added for that. With a separate test case checked on 4.1.18 and/or 5.0.20 (latest GA versions). Please, add the SHOW VARIABLES LIKE 'char%' and SHOW VARIABLES LIKE 'colla%' results to that report.
[21 Apr 2006 14:19] Georgi Kodinov
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

The fact that :
select '\\' like '\\';
and
select '\\' like '\\\\';
both work and return 1 is a MySQL specific extension. 
This is due to the fact that in the LIKE pattern an escape symbol at the end of the pattern not followed by another symbol is not treated as an error.
[1 May 2006 16:42] Paul DuBois
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).

Additional info:

I have documented the exception for \-escaping
at the end of the LIKE string.

Regarding the incorrect behavior for certain characters,
I have filed a separate report (Bug#19457).
[11 May 2006 19:46] Paul DuBois
Bug#19457 has been closed.