Bug #39808 LIKE .. ESCAPE .. not behaving as expected
Submitted: 2 Oct 2008 12:49 Modified: 2 Oct 2008 19:57
Reporter: Jaka Jančar Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Charsets Severity:S3 (Non-critical)
Version:5.0.67, 4.1, 5.0, 5.1, 6.0 bzr OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[2 Oct 2008 12:49] Jaka Jančar
Description:
With LIKE patterns you can use % and _ wildcard characters. If you want to match those literally, you can escape them using an escape character.

If you want to look for the escape character itself, I assume you must also escape it using itself (though this isn't explicitly mentioned in the manual).

Using the escape character just doesn't seem to work as expected to me. I'll try to describe the issue better using examples.

How to repeat:
mysql> SELECT 'X' LIKE 'X%' ESCAPE 'X';
+--------------------------+
| 'X' LIKE 'X%' ESCAPE 'X' |
+--------------------------+
|                        0 | 
+--------------------------+
1 row in set (0.00 sec)

My interpretation of expression: 'X' is a literal percent sign.
Expected 0, got 0. OK.

mysql> SELECT 'X' LIKE '%XX%' ESCAPE 'X';
+----------------------------+
| 'X' LIKE '%XX%' ESCAPE 'X' |
+----------------------------+
|                          1 | 
+----------------------------+
1 row in set (0.00 sec)

My interpretation of the expression: 'X' contains the letter X.
Expected 1, got 1. OK

Now I try to select from a table which has some rows with name field containing the letter X.

Without custom escape clause:
mysql> SELECT name FROM companies WHERE name LIKE '%X%';
...
3 rows in set (0.00 sec)

Works as expected, 3 rows exist which contain letter X in the name.

With custom escape clause:
mysql> SELECT name FROM companies WHERE name LIKE '%XX%' ESCAPE 'X';
Empty set (0.00 sec)

Expected: Same as above, 3 rows.
Got: No rows!

Oddly enough, using 3 X characters works:
mysql> SELECT name FROM companies WHERE name LIKE '%XXX%' ESCAPE 'X';
...
3 rows in set (0.00 sec)

Using 4 'X' characters doesn't work.
[2 Oct 2008 17:40] Sveta Smirnova
Thank you for the report.

Please provide output of SELECT name FROM companies WHERE name LIKE '%XXX%' ESCAPE 'X';
[2 Oct 2008 18:56] Jaka Jančar
After some more research, it seems to be a charset/collation problem. If I use utf8_general_ci instead of utf8_unicode_ci it works as expected. Here's a complete test:

# utf_unicode_ci
################
mysql> SET NAMES utf8 COLLATE utf8_unicode_ci;
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> DROP TABLE likeTest;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE likeTest (
    ->     string VARCHAR(255)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Query OK, 0 rows affected (0.02 sec)

mysql> 
mysql> INSERT INTO likeTest VALUES ('I\'m a string with an X!');
Query OK, 1 row affected (0.01 sec)

mysql> 
mysql> SELECT string FROM likeTest WHERE string LIKE '%XX%' ESCAPE 'X';
Empty set (0.00 sec)

mysql> SELECT string FROM likeTest WHERE string LIKE '%XXX%' ESCAPE 'X';
+-------------------------+
| string                  |
+-------------------------+
| I'm a string with an X! | 
+-------------------------+
1 row in set (0.00 sec)

# utf_general_ci
################
mysql> SET NAMES utf8 COLLATE utf8_general_ci;
Query OK, 0 rows affected (0.00 sec)

mysql> 
mysql> DROP TABLE likeTest;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE likeTest (
    ->     string VARCHAR(255)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> INSERT INTO likeTest VALUES ('I\'m a string with an X!');
Query OK, 1 row affected (0.00 sec)

mysql> 
mysql> SELECT string FROM likeTest WHERE string LIKE '%XX%' ESCAPE 'X';
+-------------------------+
| string                  |
+-------------------------+
| I'm a string with an X! | 
+-------------------------+
1 row in set (0.00 sec)

mysql> SELECT string FROM likeTest WHERE string LIKE '%XXX%' ESCAPE 'X';
Empty set (0.00 sec)
[2 Oct 2008 19:57] Sveta Smirnova
Thank you for the report.

Verified as described.
[24 Nov 2008 14:04] Alexander Barkov
utf8_general_ci works fine.
Looks like a bug in utf8_unicode_ci implementation.

mysql> SELECT _utf8'X' collate utf8_general_ci LIKE '%XX%' ESCAPE 'X';
+---------------------------------------------------------+
| _utf8'X' collate utf8_general_ci LIKE '%XX%' ESCAPE 'X' |
+---------------------------------------------------------+
|                                                       1 |
+---------------------------------------------------------+
1 row in set (0.00 sec)
-- Expected result

mysql> SELECT _utf8'X' collate utf8_unicode_ci LIKE '%XX%' ESCAPE 'X';
+---------------------------------------------------------+
| _utf8'X' collate utf8_unicode_ci LIKE '%XX%' ESCAPE 'X' |
+---------------------------------------------------------+
|                                                       0 |
+---------------------------------------------------------+
1 row in set (0.00 sec)
-- Wrong result
[13 Jul 2010 12:50] Damien Tournoud
This is definitely still an issue:

SELECT '|' COLLATE utf8_unicode_ci LIKE '|' ESCAPE '|';
=> 0 (correct, the pattern is invalid)

SELECT '|' COLLATE utf8_unicode_ci LIKE '||' ESCAPE '|';
=> 1 (correct)

SELECT '|' COLLATE utf8_unicode_ci LIKE '|||' ESCAPE '|';
=> 0 (correct, the pattern is invalid)

SELECT '|' COLLATE utf8_unicode_ci LIKE '||%' ESCAPE '|';
=> 1 (correct)

SELECT '|' COLLATE utf8_unicode_ci LIKE '%||' ESCAPE '|';
=> 0 (incorrect)

SELECT '|' COLLATE utf8_unicode_ci LIKE '%||%' ESCAPE '|';
=> 0 (incorrect)

SELECT '|' COLLATE utf8_unicode_ci LIKE '%|||%' ESCAPE '|';
=> 1 (incorrect)

(on MySQL 5.1.47)
[1 Oct 2020 16:43] Ruslan Stelmachenko
Still an issue in 2020.

SHOW VARIABLES LIKE  'ver%';

Variable_name          |Value                       
-----------------------|----------------------------
version                |5.7.28                      
version_comment        |MySQL Community Server (GPL)
version_compile_machine|x86_64                      
version_compile_os     |Linux                       

SHOW VARIABLES LIKE  'char%';

Variable_name           |Value                     
------------------------|--------------------------
character_set_client    |utf8mb4                   
character_set_connection|utf8mb4                   
character_set_database  |utf8mb4                   
character_set_filesystem|binary                    
character_set_results   |                          
character_set_server    |utf8mb4                   
character_set_system    |utf8                      
character_sets_dir      |/usr/share/mysql/charsets/

SELECT '|' COLLATE utf8mb4_unicode_520_ci LIKE '%||' ESCAPE '|'; -- should return 1, returns 0

SELECT '|' COLLATE utf8mb4_unicode_520_ci LIKE '%||%' ESCAPE '|'; -- should return 1, returns 0

SELECT '|' COLLATE utf8mb4_unicode_520_ci LIKE '%|||%' ESCAPE '|'; -- should return 0, returns 1