Bug #84118 Escaping of \ and % in LIKE causes weird matches
Submitted: 8 Dec 2016 15:52 Modified: 8 Dec 2016 20:51
Reporter: Paul Clark Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.6/5.7/8.0 OS:Ubuntu (16.04)
Assigned to: CPU Architecture:Any
Tags: escaping, like, wildcards

[8 Dec 2016 15:52] Paul Clark
Description:
Using \ in LIKE to escape \ itself and the wildcard % causes odd results when matching against fields containing \ and %.

'\\%' acts as if it was '\\\%'
'\\\%' acts as if it was '\\%'
'\\\\%' acts as if it was '\\%'
'\\\\\%' acts as if it was '\\\%'

The latter two look like double-unescaping but I can find no rhyme nor reason in the first two at all :-)

How to repeat:
-- in any old database

drop table if exists foo;
create table foo
(
  id varchar(255)
);

insert into foo values('bar%');
insert into foo values('bar%1');
insert into foo values('bart');
insert into foo values('bar\\t');
insert into foo values('bar\\%');

-- Wildcard % - fetches all
select id from foo where id like 'bar%';
-- OK

-- Straight escaped % - just bar%
select id from foo where id like 'bar\%';
-- OK

-- Escaped \ then wildcard % - should fetch bar\t and bar\%
select id from foo where id like 'bar\\%';
-- ERROR: Only fetches bar\%

-- Escaped \ then escaped % - should fetch bar\%
select id from foo where id like 'bar\\\%';
-- ERROR: Fetches both bar\t and bar\% as one before should have done

-- Two escaped \ then wildcard % - shouldn't fetch anything
select id from foo where id like 'bar\\\\%';
-- ERROR: Fetches both bar\t and bar\%

-- Two escaped \ then escaped % - shouldn't fetch anything
select id from foo where id like 'bar\\\\\%';
-- ERROR: Fetches bar\%
[8 Dec 2016 20:51] MySQL Verification Team
Thank you for the bug report. Verified as described.
[19 Apr 21:52] Michal Vorisek
Here is another repro for MySQL 5.6/5.7 which is affected:
```
CREATE TABLE `user` (
  `id` INT UNSIGNED AUTO_INCREMENT NOT NULL,
  `name` VARCHAR(255),
  PRIMARY KEY(`id`)
) DEFAULT CHARACTER SET utf8mb4 COLLATE `utf8mb4_unicode_ci` ENGINE = InnoDB;

select * from `user`
where `name` like 'li\\ma' escape '';
-- 1:1 search, no escape, expecting 1 row

select * from `user`
where `name` like 'li%\\ma' escape '';
-- the same with % (matching 0 or more bytes), expecting 1 row

select * from `user`
where `name` like 'li\\ma' escape '\\';
-- backslash escapes "m", expecting 0 rows

select * from `user`
where `name` like 'li%\\ma' escape '\\';
-- the same with %, expecting 1 row

select * from `user`
where `name` like 'li\\\\ma' escape '\\';
-- backslash escapes "\", expecting 1 row

select * from `user`
where `name` like 'li%\\\\ma' escape '\\';
-- the same with %, expecting 1 row
```

online repro: https://dbfiddle.uk/TfJBmQPq

MySQL 8.0 produces valid results: https://dbfiddle.uk/bw-5gL4y

Can this issue be please fixed in MySQL 5.7?
[20 Apr 10:00] Michal Vorisek
Here is the problem isolated:

```
select
cast('Li\\ma' as CHAR CHARACTER SET utf8mb4) collate utf8mb4_unicode_ci
like
cast('li%\\\\ma' as CHAR CHARACTER SET utf8mb4) collate utf8mb4_unicode_ci
escape '\\';

select
cast('Li\\ma' as CHAR CHARACTER SET utf8mb4) collate utf8mb4_general_ci
like
cast('li%\\\\ma' as CHAR CHARACTER SET utf8mb4) collate utf8mb4_general_ci
escape '\\';
```

https://dbfiddle.uk/74Ezq22N

In both cases, the result has to be 1. However, the utf8mb4_unicode_ci collation is currently broken with LIKE on MySQL 5.x.