Bug #59973 LIKE wildcard match fails in some cases
Submitted: 6 Feb 2011 8:04 Modified: 9 Apr 2013 14:34
Reporter: Apua Paquola Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Data Types Severity:S1 (Critical)
Version:5.1.41-3ubuntu12.9, 5.1.56-bzr OS:Any (2.6.32, ubuntu)
Assigned to: CPU Architecture:Any
Tags: like, wildcard

[6 Feb 2011 8:04] Apua Paquola
Description:
I found this strange behavior with the LIKE string match operator.

When you do a wildcard search for the first string in the example below, which is a substring of the big string, you get no results.

For the second string, which has just a "C" appended to the first one, you get the expected result.

How to repeat:
drop table if exists l2; 
create table l2 (seq varchar(255));
insert into l2 values('CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC');

select count(*) from l2 where seq like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACA%';
EXPECTED: 1
RETURNS: 0

select count(*) from l2 where seq like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC%';
EXPECTED: 1
RETURNS: 1
[6 Feb 2011 8:32] Apua Paquola
A simpler way to test:

mysql> select 'CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC' like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACA%' as matches;
+---------+
| matches |
+---------+
|       0 |
+---------+

mysql> select 'CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC' like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC%' as matches;
+---------+
| matches |
+---------+
|       1 |
+---------+
[6 Feb 2011 9:19] Peter Laursen
All cases work fine for me with 5.1.54/Windows

Peter
(not a MySQL person)
[6 Feb 2011 9:22] Valeriy Kravchuk
Thank you for the bug report. Verified just as described with current mysql-5.1 on Mac OS X:

macbook-pro:5.1 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.56-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table if exists l2; 
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> create table l2 (seq varchar(255));
Query OK, 0 rows affected (0.08 sec)

mysql> insert into l2
    -> values('CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC');
Query OK, 1 row affected (0.01 sec)

mysql> select count(*) from l2 where seq like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACA%';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from l2 where seq like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC%';
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)
[10 Feb 2011 19:09] Apua Paquola
I think it's client problem. I get the correct result with a 5.5 client connecting to a 5.0 server. However, I get a wrong result with a 5.1 client connecting to the same server. (See below).

Using a 5.1 client, and connecting to an open 5.0 server:

$ mysql --version
mysql  Ver 14.14 Distrib 5.1.49, for debian-linux-gnu (x86_64) using readline 6.1

$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2107870
Server version: 5.0.45 Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select 'CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC' like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACA%' as matches;
+---------+
| matches |
+---------+
|       0 |
+---------+
1 row in set (0.01 sec)

Using a 5.5 client, and connecting to an open 5.0 server:

$ /usr/local/mysql/bin/mysql --version
/usr/local/mysql/bin/mysql  Ver 14.14 Distrib 5.5.9, for Linux (x86_64) using  EditLine wrapper

$ /usr/local/mysql/bin/mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2107907
Server version: 5.0.45 Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select 'CAAAACCACTATGAGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACAC' like '%ACAGGTGCTGGAGAGGATGCGGAGAAATAGGAACA%' as matches;
+---------+
| matches |
+---------+
|       1 |
+---------+
1 row in set (0.01 sec)
[9 Apr 2013 14:34] Paul DuBois
Noted in 5.7.2 changelogs.

The Turbo Boyer-Moore algorithm used for LIKE pattern matches failed
to handle some patterns. The server now uses the original Boyer-Moore
algorithm.