Bug #36488 regexp returns false matches, concatenating with previous rows
Submitted: 3 May 2008 16:24 Modified: 22 May 2008 12:39
Reporter: Scott Noyes Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.38+ OS:Any
Assigned to: Gleb Shchepa CPU Architecture:Any

[3 May 2008 16:24] Scott Noyes
Description:
A REGEXP match can return incorrect rows when the previous row matched the expression and uses CONCAT() with an empty string.

The rows selected include the "Random Random" row, which does not satisfy the regular expression, as shown in the `regexpMatch` column of the select list.

Bug appears in 5.0.38 and later (including 5.1 and 6.0), but not in 5.0.20a.

How to repeat:
DROP TABLE IF EXISTS `network_crm_contacts`;
CREATE TABLE `network_crm_contacts` (
  `first_name` varchar(100) NOT NULL default '',
  `last_name` varchar(100) NOT NULL default ''
);

INSERT INTO `network_crm_contacts` VALUES 
  ('',''),
  ('','First'),
  ('Random','Random');

SELECT 
  first_name, 
  last_name, 
  CONCAT(first_name,' ',last_name),
  CONCAT(first_name,' ',last_name) REGEXP 'First.*' AS regexpMatch
FROM network_crm_contacts 
WHERE CONCAT(first_name,' ',last_name) REGEXP 'First.*';

Output:

+------------+-----------+----------------------------------+-------------+
| first_name | last_name | CONCAT(first_name,' ',last_name) | regexpMatch |
+------------+-----------+----------------------------------+-------------+
|            | First     |  First                           |           1 |
| Random     | Random    | Random Random                    |           0 |
+------------+-----------+----------------------------------+-------------+

Suggested fix:
Possible workaround is to use LIKE instead of regular expression:

WHERE CONCAT(first_name, ' ', last_name) LIKE '%First%'
[3 May 2008 16:39] Valeriy Kravchuk
Thank you for a bug report. Verified just as described with 5.0.60:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot test -P3308
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.60-enterprise-gpl-nt MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE `network_crm_contacts` (
    ->   `first_name` varchar(100) NOT NULL default '',
    ->   `last_name` varchar(100) NOT NULL default ''
    -> );
Query OK, 0 rows affected (0.28 sec)

mysql> INSERT INTO `network_crm_contacts` VALUES
    ->   ('',''),
    ->   ('','First'),
    ->   ('Random','Random');
Query OK, 3 rows affected (0.08 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT
    ->   first_name,
    ->   last_name,
    ->   CONCAT(first_name,' ',last_name),
    ->   CONCAT(first_name,' ',last_name) REGEXP 'First.*' AS regexpMatch
    -> FROM network_crm_contacts
    -> WHERE CONCAT(first_name,' ',last_name) REGEXP 'First.*';
+------------+-----------+----------------------------------+-------------+
| first_name | last_name | CONCAT(first_name,' ',last_name) | regexpMatch |
+------------+-----------+----------------------------------+-------------+
|            | First     |  First                           |           1 |
| Random     | Random    | Random Random                    |           0 |
+------------+-----------+----------------------------------+-------------+
2 rows in set (0.06 sec)

mysql> SELECT
    ->   first_name,
    ->   last_name,
    ->   CONCAT(first_name,' ',last_name),
    ->   CONCAT(first_name,' ',last_name) REGEXP 'First.*' AS regexpMatch
    -> FROM network_crm_contacts
    -> WHERE CONCAT(first_name,' ',last_name) LIKE '%First%';
+------------+-----------+----------------------------------+-------------+
| first_name | last_name | CONCAT(first_name,' ',last_name) | regexpMatch |
+------------+-----------+----------------------------------+-------------+
|            | First     |  First                           |           1 |
+------------+-----------+----------------------------------+-------------+
1 row in set (0.00 sec)
[12 May 2008 8:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46615

ChangeSet@1.2623, 2008-05-12 13:46:46+05:00, gshchepa@host.loc +3 -0
  Fixed bug #36488: regexp returns false matches, concatenating
                    with previous rows.
  
  The WHERE clause containing expression:
    CONCAT(empty_field1, empty_field2, ..., 'literal constant', ...)
      REGEXP 'regular expression'
  may return wrong matches.
  
  Optimization of the CONCAT function has been fixed.
[13 May 2008 13:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46653

ChangeSet@1.2623, 2008-05-13 18:38:52+05:00, gshchepa@host.loc +3 -0
  Fixed bug #36488: regexp returns false matches, concatenating
                    with previous rows.
  
  The WHERE clause containing expression:
    CONCAT(empty_field1, empty_field2, ..., 'literal constant', ...)
      REGEXP 'regular expression'
  may return wrong matches.
  
  Optimization of the CONCAT function has been fixed.
[13 May 2008 15:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46660

ChangeSet@1.2623, 2008-05-13 20:27:46+05:00, gshchepa@host.loc +3 -0
  Fixed bug #36488: regexp returns false matches, concatenating
                    with previous rows.
  
  The WHERE clause containing expression:
    CONCAT(empty_field1, empty_field2, ..., 'literal constant', ...)
      REGEXP 'regular expression'
  may return wrong matches.
  
  Optimization of the CONCAT function has been fixed.
[19 May 2008 8:24] Bugs System
Pushed into 5.0.64
[19 May 2008 8:25] Bugs System
Pushed into 5.1.25-rc
[20 May 2008 0:35] Paul DuBois
Noted in 5.0.64, 5.1.25 changelogs.

A REGEXP match could return incorrect rows when the previous row
matched the expression and used CONCAT() with an empty string.

Setting report to Need Doc Info pending push into 6.0.x.
[22 May 2008 9:49] Bugs System
Pushed into 6.0.6-alpha
[22 May 2008 12:39] Paul DuBois
Noted in 6.0.6 changelog.
[28 Jul 2008 13:51] Bugs System
Pushed into 5.0.62  (revid:sp1r-gshchepa/uchum@host.loc-20080516080012-23021) (pib:2)
(Retry automatic marking, to ensure nothing is missed. cm01)
[28 Jul 2008 16:56] Bugs System
Pushed into 5.1.25-rc  (revid:sp1r-pcrews@pcrews-mac-local.local-20080516105248-04313) (version source revid:sp1r-pcrews@pcrews-mac-local.local-20080516105248-04313) (pib:3)