Bug #2047 FIND_IN_SET( ) returns false positives
Submitted: 8 Dec 2003 10:08 Modified: 8 Dec 2003 10:30
Reporter: [ name withheld ] Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.0.13 OS:Linux (Linux (Redhat 7))
Assigned to: Dean Ellis CPU Architecture:Any

[8 Dec 2003 10:08] [ name withheld ]
Description:
Hi all,

Couldn't find this bug in FIND_IN_SET in the bug database or fixed bug list.  It may have been fixed in 4.1, but I didn't want to upgrade until I knew!

Running 4.0.13 on a Linux, Redhat 7.  Installed from source tarball, not rpm or binary.

I use FIND_IN_SET() to return results containing a certain number in a comma-delimmeted list in a TEXT field.  Sometimes it returns false positives, for example:

mysql> SELECT pssmID_list FROM bccs WHERE FIND_IN_SET(1487, pssmID_list);
+-------------+
| pssmID_list |
+-------------+
| 7771,148    |
| 1487        |
+-------------+
2 rows in set (0.01 sec)

How to repeat:
This apparently happens when the last element of the list is a partial match for the query string.  It is difficult to repeat, since it only happens in some of these cases.  If you use static strings in the query, rather than querying the table, it won't happen.  I suspect it's caused by FIND_IN_SET going past the end of a text string and into the next record.  

To repeat, I recomend writing a script to check for excess returns.  Create a large table of lists of numbers, in which each number should appear in only one row.  Then cound the responses to find_in_set for each number.  If you come back with more than 1, the error has occured.

The lack of repeatability is the reason I marked this serious.  For me, it caused problems in 11 searches out 6000.  But in some applications, this could be serious and hard to notice.

Suggested fix:
As a workaround, you can append a comma to the delimited list:

mysql> SELECT pssmID_list FROM bccs WHERE FIND_IN_SET('1487', CONCAT(pssmID_list, ','));
+-------------+
| pssmID_list |
+-------------+
| 1487        |
+-------------+
1 row in set (0.00 sec)

but this won't work for some people's purposes.  It's worth looking into bounds-checking FIND_IN_SET does on TEXT columns.
[8 Dec 2003 10:30] Dean Ellis
I cannot repeat this using 4.0.17.  You might try a more current release.

Otherwise, please submit a complete test case including the SQL to populate the table with the values necessary to demonstrate the behavior.  A table containing only the values you showed as being returned by FIND_IN_SET does not have the issue with 4.0.17.

Thank you.