Bug #42741 Boolean fulltext query containing a single quote fails after update to 5.1.31
Submitted: 10 Feb 2009 21:36 Modified: 17 Feb 2009 6:27
Reporter: Alexander Dewilde Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: FULLTEXT search Severity:S1 (Critical)
Version:5.1.31 OS:Mac OS X (10.4.11 x86 64bit package)
Assigned to: CPU Architecture:Any

[10 Feb 2009 21:36] Alexander Dewilde
Description:
After updating to 5.1.31 fulltext queries containing a single quote (') no loger work.
Did a full rebuild of all tables. (everything works with utf-8 charset)

How to repeat:
select * from table where match (column) against ("+earth's* +climate*" in boolean mode);

The single quote (') followed by (s) is not honoured and when adding more arguments (climate) the server is being stalled! When operating on a 2 million row database, he processlog shows 'sorting result' forever.
The query refuses to be killed; it is listed as 'killed' but remails active untill mysqld gets brutally killed (-9).
[11 Feb 2009 9:01] Valeriy Kravchuk
Thank you for the problem report. Please, aent/upload mysqldump of the smallest table that demonstrates the problem described.
[11 Feb 2009 10:59] Alexander Dewilde
Same results on a dev box: x86 64bit OS X 10.5.5 (5.1.31-community)
[11 Feb 2009 11:06] Alexander Dewilde
The issue does not occur in 5.0.67

Do we need to recreate all databases from a dump when downgrading to 5.0.67?
[11 Feb 2009 12:29] Alexander Dewilde
Managed to downgrade to 5.0.67
All fulltext queries behave like they should and killing queries works as well.

We'll keep 5.1 in the fridge untill the issues are resolved...
[11 Feb 2009 21:58] Sveta Smirnova
Thank you for the feedback.

Please indicate which "column" do you use in the query for search and how many rows you expect to get.

I assumed "title", but could get row with id="euk".
[12 Feb 2009 15:51] Sveta Smirnova
Thank you for the feedback.

Regarding to search results there was incompatible change in version 5.1.6 (http://dev.mysql.com/doc/refman/5.1/en/news-5-1-6.html):

Incompatible Change: Words with apostrophes are now matched in a FULLTEXT search against non-apostrophe words (for example, a search for Jerry will match against the term Jerry's). Users upgrading to this version must issue REPAIR TABLE ... QUICK statements for tables containing FULLTEXT indexes. (Bug#14194)

Regarding to stalled query which can not be killed. Is it repeatable for you with test data which you provided or table should contain more rows? Which state SHOW PROCESSLIST shows?
[12 Feb 2009 15:54] Sveta Smirnova
> Do we need to recreate all databases from a dump when downgrading to 5.0.67?

Is safer to recreate: there are several incompatibilities in table formats.
[12 Feb 2009 18:46] Alexander Dewilde
The query that refuses to be killed shows foloowing status in the processlist: sorting results

I cannot reproduce the issue right now, since we downgraded to 5.0.67.
[12 Feb 2009 22:41] Alexander Dewilde
querying: select * from table where match (col) against ("+s*" in boolean mode) will match any word beginning with (s)... This is not expected behaviour, ft_min_length is not honored!

querying: select * from table where match (col) against ("+earth's*" in boolean mode) will match any word beginning with (earth) and any word beginning with (s), since the single quote is no longer seen as part of a word... Again, ft_min_length is not honored!

The only workaround so far is to enclose the search string (only works with 1 string):
select * from table where match (col) against ('"+earth\'s*"' in boolean mode)

This makes absolutely no sense.
[16 Feb 2009 9:06] Sveta Smirnova
Thank you for the feedback.

Regarding to issues with 's*' please read at http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html:

 * The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it should be appended to the word to be affected. Words match if they begin with the word preceding the * operator.

If a stopword or too-short word is specified with the truncation operator, it will not be stripped from a boolean query. 

Regarding to query which can not be killed I can not repeat described behavior after version 5.1.17. So I close the report as "Can't repeat". If you experience same behavior in the latest version we need to know how much rows should have table to repeat this (I tried with example provided - 998 rows and 27944 rows without luck).
[16 Feb 2009 18:59] Alexander Dewilde
There is a bug in 5.1:

select title from productstmp where match (title) against("+dave's* +shop*" in boolean mode);

-> shows (sending data) forever in the processlist. (tested on my 1000 row example file)
The query can be killed in the 1000 row database, not in the 2 million row database.

The documentation states when queying on (dave's) both (dave) and (dave's) should match. that's ok.
As soon as you add a second term (boolean with wildcard) the database is stalled.
[16 Feb 2009 19:11] Alexander Dewilde
top command after query ("+dave's* +shop*" in boolean mode):

23538 mysql        0.0%  0:00.23   1    15     29  844K  1216K  1240K    24M 
23535 mysqld     145.5% 18:40.63  12    46     62   16M   188K    18M    62M 

Tried to kill but no luck!

+----+------+-----------+-------+---------+------+--------------+----------------------------------------------------------------------------------------------+
| Id | User | Host      | db    | Command | Time | State        | Info                                                                                         |
+----+------+-----------+-------+---------+------+--------------+----------------------------------------------------------------------------------------------+
|  1 | root | localhost | story | Killed  | 1090 | Sending data | select title from productstmp where match (title) against("+dave's* +shop*" in boolean mode) | 
|  2 | root | localhost | NULL  | Query   |    0 | NULL         | show processlist                                                                             |
[16 Feb 2009 23:06] Alexander Dewilde
After performing a simple 6 row test, it is clear the 5.1 branch contains a bug...

I'm taking this to a new bug report...
[17 Feb 2009 6:27] Sveta Smirnova
Thank you for the feedback.

Marking this as duplicate of bug #42907.