Bug #67049 MySQL server allows min>max, causing hard-to-diagnosis misbehavior
Submitted: 2 Oct 2012 7:48 Modified: 3 Oct 2012 15:13
Reporter: Tianyin Xu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Errors Severity:S3 (Non-critical)
Version:5.5.29 OS:Any
Assigned to: CPU Architecture:Any
Tags: ft_max_word_len, ft_min_word_len, min-max

[2 Oct 2012 7:48] Tianyin Xu
Description:
Hi MySQL,

The root cause of the problem actually is a configuration error. Basically, due to a careless operation: we only changed one of the two parameters with value dependency like:

ft_min_word_len 20
ft_max_word_len 10

This causes the min>max. 

However, MySQL does not check this case but directly assign the values.

The consequence is, for client query using the FULLTEXT index feature, the system will always return the empty set, which is not the correct results (we do have some matching results in the table instead of empty set). 

How to repeat:
(1) create a simple table, e.g.,

mysql> create table employee
    -> (
    -> empno int,
    -> ename varchar(15),
    -> job varchar(50)
    -> ) ENGINE=MYISAM;

mysqlimport --local test employee.txt

mysql> select * from employee;
+-------+------------+----------------------------+
| empno | ename      | job                        |
+-------+------------+----------------------------+
|   100 | John Doe   | abcdedfhijklmnopqrstuvwxyz |
|   200 | John Smith | abcdedfhijklmnopqrstuvwxyz |                 
|   300 | Raj Patel  | abcdedfhijklmnopqrstuvwxyz |                  
+-------+------------+----------------------------+

(2) create fulltext index on the 3rd varchar column, 
mysql> ALTER TABLE employee ADD FULLTEXT(job);

(3) mysql> select * from employee where match(job)  
        -> against('+abcdedfhijklmnopqrstuvwxyz' in boolean mode);

Then, we can see the result is 

"Empty set (0.00 sec)"

Note that the correct result is to print the table.

Suggested fix:
I take a look at the related source code, the related code is as follows:

/* storage/myisam/ft_parser.c */

uchar ft_get_word()(...)
{
    ...
    if (((length >= ft_min_word_len && !is_stopword((char*) word->pos,
                                                    word->len))
         || param->trunc) && length < ft_max_word_len)
    {   
      *start=doc;
      param->type= FT_TOKEN_WORD;
      goto ret;
    }   
    ...
}

So, if ft_min_word_len is larger than ft_max_word_len, there's no way for the execution go into the branch block.

Another identical code snippet is in the same file in ft_simple_get_word() function.

I wonder whether it's possible to add a checker here or in the beginning to check the error and print out an explicit message, which will be very very helpful for users to diagnose the configuration problems.
[3 Oct 2012 13:57] MySQL Verification Team
Thank you for the bug report.

d:\dbs>d:\dbs\5.5\bin\mysql -uroot --port=3541 --prompt="mysql 5.5 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.29 Source distribution

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.5 >SHOW VARIABLES LIKE "ft_%";
+--------------------------+----------------+
| Variable_name            | Value          |
+--------------------------+----------------+
| ft_boolean_syntax        | + -><()~*:""&| |
| ft_max_word_len          | 10             |
| ft_min_word_len          | 20             |
| ft_query_expansion_limit | 20             |
| ft_stopword_file         | (built-in)     |
+--------------------------+----------------+
5 rows in set (0.00 sec)
[3 Oct 2012 15:13] Tianyin Xu
Thanks a lot, Miguel!

Yes, we should have print out the ft_% variables and noticed the misconfiguration.
But the silent reaction prevents us from suspecting it's because such stupid mistakes :-(