Bug #17705 | FT Index corruption occurs with UTF8 data when ft_min_word_len=3 | ||
---|---|---|---|
Submitted: | 24 Feb 2006 17:57 | Modified: | 31 Mar 2006 7:03 |
Reporter: | Matthew Lord | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: MyISAM storage engine | Severity: | S1 (Critical) |
Version: | 4.1.18 | OS: | independent |
Assigned to: | Michael Widenius | CPU Architecture: | Any |
Tags: | corruption, myisam |
[24 Feb 2006 17:57]
Matthew Lord
[26 Feb 2006 11:23]
Sergey Vojtovich
Matthew, I believe it is not a bug or it has been fixed already. After issuing REPAIR TABLE statement on the dataset DELETE works just fine. Also I've noticed (while reading CSC conversation) that myisamchk was incorrectly used. Could you please make sure that myisamchk was run correctly? Also it is wise to use repair table rather than myisamchk in this regard. Also please make sure that version you use has fixes for bugs: bug#5686, bug#11336, bug#16489. Last one present since Jan 23. Excerpt from manual: If you modify full-text variables that affect indexing (ft_min_word_len, ft_max_word_len, or ft_stopword_file), or if you change the stopword file itself, you must rebuild your FULLTEXT indexes after making the changes and restarting the server. To rebuild the indexes in this case, it is sufficient to do a QUICK repair operation: mysql> REPAIR TABLE tbl_name QUICK; With regard specifically to using the IN BOOLEAN MODE capability, if you upgrade from MySQL 3.23 to 4.0 or later, it is necessary to replace the index header as well. To do this, perform a USE_FRM repair operation: mysql> REPAIR TABLE tbl_name USE_FRM; This is necessary because boolean full-text searches require a flag in the index header that was not present in MySQL 3.23, and that is not added if you do only a QUICK repair. If you attempt a boolean full-text search without rebuilding the indexes this way, the search returns incorrect results. Note that if you use myisamchk to perform an operation that modifies table indexes (such as repair or analyze), the FULLTEXT indexes are rebuilt using the default full-text parameter values for minimum word length, maximum word length, and stopword file unless you specify otherwise. This can result in queries failing. The problem occurs because these parameters are known only by the server. They are not stored in MyISAM index files. To avoid the problem if you have modified the minimum or maximum word length or stopword file values used by the server, specify the same ft_min_word_len, ft_max_word_len, and ft_stopword_file values to myisamchk that you use for mysqld. For example, if you have set the minimum word length to 3, you can repair a table with myisamchk like this: shell> myisamchk --recover --ft_min_word_len=3 tbl_name.MYI
[7 Mar 2006 7:14]
Sergey Vojtovich
I was able to repeat this problem using new dataset.
[15 Mar 2006 11:00]
Sergey Vojtovich
This bug is not about fulltext index, but rather about REPAIR TABLE. I was able to repeat this bug using trivial table: CREATE TABLE `t1` ( `TITLE_DISPLAY` varchar(255) default NULL, KEY `title_display_idx` (`TITLE_DISPLAY`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; As a workaround one can dump that table and load it back. It works just fine this way.
[28 Mar 2006 9:21]
Sergey Vojtovich
New test case for this bug.
Attachment: t1.sql (application/octet-stream, text), 487 bytes.
[30 Mar 2006 9:38]
Michael Widenius
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: The problem had nothing to do with FT index, but instead with compression of keys. The bug can appear in MyISAM for any key where the first part is a CHAR/VARCHAR longer than 254 bytes Fix will be in 4.1.19 an 5.0.20
[31 Mar 2006 7:03]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: Documented bugfix in 4.1.19 and 5.0.20 changelogs. Closed.
[31 Mar 2006 7:15]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/4354