| Bug #31950 | repair table hangs while processing multicolumn utf8 fulltext index | ||
|---|---|---|---|
| Submitted: | 30 Oct 2007 14:28 | Modified: | 15 Nov 2007 15:31 | 
| Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: FULLTEXT search | Severity: | S2 (Serious) | 
| Version: | 5.0.50 | OS: | Any | 
| Assigned to: | Sergey Vojtovich | CPU Architecture: | Any | 
| Tags: | bfsm_2007_11_01, fulltext, hang | ||
   [30 Oct 2007 14:35]
   MySQL Verification Team        
  some debug info
Attachment: bug31950_debug_info.txt (text/plain), 4.56 KiB.
   [1 Nov 2007 13:24]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/36868

  ChangeSet@1.2547, 2007-11-01 16:27:01+04:00, svoj@mysql.com +1 -0
  BUG#31950 - repair table hangs while processing multicolumn utf8 fulltext index

  Having a table with broken multibyte characters may cause the fulltext parser to dead-loop. Since it is normally not possible to insert a broken multibyte sequence into a table, this problem may arise only if the table is damaged.

  Affected statements are:
  - CHECK/REPAIR against a damaged table with a fulltext index;
  - boolean mode phrase search against a damaged table with or without a fulltext index;
  - boolean mode searches without an index;
  - nlq searches.

  No test case for this fix. Affects 5.0 only.
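  To illustrate what a "broken multibyte sequence" looks like at the byte level, here is a minimal sketch of a well-formedness scan. The function name `utf8_first_malformed` is hypothetical and is not part of the MySQL source. The sketch assumes sequences of at most 3 bytes (as in the 5.0 utf8 character set) and does not reject overlong 3-byte forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MySQL function): returns the byte offset of the
   first malformed sequence in buf[0..len), or -1 if none is found.
   Assumption: only 1- to 3-byte sequences are accepted. */
static long utf8_first_malformed(const unsigned char *buf, size_t len)
{
  size_t i = 0;

  assert(buf != NULL || len == 0);
  while (i < len)
  {
    unsigned char c = buf[i];
    size_t need;                       /* continuation bytes expected */

    if (c < 0x80)
      need = 0;                        /* ASCII */
    else if (c < 0xC2)
      return (long) i;                 /* stray continuation or overlong lead */
    else if (c < 0xE0)
      need = 1;
    else if (c < 0xF0)
      need = 2;
    else
      return (long) i;                 /* 4-byte leads not accepted here */

    if (need > len - i - 1)
      return (long) i;                 /* sequence truncated by end of buffer */
    for (size_t k = 1; k <= need; k++)
      if ((buf[i + k] & 0xC0) != 0x80)
        return (long) i;               /* expected a 10xxxxxx byte */
    i += need + 1;
  }
  return -1;
}
```

  A column value for which such a scan reports an offset is the kind of damaged data the patch description refers to.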
   [14 Nov 2007 9:41]
   Bugs System        
  Pushed into 6.0.4-alpha
   [14 Nov 2007 9:45]
   Bugs System        
  Pushed into 5.1.23-rc
   [14 Nov 2007 9:50]
   Bugs System        
  Pushed into 5.0.52
   [15 Nov 2007 15:31]
   Paul DuBois        
  Noted in 5.0.52 changelog. A column with malformed multi-byte characters could cause the full-text parser to go into an infinite loop.

Description:
repair table <table> use_frm hangs in an infinite loop. Here's the stack trace of the thread while it's hung:

mysqld-debug.exe!ft_simple_get_word
mysqld-debug.exe!ft_parse
mysqld-debug.exe!_mi_ft_parse
mysqld-debug.exe!_mi_ft_parserecord
mysqld-debug.exe!sort_ft_key_read
mysqld-debug.exe!find_all_keys
mysqld-debug.exe!_create_index_by_sort
mysqld-debug.exe!mi_repair_by_sort
mysqld-debug.exe!ha_myisam::repair
mysqld-debug.exe!ha_myisam::repair
mysqld-debug.exe!handler::ha_repair
mysqld-debug.exe!mysql_admin_table
mysqld-debug.exe!mysql_repair_table
mysqld-debug.exe!mysql_execute_command
mysqld-debug.exe!mysql_parse
mysqld-debug.exe!dispatch_command
mysqld-debug.exe!do_command
mysqld-debug.exe!handle_one_connection
mysqld-debug.exe!pthread_start
mysqld-debug.exe!_callthreadstart
mysqld-debug.exe!_threadstart

byte ft_simple_get_word(CHARSET_INFO *cs, byte **start, const byte *end,
                        FT_WORD *word, my_bool skip_stopwords)
{
  byte *doc= *start;
  uint mwc, length, mbl;
  DBUG_ENTER("ft_simple_get_word");

  do
  {
    for (;; doc+= mbl) <-----------mbl is zero, so this loop goes forever!!!
    {
      if (doc >= end) DBUG_RETURN(0);
      if (true_word_char(cs, *doc)) break;
      mbl= my_mbcharlen(cs, *(uchar *)doc);
    }

The table structure is as follows:

mysql> show create table vb_post1\G
*************************** 1. row ***************************
       Table: vb_post1
Create Table: CREATE TABLE `vb_post1` (
  `postid` int(10) unsigned NOT NULL auto_increment,
  `threadid` int(10) unsigned NOT NULL default '0',
  `parentid` int(10) unsigned NOT NULL default '0',
  `username` varchar(100) collate utf8_unicode_ci NOT NULL default '',
  `userid` int(10) unsigned NOT NULL default '0',
  `title` varchar(250) collate utf8_unicode_ci NOT NULL default '',
  `dateline` int(10) unsigned NOT NULL default '0',
  `pagetext` mediumtext collate utf8_unicode_ci NOT NULL,
  `allowsmilie` smallint(6) NOT NULL default '0',
  `showsignature` smallint(6) NOT NULL default '0',
  `ipaddress` varchar(15) collate utf8_unicode_ci NOT NULL default '',
  `iconid` smallint(5) unsigned NOT NULL default '0',
  `visible` smallint(6) NOT NULL default '0',
  `attach` smallint(5) unsigned NOT NULL default '0',
  `infraction` smallint(5) unsigned NOT NULL default '0',
  `reportthreadid` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY (`postid`),
  KEY `userid` (`userid`),
  KEY `threadid` (`threadid`,`userid`),
  KEY `idx_dateline` (`dateline`),
  FULLTEXT KEY `title` (`title`,`pagetext`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.01 sec)

How to repeat:
No simple testcase yet.

Suggested fix:
This seems very similar to the hang in bug #29464, except that it's not Chinese and the loop is slightly different.
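
The loop quoted in the description advances by `mbl`, so a zero length from `my_mbcharlen` leaves `doc` where it was. The following standalone sketch shows one possible guard: always step at least one byte, and never step past `end`. It is not the committed patch. The names `utf8_lead_len`, `is_word_char` and `skip_to_word` are hypothetical stand-ins for `my_mbcharlen`, `true_word_char` and the inner loop of `ft_simple_get_word`, and the assumption that a malformed lead byte yields length 0 is taken from the annotation in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for my_mbcharlen() on utf8: sequence length implied
   by a lead byte, or 0 when the byte cannot start a character. */
static unsigned utf8_lead_len(unsigned char c)
{
  if (c < 0x80) return 1;              /* ASCII */
  if (c < 0xC2) return 0;              /* continuation byte or overlong lead */
  if (c < 0xE0) return 2;
  if (c < 0xF0) return 3;
  return 0;                            /* longer forms not handled here */
}

/* Hypothetical stand-in for true_word_char(): ASCII letters, digits, '_'. */
static int is_word_char(unsigned char c)
{
  return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
         (c >= 'a' && c <= 'z') || c == '_';
}

/* Skips bytes that are not word characters. Returns a pointer to the first
   word character, or end if there is none. Each iteration advances by at
   least one byte, so the loop terminates even on malformed input. */
static const unsigned char *skip_to_word(const unsigned char *doc,
                                         const unsigned char *end)
{
  assert(doc <= end);
  while (doc < end && !is_word_char(*doc))
  {
    size_t step = utf8_lead_len(*doc);
    if (step == 0)
      step = 1;                        /* guard: never advance by zero */
    if (step > (size_t) (end - doc))
      step = (size_t) (end - doc);     /* guard: never run past end */
    doc += step;
  }
  return doc;
}
```

With this guard, a buffer that starts with stray continuation bytes such as 0x80 0xBF followed by `ab` is scanned byte by byte until the `a` is reached, instead of spinning on the first byte.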