Bug #15350 Fulltext search fails on some language specific characters
Submitted: 30 Nov 2005 15:33 Modified: 13 Dec 2005 13:24
Reporter: Vladimir Suvarina
Status: Closed
Category: MySQL Server  Severity: S1 (Critical)
Version: 4.1.14-log  OS: Linux (Gentoo)
Assigned to:  CPU Architecture: Any

[30 Nov 2005 15:33] Vladimir Suvarina
Description:
Fulltext search using the MATCH ... AGAINST construct fails (i.e. returns no results) when certain language-specific characters are present in the indexed field.

The field is of type TEXT with collation latin2_czech_cs. The characters are small letter u with ring above (ů, ISO-8859-2 code F9, HTML &#367;) and small letter o with acute (ó, ISO-8859-2 code F3, HTML &#243;).

Tested using PHP and the mysql command-line client.

A select using LIKE (e.g. LIKE '%some_word%') works fine.

How to repeat:
CREATE TABLE `tblFull` (
  `ID` int(11) NOT NULL auto_increment,
  `FText` text collate latin2_czech_cs NOT NULL,
  PRIMARY KEY  (`ID`),
  FULLTEXT KEY `FText` (`FText`)
) ENGINE=MyISAM DEFAULT CHARSET=latin2 COLLATE=latin2_czech_cs;

INSERT INTO `tblFull` (`ID`, `FText`) VALUES (1, 'kůň'),
(2, 'příliš'),
(3, 'tón');

and then

SELECT * FROM tblFull WHERE MATCH(FText) AGAINST ("příliš"); -- This one works as expected

and

SELECT * FROM tblFull WHERE MATCH(FText) AGAINST ("kůň"); -- This one returns no result

but 

SELECT * FROM tblFull WHERE FText LIKE "kůň"; -- Works fine, as expected

=========

my.cnf says (set by me)

[mysqld]
character-set-server = latin2
default-character-set = latin2
default-collation = latin2_czech_cz
init_connect = 'SET NAMES latin2' # needed because PHP's mysql client is compiled with latin1 as the default
[6 Dec 2005 12:16] Valeriy Kravchuk
Thank you for the problem report. It looks like the real problem is not language-specific characters (your fulltext index will not let you find 'abc' either, for example), but the default value of the ft_min_word_len server variable (4). Words shorter than 4 characters are not included in your fulltext index.

Please set ft_min_word_len to 3 (or smaller) in your my.cnf file, rebuild the fulltext index, and try once more. Let us know the results. See http://dev.mysql.com/doc/refman/4.1/en/fulltext-fine-tuning.html for the details.
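
A minimal sketch of those steps, assuming the tblFull table from the report and that ft_min_word_len=3 has already been added under [mysqld] in my.cnf and the server restarted. REPAIR TABLE ... QUICK is the rebuild method described on the fine-tuning page for MyISAM tables; the final query is only expected, not guaranteed, to return the row, since other settings can also affect matching.

```sql
-- Check the value the running server actually uses.
-- If this still shows 4, the my.cnf change was not picked up.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Rebuild the FULLTEXT index so that existing rows are
-- re-indexed with the new minimum word length.
REPAIR TABLE tblFull QUICK;

-- Retry the search for the three-character word.
SELECT * FROM tblFull WHERE MATCH(FText) AGAINST ('kůň');
```

Changing the variable alone does not help, because words already skipped at index time stay out of the index until it is rebuilt.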
[13 Dec 2005 13:24] Vladimir Suvarina
Stupid me!

I had already set the ft_min_word_len variable and restarted mysqld before I posted the bug report, but I forgot to rebuild the index :/

I'm very, VERY sorry for wasting your time.
[30 Aug 2007 11:19] sonal fulkar
I am facing the same problem, but in my case I have dropped the existing indexes and rebuilt them, and I still cannot perform a full-text search for words shorter than four characters.
To reproduce:
I set ft_min_word_len=1,
restarted the MySQL server,
and rebuilt the fulltext indexes for the table.
Am I missing anything here?
My /etc/my.cnf reads:

[mysqld]
set-variable = max_connections=500
safe-show-database
max_allowed_packet=2M
ft_min_word_len=1

Thanks,