Bug #5926 Fulltext case sensisvity in boolean mode changed?
Submitted: 6 Oct 2004 15:22 Modified: 6 Oct 2004 15:40
Reporter: Sébastien Vanvelthem Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:4.1.5 OS:Linux (Linux FC2)
Assigned to: CPU Architecture:Any

[6 Oct 2004 15:22] Sébastien Vanvelthem
Description:
As version 4.0.20 a fulltext search in BOOLEAN MODE was not case sensitive, it seems that in 4.1.5 this behaviour has changed

I read the doc, and was not able to determine if it's a correct behaviour? I hope I'm not totally wrong and it'll help. 

Thanks 

PS:

Upgrade was done over an existing 4.0.20, I reconfigured my.cnf to a latin1_bin collation and set ft_min_word_len to 3. Then I rebuild the (fulltext) indexes on the specified tables

How to repeat:
MySQL 4.0.20 - RedHat7.3 - ft_min_len=3
MySQL 4.1.5 - Fedora Core 2 - ft_min_len=3  (UTF-8 collation latin1_bin)

-- query  1 --
SELECT * FROM article 
WHERE MATCH (libelle_1) AGAINST ('rings*' IN BOOLEAN MODE)

   MySQL 4.0.20 -> 53 results
   MySQL 4.1.5 -> 0 results

-- query 2 --
SELECT * FROM article 
WHERE MATCH (libelle_1) AGAINST ('RINGS*' IN BOOLEAN MODE)

   MySQL 4.0.20 -> 53 results
   MySQL 4.1.5 -> 53 results

DDL 4.1.5 ->

CREATE TABLE `article` (
  `id_article` varbinary(10) NOT NULL default '0',
  `libelle_1` varchar(50) collate latin1_bin NOT NULL default '',
  PRIMARY KEY  (`id_article`),
  FULLTEXT KEY `libelle_1` (`libelle_1`),
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin;
        
DDL 4.0.20 -> the same except collations...

Suggested fix:
I would like to have one, but no idea ;)
[6 Oct 2004 15:40] Victoria Reznichenko
Hi,

Thank you for the report, but this is not a bug. latin1_bin means that collation is binary according to latin1 encoding. Use case-insensitive collation instead (that end with _ci). More info you can find in the MySQL documentation that is available at:
http://dev.mysql.com/doc/mysql/en/index.html
[6 Oct 2004 16:44] Sébastien Vanvelthem
Thanks a lot. That would be a great idea to put it in the manual in fulltext section, I suppose some people could be confused as unicode feature is relatively new. 

Thanks again

Seb