Bug #3011 Fulltext with UTF8 irrelevant results (or errors)
Submitted: 28 Feb 2004 17:10 Modified: 29 Feb 2004 13:49
Category:MySQL Server: MyISAM storage engine Severity:S3 (Non-critical)
Version:4.1.1 OS:Linux (Linux)
Assigned to: Sergei Golubchik

[28 Feb 2004 17:10] [ name withheld ]
When a utf8 database contains similar words that differ only in their 
accents/diacritical marks, the fulltext search gives errors, bad results, or at 
least irrelevant relevance values. 
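
A rough illustration of why such words can collide, assuming the columns use an accent-insensitive collation such as utf8_general_ci (the report does not state the collation). This is only a Python sketch of accent folding, not MySQL's actual comparison code; the sample words and the helper name fold_accents are hypothetical and are not taken from the attachment.

```python
import unicodedata


def fold_accents(word: str) -> str:
    """Approximate an accent- and case-insensitive comparison key."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical sample words that differ only in diacritics or case.
words = ["kůň", "kun", "Kuň"]

# Group the words by their folded key; words sharing a key would be
# treated as the same term by an accent-insensitive comparison.
groups = {}
for w in words:
    groups.setdefault(fold_accents(w), []).append(w)

for key, members in groups.items():
    print(key, members)
```

If the index treats all members of a group as one term while another part of the server compares them byte by byte, the kind of inconsistent results described here would be plausible; that is a guess, not a confirmed diagnosis.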

How to repeat:
see attachment
[28 Feb 2004 17:20] [ name withheld ]
example (utf8) - see mainly the 4th query; unfortunately the "storage errors -1" are not reproducible

Attachment: problem_example.txt (text/plain), 4.31 KiB.

[29 Feb 2004 4:01] Sergei Golubchik
Could you please attach a table dump (as created by mysqldump) for these tables?
[29 Feb 2004 6:23] [ name withheld ]
A question: could there be a problem (it is not the only one) with having two 
TEXT columns in one common FULLTEXT index, while one of these columns can be 
P.S.: I'll try to make a better demonstration table as soon as possible, 
[29 Feb 2004 6:39] [ name withheld ]
an example of a complete session with errors (create database, tables, ...), UTF8 of course

Attachment: example2.txt (text/plain), 3.12 KiB.

[29 Feb 2004 6:45] [ name withheld ]
mysqldump -u root -p test1 --default-character-set=utf8 >example2_dump.txt

Attachment: example2_dump.txt (text/plain), 1.13 KiB.

[29 Feb 2004 7:19] [ name withheld ]
A complete (unmodified) example session has been added, as well as the dump of the 
[29 Feb 2004 13:49] Sergei Golubchik
I fixed the bug; the fix will come with 4.1.2.