Bug #6635 bug in full-text search
Submitted: 15 Nov 2004 12:47 Modified: 8 Mar 2005 18:36
Reporter: Przemyslaw Popielarski Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.1.7 OS:Linux (Linux x86)
Assigned to: Sergei Golubchik CPU Architecture:Any

[15 Nov 2004 12:47] Przemyslaw Popielarski
Description:
Already described it on http://lists.mysql.com/mysql/175789 .

SELECT TYTUL FROM tKsidata
WHERE MATCH (TYTUL) AGAINST ("space")
ORDER BY TYTUL

TYTUL
Shaping Space 2ed
National Air & Space Museum
Light & Space
Free Space Architecture
Does Economics Space Matter ?
Economies of Signs & Space
Industry Space & Competition Contribution of Economists of P
Money & Space Economy
Improving Performance How to Manage White Space on Organisat
State Space Modeling of Time Series
Biogeography Introduction to Space Time and Life
Cognition of Geographic Space
(and so on)

Already done optimize table, analyze table. Table is MyISAM, MySQL 4.1.7
Linux x86 standard-binary.

Any ideas?  This worked in 4.0 perfectly!

Maybe this has something common with charsets and collations?
I'm using latin2 in server's defaults, db, table, connection and everywhere else.

How to repeat:
-

Suggested fix:
-
[19 Nov 2004 15:33] Przemyslaw Popielarski
Of course here we can see the problem with ORDER BY, because results are not sorted. This wasn't clearly said in my previous posting.
[28 Nov 2004 7:24] Hartmut Holzgraefe
Couldn't reproduce this with latin1 settings, can you please add the output of

  show variables like 'char%';
  show variables like 'coll%';

and

  show create table TYTUL;

to this report so that i can try to exactly reproduce it with your settings?

(a dump of the table would be nice to have, too, if possible)
[29 Nov 2004 10:42] Przemyslaw Popielarski
show variables like 'char%';
------------
character_set_client,latin2
character_set_connection,latin2
character_set_database,latin2
character_set_results,latin2
character_set_server,latin2
character_set_system,utf8
character_sets_dir,/usr/local/mysql-standard-4.1.7-pc-linux-i686/share/mysql/charsets/
------------

show variables like 'coll%';
------------
collation_connection,latin2_general_ci
collation_database,latin2_general_ci
collation_server,latin2_general_ci
------------

After some investigation, it seems that there is a problem with coexistence of two types of indexes on the same column. When I have full-text index alone on the column, sorting goes okey. When I add normal btree index to this column, sorting with full-text searching goes wrong.

Look for this test case:

CREATE TABLE `tKsidata2` (
  `KSI` varchar(12) NOT NULL default '',
  `TYTUL` varchar(60) default NULL,
  PRIMARY KEY  (`KSI`),
  KEY `idxTITLE` (`TYTUL`),
  FULLTEXT KEY `ft_TITLE` (`TYTUL`)
) ENGINE=MyISAM DEFAULT CHARSET=latin2;

SELECT TYTUL FROM tKsidata2
WHERE MATCH (TYTUL) AGAINST ("space")
ORDER BY TYTUL;

---------
Tourism Diasporas & Space
Nuts & Bolts of Space & Time
Luxury of Space
Space Japanese Design Solutions for Compact Living
Generalized Analytic Automorphic Forms in Hypercomplex Space
Cambridge Encyclopedia of Space
Making European Space
McLuhan in Space
Primer for Point & Space Groups
Vehicular Electric Power Systems Land Sea Air & Space Vehicl
Space-time Wireless Channels
Space-Time Codes & MIMO Systems
Space/Terrestrial Mobile Networks
Essentials of Linear State Space Systems
Air & Space Power in the New Millennium
---------

And now without btree index:
CREATE TABLE `tKsidata2` (
  `KSI` varchar(12) NOT NULL default '',
  `TYTUL` varchar(60) default NULL,
  PRIMARY KEY  (`KSI`),
  FULLTEXT KEY `ft_TITLE` (`TYTUL`)
) ENGINE=MyISAM DEFAULT CHARSET=latin2

SELECT TYTUL FROM tKsidata2
WHERE MATCH (TYTUL) AGAINST ("space")
ORDER BY TYTUL;

---------
Air & Space Power in the New Millennium
Cambridge Encyclopedia of Space
Essentials of Linear State Space Systems
Generalized Analytic Automorphic Forms in Hypercomplex Space
Luxury of Space
Making European Space
McLuhan in Space
Nuts & Bolts of Space & Time
Primer for Point & Space Groups
Space Japanese Design Solutions for Compact Living
Space-Time Codes & MIMO Systems
Space-time Wireless Channels
Space/Terrestrial Mobile Networks
Tourism Diasporas & Space
Vehicular Electric Power Systems Land Sea Air & Space Vehicl
---------

Dump of the table attached.

Regards!
[29 Nov 2004 10:47] Przemyslaw Popielarski
dump of the table tKsidata2

Attachment: tKsidata2.sql.bz2 (application/octet-stream, text), 160.48 KiB.

[14 Feb 2005 22:54] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[15 Feb 2005 0:23] Przemyslaw Popielarski
It seems that there is a problem with coexistence of
two types of indexes on the same column. When I have full-text index alone on
the column, sorting goes okey. When I add normal btree index to this column,
sorting with full-text searching goes wrong.

Workaround: delete standard index and leave ft index alone.
[8 Mar 2005 12:16] Sergei Golubchik
doing EXPLAIN one can notice something rather odd:

mysql> explain SELECT TYTUL FROM t1 WHERE MATCH (TYTUL) AGAINST ("space") ORDER BY TYTUL;
+----+-------------+-------+----------+---------------+-----------+---------+------+------+-------------+
| id | select_type | table | type     | possible_keys | key       | key_len | ref  | rows | Extra       |
+----+-------------+-------+----------+---------------+-----------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | fulltext | ft_TITLE      | idx_TITLE |       0 |      |    1 | Using where |
+----+-------------+-------+----------+---------------+-----------+---------+------+------+-------------+

MySQL uses "fulltext" join type with key=idx_TITLE !
(apparently optimizer switches to idx_TITLE index the very last moment to avoid filesort.but join type still expects to have fulltext index)
[8 Mar 2005 18:36] Sergei Golubchik
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Fixed in 4.1.11

as a workaround you can use

SELECT TYTUL FROM tKsidata
WHERE MATCH (TYTUL) AGAINST ("space")
ORDER BY concat(TYTUL)
         ^^^^^^
[8 Mar 2005 18:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/22786