Bug #24341 Performance problem when using ORDER BY in fulltext query
Submitted: 15 Nov 2006 18:28 Modified: 21 Nov 2006 16:37
Reporter: Friedemann Buergel Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:4, 5 OS:Any (All)
Assigned to: CPU Architecture:Any
Tags: fulltext, order by, sort

[15 Nov 2006 18:28] Friedemann Buergel
Description:
A heavy performance problem comes up when querying large tables with a fulltext and other columns as soon as ORDER BY is used.

The source of the problem seems to be 
* that querying a large number of rows in fulltext mode is slow even when an index is used
* and that using ORDER BY forces mysql to calculate the complete list of hits for the fulltext column even when the query asks only for a small number of rows.

How to repeat:
Lets say we have a table with just one text and one date column (called text and date). The table has a fulltext index on the text column. On a table with about 30k rows and texts of 15kB average size the query

SELECT * FROM mytable
WHERE  MATCH (text) AGAINST ('someword' )
ORDER BY date 
LIMIT 10

will produce the effect: When the number of hits in the text column is about 10k, the query will take about 10 seconds on an average class server.

Suggested fix:
It is possible to circumvent the problem, but only in a cumbersome fashion. In the above example, an additional column could be added that holds an extract of the date coded as keywords. The number of rows can then be reduced by adding something like 

AND MATCH (date2) AGAINST ( 'datekey200611' )

in order to reduce the total number of matches (say from 10k to about 100). Adding additional AND criteria one just has to take care that these match the results that would come first in the ORDER BY order. The method is therefore not limited to the date column case. But it is cumbersome, especially since the additional criteria may limit the final number of results below the desired amount. 

Therefore it would be very desirable to have mysql server solve the problem internally. The circumvention described could give a hint on how to address the issue.
[21 Nov 2006 16:37] Valeriy Kravchuk
Sorry, but this is a documented limitation of MySQL optimizer. Please, read the manual (http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html):

" In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE  clause. These cases include the following:

...
- The key used to fetch the rows is not the same as the one used in the ORDER BY:

SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
..."

So, even separate index on date column will not help. We have to use workarounds, like those you described.