| Bug #24341 | Performance problem when using ORDER BY in fulltext query | ||
|---|---|---|---|
| Submitted: | 15 Nov 2006 18:28 | Modified: | 21 Nov 2006 16:37 |
| Reporter: | Friedemann Buergel | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 4, 5 | OS: | Any (All) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | fulltext, order by, sort | ||
[21 Nov 2006 16:37]
Valeriy Kravchuk
Sorry, but this is a documented limitation of MySQL optimizer. Please, read the manual (http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html): " In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause. These cases include the following: ... - The key used to fetch the rows is not the same as the one used in the ORDER BY: SELECT * FROM t1 WHERE key2=constant ORDER BY key1; ..." So, even separate index on date column will not help. We have to use workarounds, like those you described.

Description: A heavy performance problem comes up when querying large tables with a fulltext and other columns as soon as ORDER BY is used. The source of the problem seems to be * that querying a large number of rows in fulltext mode is slow even when an index is used * and that using ORDER BY forces mysql to calculate the complete list of hits for the fulltext column even when the query asks only for a small number of rows. How to repeat: Lets say we have a table with just one text and one date column (called text and date). The table has a fulltext index on the text column. On a table with about 30k rows and texts of 15kB average size the query SELECT * FROM mytable WHERE MATCH (text) AGAINST ('someword' ) ORDER BY date LIMIT 10 will produce the effect: When the number of hits in the text column is about 10k, the query will take about 10 seconds on an average class server. Suggested fix: It is possible to circumvent the problem, but only in a cumbersome fashion. In the above example, an additional column could be added that holds an extract of the date coded as keywords. The number of rows can then be reduced by adding something like AND MATCH (date2) AGAINST ( 'datekey200611' ) in order to reduce the total number of matches (say from 10k to about 100). Adding additional AND criteria one just has to take care that these match the results that would come first in the ORDER BY order. The method is therefore not limited to the date column case. But it is cumbersome, especially since the additional criteria may limit the final number of results below the desired amount. Therefore it would be very desirable to have mysql server solve the problem internally. The circumvention described could give a hint on how to address the issue.