Bug #51370 | Weighted Fulltext Query by Document Area Inefficient | ||
---|---|---|---|
Submitted: | 22 Feb 2010 0:03 | Modified: | 22 Feb 2010 6:18 |
Reporter: | Jaimie Sirovich | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: FULLTEXT search | Severity: | S4 (Feature request) |
Version: | 5.1 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | benchmarks, fulltext, speed, weight |
[22 Feb 2010 0:03]
Jaimie Sirovich
[22 Feb 2010 0:26]
Jaimie Sirovich
Thought I should note that the best solutions are the latter 2 (C&D). AFAIK there is no other way that works well over 100k+ rows. D can be implemented with a subquery + outer ORDER BY IFF a limit offset is set within range 1..N. -- Jaimie Sirovich President SEO Egghead, Inc. RELEASED: Professional Search Engine Optimization with PHP & ASP.NET (Wrox Press) http://www.seoegghead.com/our-seo-book/search-engine-optimization-with-php.seo http://www.seoegghead.com/our-seo-book/search-engine-optimization-with-asp-net.seo
[22 Feb 2010 5:59]
Valeriy Kravchuk
Thank you for the feature request.
[22 Feb 2010 6:18]
Jaimie Sirovich
Am I correct in asserting there is no (efficient) way to do this currently? If not, I will document the better solutions to help others on here a little better. Right now, it appears the best solution is to use one aggregate column to index, then resort the top N results (factoring in title as a multiplicative factor), where N is a constant like 500. So long as you don't care about discrepancies in paging by non multiples of N, you're fine, and the results are good. Smaller databases do fine with the ORDER BY FT_1 * FT_2 solution. I should add I noticed one other oddity. A query of the form: SELECT * FROM table WHERE FULLTEXT_COND ORDER BY FULLTEXT_COND seems to order the data again. This is not a bug, but it's a missing optimization. I'll file that as another bug after I confirm it. Thanks again, Jaimie.