Bug #20111 | named columns used in subqueries hinder optimisation of LIKE, BETWEEN, range | ||
---|---|---|---|
Submitted: | 28 May 2006 10:57 | Modified: | 24 Jan 2014 13:26 |
Reporter: | Daniel Treplin | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 4.1.19-log | OS: | Linux (Linux) |
Assigned to: | CPU Architecture: | Any |
[28 May 2006 10:57]
Daniel Treplin
[29 May 2006 8:34]
Daniel Treplin
Further testing showed that the problem (not using the key, but scanning the whole index file) occurs with any range condition within a subquery, no matter how the subquery is written (named colums or not, like, between, gt/lt, whatever ), if the range boundaries are passed from the outer query.
[30 May 2006 22:01]
Valeriy Kravchuk
Thank you for a problem report. Why do you think that the index is not used? It should be used, according to EXPLAIN results... Have you checked with SHOW STATUS?
[31 May 2006 8:20]
Daniel Treplin
Hi, thanks for your answer. Please take a close look to the explain values of the dependent subquery of example 1. It says "possible keys: NULL", no key usage. Please also take a look to example 2, dependent subquery, column "type". It says "index", but correcly optimised, it should read "range". In both cases, type reads "index" what means full scan of the index file, but no key lookup, what results in a speed difference in order of magnitude. When I do the subquery part by hand as simple query, the query behaves as it should. (watch the type: it reads "range"): mysql> explain select count(*) from t1 where ind like "a.b.%"; +----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+ | 1 | SIMPLE | t1 | range | t1_c1 | t1_c1 | 255 | NULL | 175 | Using where; Using index | +----+-------------+-------+-------+---------------+-------+---------+------+------+--------------------------+ 1 row in set (0.00 sec) All this leads me to the conclusion, that there is a problem in the constant propagation phase of the query optimiser when handling subqueries. To me, it looks like it does not recognise the values referencing to the outer query as constants, as, in my opinion, it should, because they are constant within a dependent subquery like this. But if they are not propagated as constants, mysql, as documented in many places, will do to a scan instead of a key lookup for all sorts of range conditions, including like.
[31 May 2006 8:33]
Daniel Treplin
Sorry, I forgot a correction when replacing my real field names with abstract values. The example query should read: explain select count(*) from t1 where c1 like "a.b.%";
[23 Jun 2006 15:59]
Valeriy Kravchuk
These (as well as many other) problems with DEPENDENT subqueries optimisation will be fixed in MySQL 5.2.
[24 Oct 2012 22:55]
Daniel Eloff
What's the status of this bug? Have the performance issues been resolved?
[24 Jan 2014 13:26]
Manyi Lu
Roy: This problem will not be fixed by subquery enhancements added to 5.6. First, it is a scalar subquery which is not covered by the optimizations. However, I think that the scalar subquery can be rewritten to an IN subquery which makes it possible for the optimizer to evaluate the query faster. But still, the desired index is not selected by the optimizer. Rewrite applied: and (select url from ... limit 1) is not null --> and 1 IN (select 1 from ...)