Bug #35674 Range optimizer ignores conditions on inner tables in semi-join IN subqueries
Submitted: 29 Mar 2008 15:16 Modified: 16 Nov 2010 3:51
Reporter: Sergey Petrunya Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-bk OS:Any
Assigned to: Sergey Petrunya CPU Architecture:Any

[29 Mar 2008 15:16] Sergey Petrunya
Description:
Range optimizer ignores conditions on inner tables in semi-join IN subqueries. This causes the optimizer to miss good QEPs, WL#2980 had been no intent to have this property. In fact, the reverse was assumed.

How to repeat:
# Create the test tables

create table ten (a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

create table t1(a int, b int, filler char(100), key(a));
insert into t1 
select
  A.a + 10*(B.a + 10* C.a),
  A.a + 10*(B.a + 10* C.a),
  'filler'
from
  ten A, ten B, ten C;

create table t2 like t1;
insert into t2 select * from t1;

# This produces a poor 1000 x 1000 execution plan:

explain select * from t1
where 
  t1.a < 200 and
  t1.b in (select t2.b from t2 where t2.a<10)\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: t1
         type: ALL
possible_keys: a
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000
        Extra: Using where; Start temporary
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: t2
         type: ALL
possible_keys: a
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000
        Extra: Using where; End temporary; Using join buffer
2 rows in set (0.01 sec)

#
# The optimizer could use a join order of range(t2.a < 20), ALL(t1).
# 
# If we try a corresponding inner join, we see that it is indeed used:
#

explain select * from t1, t2 where t1.a < 200 and t1.b=t2.b and t2.a<20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: range
possible_keys: a
          key: a
      key_len: 5
          ref: NULL
         rows: 20
        Extra: Using index condition; Using MRR
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ALL
possible_keys: a
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000
        Extra: Using where; Using join buffer
2 rows in set (0.00 sec)

Suggested fix:
Make the range optimizer make use of subquery's WHERE.
[29 Mar 2008 15:54] MySQL Verification Team
Thank you for the bug report. Verified as described.
[5 Apr 2008 21:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/44954

ChangeSet@1.2623, 2008-04-06 01:24:29+04:00, sergefp@mysql.com +3 -0
  BUG#35674: Range optimizer ignores conditions on inner tables in semi-join IN subqueries
  - Let range optimizer use such conditions.
[28 May 2008 10:01] Bugs System
Pushed into 6.0.6-alpha
[30 May 2008 18:31] Paul DuBois
Noted in 6.0.6 changelog.

The range optimizer ignored conditions on inner tables in semi-join
IN subqueries, causing the the optimizer to miss good query execution
plans.
[16 Aug 2010 6:41] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:08] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[16 Nov 2010 3:51] Paul DuBois
Noted in 5.6.1 changelog.