| Bug #39756 | EXPLAIN for uncorrelated NOT EXISTS query is slow | ||
|---|---|---|---|
| Submitted: | 30 Sep 2008 12:59 | Modified: | 5 May 2014 5:54 |
| Reporter: | Roy Lyseng | Email Updates: | |
| Status: | Can't repeat | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 6.0,5.4, 5.1 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
[30 Sep 2008 13:41]
Valeriy Kravchuk
Verified just as described:
C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3311 test
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 6.0.6-alpha-community MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create table t1(
-> id integer primary key,
-> u integer not null unique,
-> v integer not null,
-> vi integer not null, index(vi),
-> s char(1));
Query OK, 0 rows affected (0.08 sec)
mysql> create table t2(
-> id integer primary key,
-> u integer not null unique,
-> v integer not null,
-> vi integer not null, index(vi),
-> s char(1));
Query OK, 0 rows affected (0.09 sec)
mysql> insert into t1 values(10, 10, 10, 10, 'l');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(10, 10, 10, 10, 'r');
Query OK, 1 row affected (0.06 sec)
mysql> insert into t1 values(20, 20, 20, 20, 'l');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(30, 30, 30, 30, 'r');
Query OK, 1 row affected (0.06 sec)
mysql> insert into t1 values(40, 40, 40, 40, 'l');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t1 values(41, 41, 40, 40, 'l');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t1 values(50, 50, 50, 50, 'l');
Query OK, 1 row affected (0.03 sec)
mysql> insert into t1 values(51, 51, 50, 50, 'l');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(50, 50, 50, 50, 'r');
Query OK, 1 row affected (0.03 sec)
mysql> insert into t1 values(60, 60, 60, 60, 'l');
Query OK, 1 row affected (0.03 sec)
mysql> insert into t1 values(61, 61, 60, 60, 'l');
Query OK, 1 row affected (0.06 sec)
mysql> insert into t2 values(60, 60, 60, 60, 'r');
Query OK, 1 row affected (0.11 sec)
mysql> insert into t2 values(61, 61, 60, 60, 'r');
Query OK, 1 row affected (0.08 sec)
mysql> insert into t1 values(70, 70, 70, 70, 'l');
Query OK, 1 row affected (0.11 sec)
mysql> insert into t2 values(70, 70, 70, 70, 'r');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(71, 71, 70, 70, 'r');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(80, 80, 80, 80, 'r');
Query OK, 1 row affected (0.05 sec)
mysql> insert into t2 values(81, 81, 80, 80, 'r');
Query OK, 1 row affected (0.05 sec)
mysql> explain select * from t1 where not exists(select * from t2)\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Impossible WHERE
*************************** 2. row ***************************
id: 2
select_type: SUBQUERY
table: t2
type: index
possible_keys: NULL
key: PRIMARY
key_len: 4
ref: NULL
rows: 9
Extra: Using index
2 rows in set (0.00 sec)
[14 Jul 2009 22:28]
Patrick Crews
The explain query is the same in 5.1
[14 Sep 2010 11:44]
Tor Didriksen
Maybe the solution is as simple as this.
It breaks a handful of 'explain' results in the test suite,
nothing else that I can see.
// Item_exists_subselect
virtual bool is_expensive_processor(uchar *arg)
{
return !is_correlated;
}

Description: Anuncorrelated NOT EXISTS query is not marked as expensive by the optimizer, hence the optimizer will attempt to execute the subquery during optimization, as witnessed by the following stack trace: #0 sub_select (join=0x9c0d670, join_tab=0x9c04a10, end_of_records=false) at sql_select.cc:13668 #1 0x083b12a4 in do_select (join=0x9c0d670, fields=0x9c0341c, table=0x0, procedure=0x0) at sql_select.cc:13456 #2 0x083cda2c in JOIN::exec (this=0x9c0d670) at sql_select.cc:2833 #3 0x082def35 in subselect_single_select_engine::exec (this=0x9c03d58) at item_subselect.cc:2305 #4 0x082e389e in Item_subselect::exec (this=0x9c03cc0) at item_subselect.cc:285 #5 0x082dc825 in Item_exists_subselect::val_bool (this=0x9c03cc0) at item_subselect.cc:845 #6 0x082a3415 in Item_func_not::val_int (this=0x9c03d80) at item_cmpfunc.cc:275 #7 0x08288612 in eval_const_cond (cond=0x9c03d80) at item_func.cc:63 #8 0x083b0543 in remove_eq_conds (thd=0x9babbb0, cond=0x9c03d80, cond_value=0x9c0cb80) at sql_select.cc:11115 #9 0x083ba92f in optimize_cond (join=0x9c0b688, conds=0x9c03d80, join_list=0x9bad19c, cond_value=0x9c0cb80) at sql_select.cc:10953 #10 0x083c4262 in JOIN::optimize (this=0x9c0b688) at sql_select.cc:1447 #11 0x083c8673 in mysql_select (thd=0x9babbb0, rref_pointer_array=0x9bad1dc, tables=0x9c02f48, wild_num=1, fields=@0x9bad16c, conds=0x9c03d80, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764740, result=0x9c03e60, unit=0x9bace40, select_lex=0x9bad0d8) at sql_select.cc:3009 #12 0x083c8b8e in mysql_explain_union (thd=0x9babbb0, unit=0x9bace40, result=0x9c03e60) at sql_select.cc:19413 The subquery execution triggered by the EXPLAIN statement is also causing some rather confusing explain output (check the "Extra" information in the 1. row below): *************************** 1. row *************************** id: 1 select_type: PRIMARY table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Impossible WHERE *************************** 2. row *************************** id: 2 select_type: SUBQUERY table: t2 type: index possible_keys: NULL key: PRIMARY key_len: 4 ref: NULL rows: 9 Extra: Using index BUG#14295 is about a similar but different problem Notice also that if t2 is empty the "Extra" text will be different. How to repeat: create table t1( id integer primary key, u integer not null unique, v integer not null, vi integer not null, index(vi), s char(1)); create table t2( id integer primary key, u integer not null unique, v integer not null, vi integer not null, index(vi), s char(1)); insert into t1 values(10, 10, 10, 10, 'l'); insert into t2 values(10, 10, 10, 10, 'r'); insert into t1 values(20, 20, 20, 20, 'l'); insert into t2 values(30, 30, 30, 30, 'r'); insert into t1 values(40, 40, 40, 40, 'l'); insert into t1 values(41, 41, 40, 40, 'l'); insert into t1 values(50, 50, 50, 50, 'l'); insert into t1 values(51, 51, 50, 50, 'l'); insert into t2 values(50, 50, 50, 50, 'r'); insert into t1 values(60, 60, 60, 60, 'l'); insert into t1 values(61, 61, 60, 60, 'l'); insert into t2 values(60, 60, 60, 60, 'r'); insert into t2 values(61, 61, 60, 60, 'r'); insert into t1 values(70, 70, 70, 70, 'l'); insert into t2 values(70, 70, 70, 70, 'r'); insert into t2 values(71, 71, 70, 70, 'r'); insert into t2 values(80, 80, 80, 80, 'r'); insert into t2 values(81, 81, 80, 80, 'r'); explain select * from t1 where not exists(select * from t2) Suggested fix: Mark all uncorrelated subqueries that cannot be resolved with maximum 1 row lookup per contained table as "expensive"