Bug #58928 "not exists" optimization not applied if relevant table is not single inner one
Submitted: 14 Dec 2010 20:58 Modified: 14 Dec 2010 21:43
Reporter: Guilhem Bichot Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1-bugteam OS:Any
Assigned to: CPU Architecture:Any
Tags: performance

[14 Dec 2010 20:58] Guilhem Bichot
Description:
"not exists" optimization is explained here:
http://dev.mysql.com/doc/refman/5.5/en/explain-output.html

All tables have a single field "i" which is declared "not null".
Query with one inner table in the outer join (t2):

mysql> explain select * from  t1 LEFT JOIN (t2) ON 1 WHERE t2.i IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                   |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    1 |                         |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where; Not exists |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------------------+

We see that the "not exists" optimization was used, which is good.
Now add a second inner table next to t2:

mysql> explain select * from  t1 LEFT JOIN (t2 JOIN t3 ON 1) ON 1 WHERE t2.i IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    1 |             |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where |
|  1 | SIMPLE      | t3    | ALL  | NULL          | NULL | NULL    | NULL |    1 |             |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
"not exists" optimization is not used. Though in theory it should apply in this case too (if a row is found in t2, t2 shouldn't be scanned more). The same if the WHERE involves t3:
mysql> explain select * from  t1 LEFT JOIN (t2 JOIN t3 ON 1) ON 1 WHERE t3.i IS NULL;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    1 |             |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    1 |             |
|  1 | SIMPLE      | t3    | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+

How to repeat:
create table t1 (i int not null) engine = innodb;
insert into t1 values (0);
create table t2 (i int not null) engine = innodb;
insert into t2 values (0);
create table t3 (i int not null) engine = innodb;
insert into t3 values (0);
explain select * from  t1 LEFT JOIN (t2) ON 1 WHERE t2.i IS NULL;
explain select * from  t1 LEFT JOIN (t2 JOIN t3 ON 1) ON 1 WHERE t2.i IS NULL;
explain select * from  t1 LEFT JOIN (t2 JOIN t3 ON 1) ON 1 WHERE t3.i IS NULL;

Suggested fix:
A first problem.
When we have two inner tables, get_best_combination() resets not_exists_optimize to 0 for all tables because they all have *on_expr_ref==NULL, because at this stage the ON condition is attached to the "nested join" object, not directly to the base tables.
It's only later, in make_outerjoin_info(), that the ON condition is attached to the base tables, actually to the first inner table.
I imagine there are other problems, especially if the WHERE involves t3 (as ON is not attached to t3 but to t2).
[14 Dec 2010 21:43] MySQL Verification Team
Thank you for the bug report.