Bug #40192 Extra null complemented rows when outer join uses Blocked Nested Loops
Submitted: 21 Oct 2008 3:28 Modified: 22 Nov 2010 1:17
Reporter: Igor Babaev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-BKA OS:Any
Assigned to: Igor Babaev CPU Architecture:Any

[21 Oct 2008 3:28] Igor Babaev
Description:
When an outer join employs a join buffer to join the first inner table by Blocked Nested Loops algorithm extra null complemented rows may be generated if the where clause contains conditions that can be pushed down to this table.

If executing an outer join the first inner table is joined by a Batch Key Access algorithm no extra null complemented rows are generated.

How to repeat:
Create and populate tables t1 and t2 with the following commands:

create table t1 (a int, b int);
insert into t1 values (2, 20), (3, 30), (1, 10);
create table t2 (a int, c int);
insert into t2 values (1, 101), (3, 102), (1, 100);

Executing the query 
select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null
we expect 2 rows in the result set as we have:

mysql> select * from t1 left join t2 on t1.a=t2.a;
+------+------+------+------+
| a    | b    | a    | c    |
+------+------+------+------+
|    1 |   10 |    1 |  101 |
|    3 |   30 |    3 |  102 |
|    1 |   10 |    1 |  100 |
|    2 |   20 | NULL | NULL |
+------+------+------+------+

We really have a correct answer if no join buffer is used:

mysql> set join_cache_level=1;
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    3 |             |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
2 rows in set (0.01 sec)

mysql> select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+------+------+------+------+
| a    | b    | a    | c    |
+------+------+------+------+
|    2 |   20 | NULL | NULL |
|    3 |   30 |    3 |  102 |
+------+------+------+------+
2 rows in set (0.01 sec)

But if a join buffer is used the returned result set contains an extra null complemented row:  

mysql> set join_cache_level=6;
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+------+------+------+------+
| a    | b    | a    | c    |
+------+------+------+------+
|    3 |   30 |    3 |  102 |
|    2 |   20 | NULL | NULL |
|    1 |   10 | NULL | NULL |
+------+------+------+------+
3 rows in set (0.00 sec)

We can see that there is no such problem if BKA is employed:

mysql> create index i_a on t2(a);
Query OK, 3 rows affected (0.49 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> explain select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+----+-------------+-------+------+---------------+------+---------+-----------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref       | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+-----------+------+--------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL      |    3 |                                |
|  1 | SIMPLE      | t2    | ref  | i_a           | i_a  | 5       | test.t1.a |    2 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+-----------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> select * from t1 left join t2 on t1.a=t2.a where t2.c=102 or t2.c is null;
+------+------+------+------+
| a    | b    | a    | c    |
+------+------+------+------+
|    3 |   30 |    3 |  102 |
|    2 |   20 | NULL | NULL |
+------+------+------+------+
2 rows in set (0.00 sec)
[23 Oct 2008 18:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/56931

2662 Igor Babaev	2008-10-23
      Fixed bug #40192.
      When Blocked Nested Loops (BNL) algorithm is used for 
      for a regular join each record of the joined table first
      is checked by the condition pushed to this table. Only
      after this the matches for this record in the join buffer
      are looked for. 
      This does not work for an outer join operation as the 
      condition pushed to an inner table may include predicates
      from the where clause which should be applied only after
      a match has been found.
[14 Dec 2008 11:07] Bugs System
Pushed into 6.0.8-alpha  (revid:igor@mysql.com-20081023183352-bdsn1q3yklgw4u0d) (version source revid:igor@mysql.com-20081023183352-bdsn1q3yklgw4u0d) (pib:5)
[14 Jan 2009 0:36] Paul DuBois
Correction: This is pushed into 6.0.9.
[14 Jan 2009 0:54] Paul DuBois
Noted in 6.0.9 changelog.

When an outer join employed a join buffer to join the first inner
table by the Blocked Nested Loops algorithm, extra NULL-complemented 
rows could be generated if the WHERE clause contained conditions that
can be pushed down to this table.
[16 Aug 2010 6:40] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:24] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 1:17] Paul DuBois
Noted in 5.6.1 changelog.
[23 Nov 2010 2:29] Paul DuBois
Correction: No 5.6.1 changelog entry. Bug does not appear in any released 5.6.x version.