Bug #83248 Partition pruning is not working with LEFT JOIN
Submitted: 3 Oct 2016 15:57 Modified: 3 Oct 2016 17:12
Reporter: Valeriy Kravchuk Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.0, 5.7.15 OS:Any
Assigned to: CPU Architecture:Any
Tags: Optimizer, partition, pruning

[3 Oct 2016 15:57] Valeriy Kravchuk
Description:
It seems that partition pruning does not work as expected with LEFT JOIN:

mysql> explain select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d LEFT JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+----------------------+------+----------+--------------------------+
| id | select_type | table | partitions                            | type   | possible_keys | key     | key_len | ref                  | rows | filtered | Extra                    |
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+----------------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711                               | ALL    | b_id          | NULL    | NULL    | NULL                 |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711                               | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.b_id,const    |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160709,p160710,p160711,p160712,p9999 | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.c_id,test.a.d |    1 |   100.00 | Using index              |
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+----------------------+------+----------+--------------------------+
3 rows in set, 1 warning (0,10 sec)

Note that all partitions are accessed in the c table. Compare to the INNER JOIN:

mysql> explain select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d INNER JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711    | ALL    | b_id,c_id     | NULL    | NULL    | NULL              |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.b_id,const |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.c_id,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
3 rows in set, 1 warning (0,02 sec)

It's enough to just repeat condition in the ON clause to get pruning in case of LEFT JOIN as well:

mysql> explain select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d LEFT JOIN c on a.c_id = c.c_id and a.d = c.d and  a.d BETWEEN '2016-07-11' AND '2016-07-11' where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711    | ALL    | b_id          | NULL    | NULL    | NULL              |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.b_id,const |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | test.a.c_id,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+--------------------------+
3 rows in set, 1 warning (0,00 sec)

See below on the way to create and populate the tables.

How to repeat:
create table a (
a_id bigint auto_increment not null,
d date,
b_id bigint,
c_id bigint,
primary key (a_id, d),
key(b_id, d),
key(c_id, d)
)
/*!50500 PARTITION BY RANGE COLUMNS(d)
(PARTITION p160709 VALUES LESS THAN ('2016-07-10') ENGINE = InnoDB,
PARTITION p160710 VALUES LESS THAN ('2016-07-11') ENGINE = InnoDB,
PARTITION p160711 VALUES LESS THAN ('2016-07-12') ENGINE = InnoDB,
PARTITION p160712 VALUES LESS THAN ('2016-07-13') ENGINE = InnoDB,
PARTITION p9999 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */;

create table b (
b_id bigint auto_increment not null,
d date,
primary key (b_id, d))
/*!50500 PARTITION BY RANGE COLUMNS(d)
(PARTITION p160709 VALUES LESS THAN ('2016-07-10') ENGINE = InnoDB,
PARTITION p160710 VALUES LESS THAN ('2016-07-11') ENGINE = InnoDB,
PARTITION p160711 VALUES LESS THAN ('2016-07-12') ENGINE = InnoDB,
PARTITION p160712 VALUES LESS THAN ('2016-07-13') ENGINE = InnoDB,
PARTITION p9999 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */;

create table c (
c_id bigint auto_increment not null,
d date,
primary key (c_id, d))
/*!50500 PARTITION BY RANGE COLUMNS(d)
(PARTITION p160709 VALUES LESS THAN ('2016-07-10') ENGINE = InnoDB,
PARTITION p160710 VALUES LESS THAN ('2016-07-11') ENGINE = InnoDB,
PARTITION p160711 VALUES LESS THAN ('2016-07-12') ENGINE = InnoDB,
PARTITION p160712 VALUES LESS THAN ('2016-07-13') ENGINE = InnoDB,
PARTITION p9999 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */;

insert into c(d) values ( date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY));
insert into c(d) values ( date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY));
replace into c(d) select date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY) from c t1, c t2, c t3, c t4, c t5, c t6, c t7, c t8, c t9, c t10, c t11, c t12, c t13, c t14, c t15, c t16, c t17, c t18;

insert into b(d) values ( date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY));
insert into b(d) values ( date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY));
replace into b(d) select date_add('2016-07-13', INTERVAL -floor(rand()*5) DAY) from b t1, b t2, b t3, b t4, b t5, b t6, b t7, b t8, b t9, b t10, b t11, b t12, b t13, b t14, b t15, b t16, b t17, b t18;

insert into a(d, b_id, c_id) select b.d, b_id, c_id from b LEFT JOIN c on b.d = c.d and mod(c_id, 3) = 0 limit 1000000;

explain partitions select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d LEFT JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;

explain partitions select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d INNER JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;

Suggested fix:
Make sure that even with LEFT JOIN partition pruning is still applied to table at the right of the LEFT JOIN clause.
[3 Oct 2016 16:56] Miguel Solorzano
See https://bugs.mysql.com/bug.php?id=26630.
[3 Oct 2016 17:12] Miguel Solorzano
Thank you for the bug report.

mysql 5.7 > explain partitions select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d LEFT JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+------------------+------+----------+--------------------------+
| id | select_type | table | partitions                            | type   | possible_keys | key     | key_len | ref              | rows | filtered | Extra                    |
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+------------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711                               | ALL    | b_id          | NULL    | NULL    | NULL             |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711                               | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.b_id,const  |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160709,p160710,p160711,p160712,p9999 | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.c_id,p1.a.d |    1 |   100.00 | Using index              |
+----+-------------+-------+---------------------------------------+--------+---------------+---------+---------+------------------+------+----------+--------------------------+
3 rows in set, 2 warnings (0.17 sec)

mysql 5.7 >
mysql 5.7 > explain partitions select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d INNER JOIN c on a.c_id = c.c_id and a.d = c.d where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref             | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711    | ALL    | b_id,c_id     | NULL    | NULL    | NULL            |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.b_id,const |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.c_id,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
3 rows in set, 2 warnings (0.00 sec)

mysql 5.7 > explain select a.a_id , a.b_id, c.c_id from a JOIN b on b.b_id = a.b_id and a.d = b.d LEFT JOIN c on a.c_id = c.c_id and a.d = c.d and  a.d BETWEEN '2016-07-11' AND '2016-07-11' where a.d BETWEEN '2016-07-11' AND '2016-07-11' LIMIT 10;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref             | rows | filtered | Extra                    |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
|  1 | SIMPLE      | a     | p160711    | ALL    | b_id          | NULL    | NULL    | NULL            |    1 |   100.00 | Using where              |
|  1 | SIMPLE      | b     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.b_id,const |    1 |   100.00 | Using where; Using index |
|  1 | SIMPLE      | c     | p160711    | eq_ref | PRIMARY       | PRIMARY | 11      | p1.a.c_id,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+--------------------------+
3 rows in set, 1 warning (0.00 sec)

mysql 5.7 > SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+---------------------------------------+
| Variable_name           | Value                                 |
+-------------------------+---------------------------------------+
| innodb_version          | 5.7.17                                |
| protocol_version        | 10                                    |
| slave_type_conversions  |                                       |
| tls_version             | TLSv1,TLSv1.1                         |
| version                 | 5.7.17                                |
| version_comment         | Source distribution PULL: 2016-SEP-27 |
| version_compile_machine | x86_64                                |
| version_compile_os      | Win64                                 |
+-------------------------+---------------------------------------+
8 rows in set (0.08 sec)
[7 Sep 2017 15:45] Arnaud Adant
Any comment from the optimizer team ? This looks like a serious bug when using partitioning.