Bug #26630 Partition pruning not working against joins
Submitted: 26 Feb 2007 11:21 Modified: 16 Oct 2008 17:59
Reporter: Lim tienaik Email Updates:
Status: Verified
Category:Server: Optimizer Severity:S5 (Performance)
Version:5.1.18-BK, 5.1.14-beta OS:Linux (linux)
Assigned to: Target Version:
Tags: Partition pruning
Triage: Triaged: D3 (Medium)

[26 Feb 2007 11:21] Lim tienaik
Description:
All partitions are use although its not required, the optimizer should choose the
partition that contains the data instead of everything.

How to repeat:
create table tbl_test
(
num int,
val int
)
partition by range(num)
(
partition p0 values less than (5),
partition p1 values less than (10),
partition p2 values less than (15),
partition p3 values less than (20)
);
create table tbl_co
(
num int,
reg int
);
insert into tbl_test values
(1,12),
(2,11),
(3,123),
(4,342),
(5,23412),
(6,343),
(7,65),
(8,67),
(9,87),
(10,132),
(11,1234),
(12,1245),
(13,12468),
(14,12342),
(15,15),
(16,1),
(17,1223);
insert into tbl_co values
(1,2),
(2,3),
(3,5),
(4,2),
(5,6),
(6,9),
(7,6),
(8,4),
(9,7),
(10,7),
(11,10),
(12,2),
(13,1),
(14,1),
(15,8),
(16,8),
(17,1);
explain partitions
select * from tbl_test  t
inner join tbl_co c on t.num = c.num
where reg = 8;

Suggested fix:
add check for partition pruning on joins
[7 Mar 2007 18:26] Valeriy Kravchuk
Thank you for a problem report. I was able to repeat the behaviour described with latest
5.1.17-BK on Linux:

openxs@suse:~/dbs/5.1> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.17-beta Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table tbl_test
    -> (
    -> num int,
    -> val int
    -> )
    -> partition by range(num)
    -> (
    -> partition p0 values less than (5),
    -> partition p1 values less than (10),
    -> partition p2 values less than (15),
    -> partition p3 values less than (20)
    -> );
Query OK, 0 rows affected (0.02 sec)

mysql> create table tbl_co
    -> (
    -> num int,
    -> reg int
    -> );
Query OK, 0 rows affected (0.03 sec)

mysql> insert into tbl_test values
    -> (1,12),
    -> (2,11),
    -> (3,123),
    -> (4,342),
    -> (5,23412),
    -> (6,343),
    -> (7,65),
    -> (8,67),
    -> (9,87),
    -> (10,132),
    -> (11,1234),
    -> (12,1245),
    -> (13,12468),
    -> (14,12342),
    -> (15,15),
    -> (16,1),
    -> (17,1223);
Query OK, 17 rows affected (0.01 sec)
Records: 17  Duplicates: 0  Warnings: 0

mysql> insert into tbl_co values
    -> (1,2),
    -> (2,3),
    -> (3,5),
    -> (4,2),
    -> (5,6),
    -> (6,9),
    -> (7,6),
    -> (8,4),
    -> (9,7),
    -> (10,7),
    -> (11,10),
    -> (12,2),
    -> (13,1),
    -> (14,1),
    -> (15,8),
    -> (16,8),
    -> (17,1);
Query OK, 17 rows affected (0.01 sec)
Records: 17  Duplicates: 0  Warnings: 0

mysql> explain partitions
    -> select * from tbl_test  t
    -> inner join tbl_co c on t.num = c.num
    -> where reg = 8;
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
| id | select_type | table | partitions  | type | possible_keys | key  | key_len
 | ref  | rows | Extra       |
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
|  1 | SIMPLE      | t     | p0,p1,p2,p3 | ALL  | NULL          | NULL | NULL
 | NULL |   17 |             |
|  1 | SIMPLE      | c     | NULL        | ALL  | NULL          | NULL | NULL
 | NULL |   17 | Using where |
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
2 rows in set (0.00 sec)

As tbl_test is selected as outer (and why not, with 17 rows in both tables?), partitions
prunning is not needed: we just select all rows from partitioned table, then select
corresponding one from tbl_co and test condition on it.

Theoreticall prunning is possible if we select rows with reg=8 from tbl_co, and then, for
each row we might be able to access one and only one partition. Let's try:

mysql> explain partitions select * from tbl_co c straight_join tbl_test t on t.
num = c.num where reg = 8;
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
| id | select_type | table | partitions  | type | possible_keys | key  | key_len
 | ref  | rows | Extra       |
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
|  1 | SIMPLE      | c     | NULL        | ALL  | NULL          | NULL | NULL
 | NULL |   17 | Using where |
|  1 | SIMPLE      | t     | p0,p1,p2,p3 | ALL  | NULL          | NULL | NULL
 | NULL |   17 | Using where |
+----+-------------+-------+-------------+------+---------------+------+--------
-+------+------+-------------+
2 rows in set (0.00 sec)

I suspect pruning was used, but how one can show that in EXPLAIN? For each row from
tbl_co you will need some partition, and you do not know what exact one!

If we do know, then we can see the pruning:

mysql> explain partitions select * from tbl_co c straight_join tbl_test t on t.
num = c.num where reg = 8 and t.num < 5;
+----+-------------+-------+------------+------+---------------+------+---------
+------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len
| ref  | rows | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------
+------+------+-------------+
|  1 | SIMPLE      | c     | NULL       | ALL  | NULL          | NULL | NULL
| NULL |   17 | Using where |
|  1 | SIMPLE      | t     | p0         | ALL  | NULL          | NULL | NULL
| NULL |    4 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------
+------+------+-------------+
2 rows in set (0.00 sec)

So, I think, it is not a bug (I mean, you can not tell that this is a bug based on
EXPLAIN PARTITIONS results). If you can prove that no pruning is used to find individual
rows from tbl_test in my example with straight_join then I agree, it is something to
consider.

Your comments are welcomed.
[7 Apr 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[13 Apr 2007 13:49] Valeriy Kravchuk
I tried to check again, with latest 5.1.18-BK, and based on the following results
(continue my previous tests):

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from tbl_co c straight_join tbl_test t on t.num = c.num where r
eg = 8;
+------+------+------+------+
| num  | reg  | num  | val  |
+------+------+------+------+
|   15 |    8 |   15 |   15 |
|   16 |    8 |   16 |    1 |
+------+------+------+------+
2 rows in set (0.01 sec)

mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 0     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 39    |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
15 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from tbl_co c straight_join tbl_test t on t.num = c.num where r
eg = 8 and t.num < 5;
Empty set (0.00 sec)

mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 0     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 18    |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
15 rows in set (0.00 sec)

I can see a clear difference in Handler_read_rnd_next value (18 when partition prunning
was surely used vs. 39 when it was NOT likely used in joining individual rows from outer
table with partitioned one). So, I think, there is a problem to consider.
[2 May 2007 6:25] Igor Babaev
This is a serious problem, not easy to resolve.
Marked as to be fixed later (after 5.2)
[16 Dec 2008 19:46] Omer BarNir
triage: forgot to set e/r values with last comment - updating now
[3 Jan 2009 19:59] Sergey Petrunya
First note: If you re-run the EXPLAINs from the example on a recent 5.1 you'll see that
"Using join buffer" is used for the second table. It was there before, too, it was just
not shown in EXPLAIN. This means that the query plan actually is 

1. Read records from the first table into join buffer until it is full;
2. Read records from the second table and join them with what is in the buffer. 

That is, tables are read almost independently and we hardly can prune anything. The only
option is  to do this:

1. Read records from the first table into join buffer until it is full;
2A. Walk through the buffer and see which partitions we'll need to scan in the second
table
2. Scan the needed partitions in the second table and join them with what is in the
buffer. 

... which could be not worthwhile, and also I think there are more urgently needed things
in partitioning.
[3 Jan 2009 20:15] Sergey Petrunya
Please note that for ref-type access it will actually work (can be checked with counters
or tracing), although EXPLAIN will look like it doesn't:

mysql> explain partitions select * from tbl_co c straight_join tbl_test t on t. num =
c.num where reg = 8\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 17
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: p0,p1,p2,p3
         type: ref
possible_keys: num
          key: num
      key_len: 5
          ref: sjm3.c.num
         rows: 2
        Extra: 
2 rows in set (0.01 sec)
[10 Apr 2009 14:56] shashi reddy
i am not sure if i understand this right but...

oracle seem to have a feature called partition-wise join which significantly improves the
performance of joins on partitioned tables(if both tables are partitioned by the same key
field)it performs the join  by spitting the join into smaller joins(partitions) thus
processing them in parallel.

i tried this in mysql by doing a join on partitioned tables(with equal number of
partitions) put i see no performance improvement...

thoughts?

shashi
[11 Apr 2009 12:17] Sergey Petrunya
Hi Shashi, 

Agree that scanning partitions in parallel could speed up certain queries, and joins in
particular. This is not supported by MySQL at the moment, though.  There is understanding
that this would be a nice feature to implement, but so far we don't commit to releasing it
in any particular version.
[30 Apr 2010 22:11] Andres Chaves
Hi,
Is this bug planned to be solved in any release?

As a datawarehouse developer the partitioning feature is useful but only if it can  work
with star schemas which need the join heavely.

Thanks,
[2 Aug 2010 12:03] Rajesh Sharma
Hi,

I am also facing this similar problem, is there any fixes for this issue in latest
version of mysql.

If yes then please let me know the mysql version.

Thanks,

Rajesh Sharma
[21 Feb 2011 15:59] Roberto Spadim
any news?