Bug #50025 pruning error when partition by month using to_days()
Submitted: 2 Jan 2010 12:25 Modified: 8 Jan 2010 23:26
Reporter: h h Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Partitions Severity:S2 (Serious)
Version:5.1.42, 5.1.43-bzr OS:Any (windows, Mac OS X)
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: partition, pruning

[2 Jan 2010 12:25] h h
Description:
i create partition table by month as artical "Partitioning with Dates in MySQL 5.1" indicated.
create table employ (
  `Id` varchar(50) NOT NULL,
  `Name` varchar(50) DEFAULT NULL,
   thedate date not null,
   `mId` varchar(50) NOT NULL,   
   KEY `thedate` (`Thedate`),
   KEY `date_mid` (`Thedate`, mid)
)ENGINE=InnoDB DEFAULT CHARSET=gb2312
PARTITION BY RANGE(to_days(thedate))
(
PARTITION P1 VALUES LESS THAN (to_days('2001-02-01')),
PARTITION P2 VALUES LESS THAN (to_days('2001-03-01')),
PARTITION P3 VALUES LESS THAN (to_days('2001-04-01')),
PARTITION p4 VALUES LESS THAN MAXVALUE
);

i execute explain statement as follows:
explain partitions select * from employ where thedate >= '2001-2-2' and thedate <= '2001-2-3'

the result is that
...
partitions: P1,P2
...

but i think the query should only search partition P2, not both P1 and P2.

How to repeat:
just create the table above and execute explain sql:
explain partitions select * from employ where thedate >= '2001-2-2' and thedate <= '2001-2-3'

Suggested fix:
I found the year(...) is work well when pruning, but the to_days(..) has the bug to search more partitions than should. please fix it. because the partition by month is prefered when data volume is very huge. I want to partition by month, but fear the performance of query pruning bug.
[2 Jan 2010 12:28] Valeriy Kravchuk
Thank you for the problem report. What exact version, 5.1.x, do you use? 5.1.42 is the latest.
[2 Jan 2010 12:55] h h
i used the 5.1.42, today i download from mysql website. thanks
[2 Jan 2010 13:12] Valeriy Kravchuk
Verified just as described with recent 5.1.43 on Mac OS X:

77-52-7-73:5.1 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.43-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table employ (
    ->   `Id` varchar(50) NOT NULL,
    ->   `Name` varchar(50) DEFAULT NULL,
    ->    thedate date not null,
    ->    `mId` varchar(50) NOT NULL,   
    ->    KEY `thedate` (`Thedate`),
    ->    KEY `date_mid` (`Thedate`, mid)
    -> )ENGINE=InnoDB DEFAULT CHARSET=gb2312
    -> PARTITION BY RANGE(to_days(thedate))
    -> (
    -> PARTITION P1 VALUES LESS THAN (to_days('2001-02-01')),
    -> PARTITION P2 VALUES LESS THAN (to_days('2001-03-01')),
    -> PARTITION P3 VALUES LESS THAN (to_days('2001-04-01')),
    -> PARTITION p4 VALUES LESS THAN MAXVALUE
    -> );
Query OK, 0 rows affected (0.44 sec)

mysql> insert into employ values(1, 'a', '2001-02-01', 'b');
Query OK, 1 row affected (0.01 sec)

mysql> insert into employ values(2, 'aa', '2001-03-01', 'bb');
Query OK, 1 row affected (0.00 sec)

mysql> insert into employ values(0, 'aaaa', '2001-01-01', 'bbbb');
Query OK, 1 row affected (0.00 sec)

mysql> explain partitions select * from employ where thedate >= '2001-2-2' and thedate <=
    -> '2001-2-3'
    -> ;
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
| id | select_type | table  | partitions | type  | possible_keys    | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | employ | P1,P2      | range | thedate,date_mid | thedate | 3       | NULL |    2 | Using where |
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
1 row in set (0.02 sec)

mysql> explain partitions select * from employ where thedate >= '2001-02-02' and thedate <= '2001-02-03';
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
| id | select_type | table  | partitions | type  | possible_keys    | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | employ | P1,P2      | range | thedate,date_mid | thedate | 3       | NULL |    2 | Using where |
+----+-------------+--------+------------+-------+------------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

Maybe related to bug #49742.
[8 Jan 2010 23:26] Mattias Jonsson
This is not a bug, see bug#20577.

Since TO_DAYS() evaluates to NULL for invalid dates like '2001-02-31' and NULLs for the partitioning expression is stored in the first partition, it must search the first partition too. I would recommend to create a special NULL partition first to catch these, eg 'partition pNULL values less than (0)', it would also increase performance since it would not need to search through valid but not matching dates.