Description:
There was a catch for the null partition in partition pruning,
in case that the partition function "F(part_col)" produced null.
However, in most cases except that COND_VALUE in
"part_col cmp COND_VALUE" is a BAD one, the null partition would be
never matching.
If the null partition is non-matching and bigger than any really
matching one, rows estimation, namely ha_partition::records_in_range(),
would give the false illusion that the table is empty, which leads to
suboptimial plans.
How to repeat:
CREATE TABLE t1 (
id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
memo varchar(1024) NOT NULL DEFAULT '',
sid char(32) NOT NULL DEFAULT '',
create_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id,create_time),
KEY idx_create_time (create_time),
KEY idx_sid_create_time (sid, create_time)
)
PARTITION BY RANGE (to_days(`create_time`))
(PARTITION p202011 VALUES LESS THAN (738125) ENGINE = InnoDB,
PARTITION p202012 VALUES LESS THAN (738156) ENGINE = InnoDB,
PARTITION p202101 VALUES LESS THAN (738187) ENGINE = InnoDB,
PARTITION p202102 VALUES LESS THAN (738215) ENGINE = InnoDB,
PARTITION p202103 VALUES LESS THAN (738246) ENGINE = InnoDB,
PARTITION p202104 VALUES LESS THAN (738276) ENGINE = InnoDB,
PARTITION p202105 VALUES LESS THAN (738307) ENGINE = InnoDB);
INSERT INTO t1 VALUES
(1, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(2, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(3, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(4, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(5, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(6, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(7, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(8, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(9, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(10, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(11, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(12, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(13, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(14, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(15, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(16, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(17, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(18, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(19, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(20, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(21, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(22, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(23, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(24, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(25, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(26, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(100, "abcabcabc4444444444", "123456789012345678", "2021-03-07 22:46:20"),
(101, "abcabcabc3333333333", "123456789012345678", "2021-03-07 23:14:59"),
(102, "abcabcabc2222222222", "123456789012345678", "2021-03-08 00:23:10"),
(103, "abcabcabc1111111111", "123456789012345678", "2021-03-08 10:08:01");
ANALYZE TABLE t1;
mysql> explain partitions select * from t1 where sid = '123456789012345678' and create_time >= '2021-03-01 00:00:07.861';
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
| 1 | SIMPLE | t1 | p202011,p202103,p202104,p202105 | range | idx_create_time,idx_sid_create_time | idx_create_time | 5 | NULL | 1 | Using where |
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
1 row in set (0.00 sec)
Suggested fix:
In general, the catch for the null partition could be avoided at the
end of partition iteration, if COND_VALUE is a GOOD one.
However, since part_expr is shared (part of TABLE_SHARE), to do the
null test, a deep copy should be obtained and replaced with COND_VALUE,
which is non-trivial work in commnity code.
Anyway, inclusion of non-matching null partition could be compensated
with a wasted index dive.
Description: There was a catch for the null partition in partition pruning, in case that the partition function "F(part_col)" produced null. However, in most cases except that COND_VALUE in "part_col cmp COND_VALUE" is a BAD one, the null partition would be never matching. If the null partition is non-matching and bigger than any really matching one, rows estimation, namely ha_partition::records_in_range(), would give the false illusion that the table is empty, which leads to suboptimial plans. How to repeat: CREATE TABLE t1 ( id bigint(20) unsigned NOT NULL AUTO_INCREMENT, memo varchar(1024) NOT NULL DEFAULT '', sid char(32) NOT NULL DEFAULT '', create_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (id,create_time), KEY idx_create_time (create_time), KEY idx_sid_create_time (sid, create_time) ) PARTITION BY RANGE (to_days(`create_time`)) (PARTITION p202011 VALUES LESS THAN (738125) ENGINE = InnoDB, PARTITION p202012 VALUES LESS THAN (738156) ENGINE = InnoDB, PARTITION p202101 VALUES LESS THAN (738187) ENGINE = InnoDB, PARTITION p202102 VALUES LESS THAN (738215) ENGINE = InnoDB, PARTITION p202103 VALUES LESS THAN (738246) ENGINE = InnoDB, PARTITION p202104 VALUES LESS THAN (738276) ENGINE = InnoDB, PARTITION p202105 VALUES LESS THAN (738307) ENGINE = InnoDB); INSERT INTO t1 VALUES (1, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (2, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (3, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (4, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (5, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (6, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (7, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (8, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (9, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (10, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (11, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (12, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (13, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (14, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (15, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (16, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (17, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (18, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (19, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (20, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (21, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (22, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (23, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (24, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (25, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (26, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"), (100, "abcabcabc4444444444", "123456789012345678", "2021-03-07 22:46:20"), (101, "abcabcabc3333333333", "123456789012345678", "2021-03-07 23:14:59"), (102, "abcabcabc2222222222", "123456789012345678", "2021-03-08 00:23:10"), (103, "abcabcabc1111111111", "123456789012345678", "2021-03-08 10:08:01"); ANALYZE TABLE t1; mysql> explain partitions select * from t1 where sid = '123456789012345678' and create_time >= '2021-03-01 00:00:07.861'; +----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+ | 1 | SIMPLE | t1 | p202011,p202103,p202104,p202105 | range | idx_create_time,idx_sid_create_time | idx_create_time | 5 | NULL | 1 | Using where | +----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+ 1 row in set (0.00 sec) Suggested fix: In general, the catch for the null partition could be avoided at the end of partition iteration, if COND_VALUE is a GOOD one. However, since part_expr is shared (part of TABLE_SHARE), to do the null test, a deep copy should be obtained and replaced with COND_VALUE, which is non-trivial work in commnity code. Anyway, inclusion of non-matching null partition could be compensated with a wasted index dive.