Description:
There was a catch for the null partition in partition pruning,
in case that the partition function "F(part_col)" produced null.
However, in most cases except that COND_VALUE in
"part_col cmp COND_VALUE" is a BAD one, the null partition would be
never matching.
If the null partition is non-matching and bigger than any really
matching one, rows estimation, namely ha_partition::records_in_range(),
would give the false illusion that the table is empty, which leads to
suboptimial plans.
How to repeat:
CREATE TABLE t1 (
id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
memo varchar(1024) NOT NULL DEFAULT '',
sid char(32) NOT NULL DEFAULT '',
create_time datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id,create_time),
KEY idx_create_time (create_time),
KEY idx_sid_create_time (sid, create_time)
)
PARTITION BY RANGE (to_days(`create_time`))
(PARTITION p202011 VALUES LESS THAN (738125) ENGINE = InnoDB,
PARTITION p202012 VALUES LESS THAN (738156) ENGINE = InnoDB,
PARTITION p202101 VALUES LESS THAN (738187) ENGINE = InnoDB,
PARTITION p202102 VALUES LESS THAN (738215) ENGINE = InnoDB,
PARTITION p202103 VALUES LESS THAN (738246) ENGINE = InnoDB,
PARTITION p202104 VALUES LESS THAN (738276) ENGINE = InnoDB,
PARTITION p202105 VALUES LESS THAN (738307) ENGINE = InnoDB);
INSERT INTO t1 VALUES
(1, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(2, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(3, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(4, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(5, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(6, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(7, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(8, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(9, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(10, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(11, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(12, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(13, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(14, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(15, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(16, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(17, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(18, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(19, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(20, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(21, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(22, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(23, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(24, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(25, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(26, "abcabcabc0000000000", "123456789012345678", "2020-11-01 00:00:00"),
(100, "abcabcabc4444444444", "123456789012345678", "2021-03-07 22:46:20"),
(101, "abcabcabc3333333333", "123456789012345678", "2021-03-07 23:14:59"),
(102, "abcabcabc2222222222", "123456789012345678", "2021-03-08 00:23:10"),
(103, "abcabcabc1111111111", "123456789012345678", "2021-03-08 10:08:01");
ANALYZE TABLE t1;
mysql> explain partitions select * from t1 where sid = '123456789012345678' and create_time >= '2021-03-01 00:00:07.861';
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
| 1 | SIMPLE | t1 | p202011,p202103,p202104,p202105 | range | idx_create_time,idx_sid_create_time | idx_create_time | 5 | NULL | 1 | Using where |
+----+-------------+-------+---------------------------------+-------+-------------------------------------+-----------------+---------+------+------+-------------+
1 row in set (0.00 sec)
Suggested fix:
In general, the catch for the null partition could be avoided at the
end of partition iteration, if COND_VALUE is a GOOD one.
However, since part_expr is shared (part of TABLE_SHARE), to do the
null test, a deep copy should be obtained and replaced with COND_VALUE,
which is non-trivial work in commnity code.
Anyway, inclusion of non-matching null partition could be compensated
with a wasted index dive.