Description:
While small ranges have very accurate, larger ranges are usually of by a factor of 2. Using an DBT-3 database:
mysql> select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-17';
+----------+
| count(*) |
+----------+
| 6440 |
+----------+
1 row in set (0.01 sec)
mysql> explain select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-17';
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+--------------------------+
| 1 | SIMPLE | orders | NULL | ref | i_o_orderdate | i_o_orderdate | 4 | const | 6440 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+------+---------------+---------------+---------+-------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-18';
+----------+
| count(*) |
+----------+
| 12677 |
+----------+
1 row in set (0.00 sec)
mysql> explain select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-18';
+----+-------------+--------+------------+-------+---------------+---------------+---------+------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------------+---------+------+-------+----------+--------------------------+
| 1 | SIMPLE | orders | NULL | range | i_o_orderdate | i_o_orderdate | 4 | NULL | 22946 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+---------------+---------------+---------+------+-------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
We see that while the estimate for 1 day is exact, the estimate for 2 days are almost twice as big as the reality.
How to repeat:
select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-17';
explain select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-17';
select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-18';
explain select count(*) from orders where o_orderdate between '1994-06-17' and '1994-06-18';
Suggested fix:
In btr_estimate_n_rows_in_range_low(), estimate is multiplied by 2 if range is large. Stop that! (Or find a smaller more appropriate factor to use)