Description:
We are running into the following deterministic performance issue.
This query `WHERE host_id = 47 AND status = 1 AND start_date > "2020-01-15 ORDER BY host_id, status, start_date, id " LIMIT 3` is bounded in the sense that the work that MySQL does internally (as observed by Handler_read_next) is ~3.
However, changing the above query from `status = 1` to `status IS NULL` breaks the performance of this query, and it's no longer bounded.
How to repeat:
`example_reservation` table has a single composite index on (host_id, status, start_date)
Table | example_reservation
Create Table | CREATE TABLE `example_reservation` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`host_id` bigint(20) NOT NULL,
`status` bigint(20) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`nights` bigint(20) DEFAULT NULL,
`guest_count` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tri_index` (`host_id`,`status`,`start_date`)
) ENGINE=InnoDB AUTO_INCREMENT=1025 DEFAULT CHARSET=utf8
We have a bunch of entries with status IS NULL, and status = 1
mysql root@127.0.0.1:uds_qsplit>
SELECT count(1) FROM example_reservation
WHERE host_id = 47 AND status IS NULL AND start_date > "2020-11-15";
+----------+
| count(1) |
+----------+
| 24 |
+----------+
SELECT count(1) FROM example_reservation
WHERE host_id = 47 AND status = 1 AND start_date > "2020-11-15";
+----------+
| count(1) |
+----------+
| 269 |
+----------+
EXPECTED - the following query is efficient, i.e. MySQL issues ~3 Handler_read in order to return the LIMIT 3 rows
mysql root@127.0.0.1:uds_qsplit> FLUSH STATUS;
SELECT id, host_id, status, start_date
FROM example_reservation
FORCE INDEX (tri_index)
WHERE host_id = 47 AND status = 1 AND start_date > "2020-01-15"
ORDER BY host_id, status, start_date, id
LIMIT 3;
SHOW SESSION STATUS LIKE 'Handler_read%';
Query OK, 0 rows affected
Time: 0.002s
+-----+---------+--------+---------------------+
| id | host_id | status | start_date |
+-----+---------+--------+---------------------+
| 167 | 47 | 1 | 2020-11-25 20:20:12 |
| 221 | 47 | 1 | 2020-11-25 20:20:12 |
| 341 | 47 | 1 | 2020-11-25 20:20:12 |
+-----+---------+--------+---------------------+
3 rows in set
Time: 0.083s
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 2 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
+-----------------------+-------+
7 rows in set
Time: 0.071s
NOT EXPECTED - the same query is NOT EFFICIENT, if changing status = 1 to status IS NULL. Why is that?
mysql root@127.0.0.1:uds_qsplit> FLUSH STATUS;
SELECT id, host_id, status, start_date
FROM example_reservation FORCE INDEX (tri_index)
WHERE host_id = 47 AND status IS NULL AND start_date > "2020-11-15" ORDER BY host_id, status, start_date, id
LIMIT 3;
SHOW SESSION STATUS LIKE 'Handler_read%';
Query OK, 0 rows affected
Time: 0.002s
+------+---------+--------+---------------------+
| id | host_id | status | start_date |
+------+---------+--------+---------------------+
| 1021 | 47 | <null> | 2020-11-18 00:00:00 |
| 1020 | 47 | <null> | 2020-11-19 00:00:00 |
| 1001 | 47 | <null> | 2020-11-20 00:00:00 |
+------+---------+--------+---------------------+
3 rows in set
Time: 0.093s
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 24 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
+-----------------------+-------+
7 rows in set
Time: 0.067s
EXPECTED - removing status from the ORDER BY clause makes above query EFFICIENT. Why is that?
mysql root@127.0.0.1:uds_qsplit> FLUSH STATUS;
SELECT id, host_id, status, start_date
FROM example_reservation FORCE INDEX (tri_index)
WHERE host_id = 47 AND status IS NULL AND start_date > “2020-11-15"
ORDER BY host_id, start_date, id
LIMIT 3;
SHOW SESSION STATUS LIKE 'Handler_read%';
Query OK, 0 rows affected
Time: 0.002s
+------+---------+--------+---------------------+
| id | host_id | status | start_date |
+------+---------+--------+---------------------+
| 1021 | 47 | <null> | 2020-11-18 00:00:00 |
| 1020 | 47 | <null> | 2020-11-19 00:00:00 |
| 1001 | 47 | <null> | 2020-11-20 00:00:00 |
+------+---------+--------+---------------------+
3 rows in set
Time: 0.091s
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 2 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
+-----------------------+-------+
7 rows in set
Time: 0.069s