Description:
By default, the optimizer doesn't use derived or materialized to make subqueries perform better.
How to repeat:
1. Create the Test Table
CREATE TABLE `t1` (
`id` INT NOT NULL,
`name` VARCHAR(100),
`category` CHAR(15),
`t1_timestamp` DATETIME DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ix_t1` (`name`, `category`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
2- Insert some amount of rows
3-
mysql [localhost:8040] {msandbox} (test) > explain select * from t1 main where id in (select max(id) from t1 s where name = 'node' group by category);
+----+-------------+-------+------------+------+---------------+-------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+--------+----------+-------------+
| 1 | PRIMARY | main | NULL | ALL | NULL | NULL | NULL | NULL | 114184 | 100.00 | Using where |
| 2 | SUBQUERY | s | NULL | ref | ix_t1 | ix_t1 | 403 | const | 1 | 100.00 | Using index |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+--------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
4- If we use JOIN instead of subquery, it works well
mysql [localhost:8040] {msandbox} (test) > explain SELECT main.* FROM t1 main JOIN (SELECT MAX(id) AS max_id FROM t1 WHERE name = 'node' GROUP BY category) subquery ON main.id = subquery.max_id;
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where |
| 1 | PRIMARY | main | NULL | eq_ref | PRIMARY | PRIMARY | 4 | subquery.max_id | 1 | 100.00 | NULL |
| 2 | DERIVED | t1 | NULL | ref | ix_t1 | ix_t1 | 403 | const | 1 | 100.00 | Using index |
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
5- If we enable subquery_to_derived, it uses derived
mysql [localhost:8040] {msandbox} (test) > SET @@optimizer_switch='subquery_to_derived=on';
Query OK, 0 rows affected (0.00 sec)
mysql [localhost:8040] {msandbox} (test) > explain select * from t1 main where id in (select max(id) from t1 s where name = 'node' group by category);
+----+-------------+------------+------------+--------+---------------+---------+---------+------------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+---------------+---------+---------+------------------------+------+----------+------------------------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | Using where |
| 1 | PRIMARY | main | NULL | eq_ref | PRIMARY | PRIMARY | 4 | derived_1_2.Name_exp_1 | 1 | 100.00 | NULL |
| 2 | DERIVED | s | NULL | ref | ix_t1 | ix_t1 | 403 | const | 1 | 100.00 | Using index; Using temporary |
+----+-------------+------------+------------+--------+---------------+---------+---------+------------------------+------+----------+------------------------------+
3 rows in set, 1 warning (0.00 sec)
mysql [localhost:8040] {msandbox} (test) > show warnings;
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `test`.`main`.`id` AS `id`,`test`.`main`.`name` AS `name`,`test`.`main`.`category` AS `category`,`test`.`main`.`t1_timestamp` AS `t1_timestamp` from `test`.`t1` `main` join (/* select#2 */ select distinct max(`test`.`s`.`id`) AS `Name_exp_1` from `test`.`t1` `s` where (`test`.`s`.`name` = 'node') group by `test`.`s`.`category`) `derived_1_2` where (`test`.`main`.`id` = `derived_1_2`.`Name_exp_1`) |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
Suggested fix:
MariaDB directly use maretialized to optimize the query, that might help:
mysql [localhost:11402] {msandbox} (test) > explain select * from t1 main where id in (select max(id) from t1 s where name = 'node' group by category);
+------+--------------+-------------+--------+-----------------------+-----------------------+---------+---------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+-------------+--------+-----------------------+-----------------------+---------+---------------------+------+--------------------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | main | eq_ref | PRIMARY | PRIMARY | 4 | <subquery2>.max(id) | 1 | |
| 2 | MATERIALIZED | s | ref | ix_t1 | ix_t1 | 403 | const | 1 | Using where; Using index |
+------+--------------+-------------+--------+-----------------------+-----------------------+---------+---------------------+------+--------------------------+
3 rows in set (0.001 sec)