Description:
Consider having the table:
CREATE TABLE `prices` (
`id_product` int(11) NOT NULL default '0',
`id_shop` mediumint(9) NOT NULL default '0',
`seton` date NOT NULL default '0000-00-00',
`price` decimal(7,2) NOT NULL default '0.00',
PRIMARY KEY (`id_product`,`id_shop`,`seton`),
KEY `id_product` (`id_product`),
KEY `id_shop` (`id_shop`),
KEY `id_shop_2` (`id_shop`, `seton`),
KEY `id_product_2` (`id_product`,`price`),
KEY `seton` (`seton`),
KEY `price` (`price`)
) ENGINE=InnoDB
(let it be a price list for a set of shops offering (optionally) intersecting set of products; every price change for every product is stored, zero price means a shop stops selling a product. I need to query a minimal price for a specific product (120) for a date month ago).
The query is:
SELECT MIN(p.price) FROM prices p WHERE (p.id_shop, p.seton) IN (SELECT pg.id_shop, MAX(pg.seton) FROM prices pg WHERE pg.id_product = 120 AND pg.seton <= DATE_SUB(CURDATE(), INTERVAL 30 DAY) AND pg.price <> 0 GROUP BY pg.id_shop);
This query seems to run too long though both parts are fast enough.
if you run EXPLAIN
+----+--------------------+-------+-------+---------------------------------------------+---------+---------+-------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------------------------------------+---------+---------+-------+-------+--------------------------+
| 1 | PRIMARY | p | index | NULL | price | 4 | NULL | 56162 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | pg | ref | PRIMARY,id_product,id_product_2,seton,price | PRIMARY | 4 | const | 1 | Using where |
+----+--------------------+-------+-------+---------------------------------------------+---------+---------+-------+-------+--------------------------+
you'll see 3 strange things:
1) inner query is marked as "DEPENDENT SUBQUERY" though it's not
2) no key is used to optimize WHERE in outer query at all
3) inner query key choise seems to be always PRIMARY regardless of number of price records for each product (when adding price and seton keys is wise)
The first optimizer assumtion increases execution times dramatically as inner query is ran for every row of the table (i have ~60,000) despite the fact inner query resultset is of 10 rows and
the outer query can simply match the rows via aprop indexes
Trying FORCE INDEX and rewriting it using JOINs gives no performance boost (EXPLAIN shows no difference in choises)
How to repeat:
CREATE TABLE `prices` (
`id_product` int(11) NOT NULL default '0',
`id_shop` mediumint(9) NOT NULL default '0',
`seton` date NOT NULL default '0000-00-00',
`price` decimal(7,2) NOT NULL default '0.00',
PRIMARY KEY (`id_product`,`id_shop`,`seton`),
KEY `id_product` (`id_product`),
KEY `id_shop` (`id_shop`),
KEY `id_shop_2` (`id_shop`, `seton`),
KEY `id_product_2` (`id_product`,`price`),
KEY `seton` (`seton`),
KEY `price` (`price`)
) ENGINE=InnoDB;
EXPLAIN SELECT MIN(p.price) FROM prices p WHERE (p.id_shop, p.seton) IN (SELECT pg.id_shop, MAX(pg.seton) FROM prices pg WHERE pg.id_product = 120 AND pg.seton <= DATE_SUB(CURDATE(), INTERVAL 30 DAY) AND pg.price <> 0 GROUP BY pg.id_shop);
SELECT MIN(p.price) FROM prices p WHERE (p.id_shop, p.seton) IN (SELECT pg.id_shop, MAX(pg.seton) FROM prices pg WHERE pg.id_product = 120 AND pg.seton <= DATE_SUB(CURDATE(), INTERVAL 30 DAY) AND pg.price <> 0 GROUP BY pg.id_shop);
^C
Suggested fix:
none so far