Bug #37902 Bug (2) in the query optimizer (explain)
Submitted: 6 Jul 2008 15:14 Modified: 27 Jul 2008 18:42
Reporter: Alejandro Cusumano Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.45, 5.0.68 OS:MacOS
Assigned to: CPU Architecture:Any

[6 Jul 2008 15:14] Alejandro Cusumano
Description:
A query in mysql takes a very long time to explain (~25/40 seconds), and less time to run (variable between 1 < t < 25 secs), while in PostgreSQL it takes less than a second for both operations.

Details follow.
Note: the data has been transferred to PostgreSQL with a commercial converter. There is
no 100% guarantee that the data is exactly the same, but it's very unlikely that a
difference in the data can cause such a difference in the query timings.

---------------------------------------------

The query is:

explain
select * from (
  select o.dbx_tx, count(*) AS null_cnt
  from pizza_orders o join pizza_line_items li on o.id = li.order_id 
  where li.type = 'ProductLineItem' and li.product_id is null
        and o.id >= 700000
  group by o.dbx_tx
) sn
left join (
  select o.dbx_tx, count(*) AS no_null_cnt
  from pizza_orders o join pizza_line_items li on o.id = li.order_id 
  where li.type = 'TicketLineItem' and exists (select * from pizza_line_items li2 where li2.order_id = li.order_id and type = 'ProductLineItem' and product_id is null)
        and o.id >= 700000
  group by o.dbx_tx
) snn
on sn.dbx_tx = snn.dbx_tx
order by sn.dbx_tx;

---------------------------------------------

The tables are:

CREATE TABLE `pizza_orders` (
  `id` int(11) NOT NULL,
  `dbx_tx` varchar(10) default NULL,
  PRIMARY KEY  (`id`),
  KEY `dbx_tx` (`dbx_tx`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `pizza_line_items` (
  `id` int(11) NOT NULL,
  `order_id` int(11) default NULL,
  `type` varchar(255) default NULL,
  `product_id` int(11) default NULL,
  PRIMARY KEY  (`id`),
  KEY `order_id` (`order_id`),
  KEY `product_id` (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

---------------------------------------------

The dataset has been sent via FTP.

---------------------------------------------

How to repeat:
Details given in the description.

Suggested fix:
Too unpolite to write in a public forum.
[6 Jul 2008 17:49] Valeriy Kravchuk
Thank you for a problem report. What exact version of MySQL server you had used? What is the the file name of a test case data for this bug on FTP server?
[6 Jul 2008 18:07] Alejandro Cusumano
The filename is in the format suggested by the instructions - 'bug-data-37902.zip'

Version updated in report metadata.
[27 Jul 2008 18:42] Valeriy Kravchuk
EXPLAIN is slow because MySQL has to "materialize" derived tables while executing EXPLAIN. This is a known problem. This also explain while subsequent execution is much faster - data are mostly still in the buffer(s).

I'd consider a (new) bug here the fact that after this SELECT (left derived table in LEFT JOIN):

mysql> explain select o.dbx_tx, count(*) AS null_cnt   from pizza_orders o join pizza_line_items li on o.id = li.order_id    where li.type = 'ProductLineItem' and li.product_id is null         and o.id >= 700000   group by o.dbx_tx
    -> ;
+----+-------------+-------+-------+---------------------+----------+---------+-----------+--------+----------------------------------------------+
| id | select_type | table | type  | possible_keys       | key      | key_len | ref       | rows   | Extra                                        |
+----+-------------+-------+-------+---------------------+----------+---------+-----------+--------+----------------------------------------------+
|  1 | SIMPLE      | o     | range | PRIMARY             | PRIMARY  | 4       | NULL      | 201413 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | li    | ref   | order_id,product_id | order_id | 5       | test.o.id |      1 | Using where                                  |
+----+-------------+-------+-------+---------------------+----------+---------+-----------+--------+----------------------------------------------+
2 rows in set (0.04 sec)

mysql> select o.dbx_tx, count(*) AS null_cnt   from pizza_orders o join pizza_line_items li on o.id = li.order_id    where li.type = 'ProductLineItem' and li.product_id is null         and o.id >= 700000   group by o.dbx_tx
    -> ;
Empty set (1 min 27.82 sec)
 
server does NOT stop executing right part of LEFT JOIN immediately. As as result, it executes a pointless SELECT with dependent subquery, where the overall query spends most of the time.