Bug #8558 Poor query plan with merge tables
Submitted: 16 Feb 2005 21:51 Modified: 16 Feb 2005 23:32
Reporter: Gregert Johnson Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S3 (Non-critical)
Version:4.0.18-standard OS:Linux (Linux 2.4.22 (Mandrake 9))
Assigned to: Sergei Golubchik CPU Architecture:Any

[16 Feb 2005 21:51] Gregert Johnson
Description:
We have found it to be necessary to provide hints to the query optimizer in order to reduce the execution times of joins that involve merge tables.  The difference in execution times is by orders of magnitude, typically from 10 minutes down to 1 or 2 seconds.  Our speculation is that the fact that the join in question involves merge tables may be an issue for the optimizer.

Specifics:

Table definitions:

CREATE TABLE `call` (
  `call_key` bigint(20) unsigned NOT NULL default '0',
  `alternate_db_key` int(10) unsigned NOT NULL default '0',
  `start_time` int(11) NOT NULL default '-1',
  `start_time_ext` int(11) NOT NULL default '-1',
  `time_key` int(11) NOT NULL default '-1',
  `end_time` int(11) NOT NULL default '-1',
  `end_time_ext` int(11) NOT NULL default '-1',
  .
  .
  .
  PRIMARY KEY  (`call_key`),
  KEY `ak1` (`alternate_db_key`),
  KEY `ie1` (`start_time`,`start_time_ext`),
  KEY `ie2` (`end_time`,`end_time_ext`)
) TYPE=MyISAM

CREATE TABLE `cdr_media` (
  `call_key` bigint(20) unsigned NOT NULL default '0',
  `seq` smallint(5) unsigned NOT NULL default '0',
  `probe_id` int(10) unsigned NOT NULL default '0',
  `start_time` int(11) NOT NULL default '-1',
  `start_time_ext` int(11) NOT NULL default '-1',
  `time_key` int(11) NOT NULL default '-1',
  `end_time` int(11) NOT NULL default '-1',
  `end_time_ext` int(11) NOT NULL default '-1',
  .
  .
  .
  `packets_received` int(10) unsigned NOT NULL default '0',
  .
  .
  .
  PRIMARY KEY  (`call_key`,`seq`),
  KEY `ie2` (`probe_id`),
  KEY `ie3` (`start_time`,`start_time_ext`)
) TYPE=MRG_MyISAM INSERT_METHOD=LAST UNION=(cdr_media_p9,cdr_media_p10,cdr_media_p11,cdr_media_p12,cdr_media_p13,cdr_media_p14,
cdr_media_p15,cdr_media_p16,cdr_media_p17,cdr_media_p18,cdr_media_p19,cdr_media_p20,cdr_media_p21,
cdr_media_p22,cdr_media_p23,cdr_media_p24,cdr_media_p25,cdr_media_p26,cdr_media_p27,cdr_media_p28)

Table sizes:

mysql> SELECT COUNT(*) FROM call;
+----------+
| COUNT(*) |
+----------+
|  6478115 |
+----------+

mysql> SELECT COUNT(*) FROM cdr_media;
+----------+
| COUNT(*) |
+----------+
|  7849908 |
+----------+

Query:

SELECT MIN(cdr.packets_received), MAX(cdr_packets_received)
FROM call c, cdr_media cdr
WHERE c.end_time BETWEEN 1108573200 AND 1108576799
  AND cdr.call_key = c.call_key;

The ANALYSIS procedure had been run on all tables involved in the join (i.e on 'call', 'cdr_media_p9', ... 'cdr_media_p28').

Query plan produced by optimizer:

+-------+--------+---------------+---------+---------+--------------+---------+-------------+
| table | type   | possible_keys | key     | key_len | ref          | rows    | Extra       |
+-------+--------+---------------+---------+---------+--------------+---------+-------------+
| cdr   | ALL    | PRIMARY       | NULL    |    NULL | NULL         | 7848994 |             |
| c     | eq_ref | PRIMARY,ie2   | PRIMARY |       8 | cdr.call_key |       1 | Using where |
+-------+--------+---------------+---------+---------+--------------+---------+-------------+

Modified query:

SELECT MIN(cdr.packets_received), MAX(cdr_packets_received)
FROM call c USE INDEX(ie2), cdr_media cdr
WHERE c.end_time BETWEEN 1108573200 AND 1108576799
  AND cdr.call_key = c.call_key;

Modified plan:

+-------+-------+---------------+---------+---------+------------+-------+-------------+
| table | type  | possible_keys | key     | key_len | ref        | rows  | Extra       |
+-------+-------+---------------+---------+---------+------------+-------+-------------+
| c     | range | ie2           | ie2     |       4 | NULL       | 10050 | Using where |
| cdr   | ref   | PRIMARY       | PRIMARY |       8 | c.call_key | 78492 |             |
+-------+-------+---------------+---------+---------+------------+-------+-------------+

So, the question is: why did the optimizer elect to do a complete table scan of table cdr (all 7.8 million rows), combined with a random access of the call table (c) for each of those rows?  The hint to use the index call.ie2 reversed the order of table access, with 10 thousand c table rows driving random accesses into the cdr table.

Could it be that the optimizer judged that because the cdr table contained 20 merged tables in its union, each random access into it (as in the modified plan) would require searching 20 tables, and would thus be prohibitive?  If so, it was evidently spectacularly wrong in its judgement!

How to repeat:
Running the query with the large data sets we have gives the reported results consistently.  If needed, test data can be provided.

Suggested fix:
None.
[16 Feb 2005 23:32] Sergei Golubchik
Thank you for taking the time to report a problem.  Unfortunately
you are not using a current version of the product your reported a
problem with -- the problem might already be fixed. Please download
a new version from http://www.mysql.com/downloads/

If you are able to reproduce the bug with one of the latest versions,
please change the version on this bug report to the version you
tested and change the status back to "Open".  Again, thank you for
your continued support of MySQL.

Additional info:

This optimization was implemented in 4.1.
According to http://dev.mysql.com/doc/mysql/en/news-4-1-0.html

* Added record_in_range() method to MERGE tables to be able
to choose the right index when there are many to choose from.