| Bug #33030 | MySQL 5.1 query optimizer regression | ||
|---|---|---|---|
| Submitted: | 6 Dec 2007 4:07 | Modified: | 25 Dec 2007 15:30 |
| Reporter: | Venu Anuganti | Email Updates: | |
| Status: | Can't repeat | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.1.16 and latest 5.1.22 | OS: | Any |
| Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
| Tags: | regression | ||
[6 Dec 2007 4:40]
Valeriy Kravchuk
Thank you for a problem report. Can you upload (as private file) dump of tables data to demonstrate the behaviour described?
[6 Dec 2007 4:53]
Venu Anuganti
You don't need a table data to run a explain and see the difference between the versions
[6 Dec 2007 4:59]
Valeriy Kravchuk
Yes, I saw a difference in EXPLAIN, on empty tables. But to call it a verified optimizer bug we need to prove that different plan leads to slower execution. On empty tables they are all equally good (or bad). I had also found a bit similar (but simpler) case where optimizer currently do NOT use index for ORDER BY ... LIMIT N. It is described in bug #28404. That bug is already fixed, but fix will be included in 5.1.23. So, is it possible for you to upload some data? If no, please, send the results of: SELECT COUNT(*) FROM urls u INNER JOIN urls_categories uc USING (url_id); SHOW TABLE STATUS LIKE urls; SHOW TABLE STATUS LIKE urls_categories;
[7 Dec 2007 16:54]
Valeriy Kravchuk
Verified just as described (on Solaris 10) with 5.1.22 on the data provided (see next private comment):
mysql> show table status\G
*************************** 1. row ***************************
Name: urls
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1493703
Avg_row_length: 118
Data_length: 176881664
Max_data_length: 0
Index_length: 288243712
Data_free: 0
Auto_increment: 3001829
Create_time: 2007-12-07 06:26:27
Update_time: NULL
Check_time: NULL
Collation: ujis_japanese_ci
Checksum: NULL
Create_options:
Comment:
*************************** 2. row ***************************
Name: urls_categories
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 3121309
Avg_row_length: 46
Data_length: 146440192
Max_data_length: 0
Index_length: 306184192
Data_free: 0
Auto_increment: NULL
Create_time: 2007-12-07 06:26:46
Update_time: NULL
Check_time: NULL
Collation: ujis_japanese_ci
Checksum: NULL
Create_options:
Comment:
2 rows in set (0.09 sec)
mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: u
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 1493703
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: uc
type: ref
possible_keys: PRIMARY,rd_index
key: PRIMARY
key_len: 4
ref: test.u.url_id
rows: 1
Extra:
2 rows in set (0.00 sec)
Query is executed for 1 min 23.36 sec (to get 10 rows). FORCE INDEX does NOT work:
mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE INDEX FOR ORDER BY (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10;
+----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+
| 1 | SIMPLE | u | ALL | PRIMARY | NULL | NULL | NULL | 1493703 | Using temporary; Using filesort |
| 1 | SIMPLE | uc | ref | PRIMARY,rd_index | PRIMARY | 4 | test.u.url_id | 1 | |
+----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+
2 rows in set (0.00 sec)
mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE INDEX FOR JOIN (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10;
+----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+
| 1 | SIMPLE | uc | ALL | NULL | NULL | NULL | NULL | 3121309 | Using filesort |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | test.uc.url_id | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+
2 rows in set (0.00 sec)
But actually using this index is very beneficial, as can be demonstrated by the following, a bit modified query that makes this index "covering" (and has to use STRAIGHT_JOIN hint):
mysql> EXPLAIN SELECT SQL_NO_CACHE u.*, uc.modified_on, uc.workflow, uc.category_id FROM urls_categories uc STRAIGHT_JOIN urls u WHERE uc.url_id = u.url_id ORDER BY uc.modified_on LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: uc
type: index
possible_keys: PRIMARY,rd_index
key: search_index
key_len: 6
ref: NULL
rows: 3121309
Extra: Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: u
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.uc.url_id
rows: 1
Extra:
2 rows in set (0.00 sec)
It runs amazingly fast (0.00 sec), and additional 10 table accesses to get other columns should NOT make any big difference, comparing to scanning the ENTIRE table and sorting.
[7 Dec 2007 17:03]
Valeriy Kravchuk
According to last Igor's comment at http://bugs.mysql.com/bug.php?id=28404: "Now we can use an index for ORDER BY only if 1. the index it's covering or 2. ORDER BY is used with LIMIT N and the cost of N random accesses is less than the cost of filesort. The latter is always true for small N and big tables." As we can see, item 1 "works", while item 2 is now ignored in case of join. Separate problem (#2) here is that the above comments are NOT presented at http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html or at http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html Problem #3 is that FORCE INDEX FOR ORDER BY is silently ignored.
[25 Dec 2007 15:30]
Evgeny Potemkin
Repeatable on 5.1.22. Already fixed in 5.1.23.
mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc
USING (url_id) ORDER BY uc.modified_on LIMIT 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: uc
type: index
possible_keys: PRIMARY,rd_index
key: search_index
key_len: 6
ref: NULL
rows: 10
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: u
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.uc.url_id
rows: 1
Extra:
2 rows in set (0.00 sec)
mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc F
ORCE INDEX FOR JOIN (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT
10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: uc
type: index
possible_keys: NULL
key: search_index
key_len: 6
ref: NULL
rows: 10
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: u
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: test.uc.url_id
rows: 1
Extra:
2 rows in set (0.00 sec)
mysql> select version();
+-----------------+
| version() |
+-----------------+
| 5.1.23-rc-debug |
+-----------------+
1 row in set (0.00 sec)

Description: Here is a case where the same query runs fine in about 0 secs with MySQL server 5.1.16 and the same one takes about 1m 20secs with the latest 5.1.22 version on the same box using the same hardware and configuration. How to repeat: Lets consider the following two tables urls and urls_categories… | urls | CREATE TABLE `urls` ( `url_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `url` VARBINARY(2000) NOT NULL, `url_status` ENUM(‘EXISTS’,‘DEAD’,‘N/A’) NOT NULL, `yservice_id` SMALLINT(5) UNSIGNED DEFAULT NULL, `created_by` SMALLINT(5) UNSIGNED NOT NULL, `modified_by` SMALLINT(5) UNSIGNED NOT NULL, `created_on` DATETIME NOT NULL, `modified_on` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `nocheck` TINYINT(1) NOT NULL DEFAULT ‘0′, `url_md5` CHAR(32) NOT NULL, PRIMARY KEY (`url_id`), UNIQUE KEY `url_unique` (`url_md5`), KEY `modified_on` (`modified_on`), KEY `url` (`url`(50)), KEY `yservice_id` (`yservice_id`), KEY `modified_by` (`modified_by`), KEY `created_by` (`created_by`) ) ENGINE=INNODB DEFAULT CHARSET=latin1 | | urls_categories | CREATE TABLE `urls_categories` ( `url_id` INT(10) UNSIGNED NOT NULL, `category_id` TINYINT(3) UNSIGNED NOT NULL, `workflow` ENUM(‘NEW’,‘ACTIVATED’,‘DEACTIVATED’,‘REMOVED’) NOT NULL, `searchtype_id` TINYINT(3) UNSIGNED NOT NULL, `work_id` TINYINT(3) UNSIGNED NOT NULL, `created_by` SMALLINT(5) UNSIGNED NOT NULL, `modified_by` SMALLINT(5) UNSIGNED NOT NULL, `created_on` DATETIME NOT NULL, `modified_on` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `comment` VARCHAR(200) NOT NULL, PRIMARY KEY (`url_id`,`category_id`), KEY `search_index` (`modified_on`,`workflow`,`category_id`), KEY `rd_index` (`url_id`,`category_id`,`workflow`), KEY `category_id` (`category_id`), KEY `work_id` (`work_id`), KEY `modified_by` (`modified_by`), KEY `created_by` (`created_by`), KEY `searchtype_id` (`searchtype_id`) ) ENGINE=INNODB DEFAULT CHARSET=latin1 | And here is the explain output for a simple inner join query when ran with MySQL version 5.1.16: mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: index possible_keys: PRIMARY,rd_index key: search_index key_len: 6 ref: NULL rows: 3121058 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows IN SET (0.00 sec) and here is what you get when ran with MySQL 5.1.22: mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: ALL possible_keys: PRIMARY,rd_index key: NULL key_len: NULL ref: NULL rows: 3121045 Extra: USING filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows IN SET (0.00 sec) as you can cleary see, for urls_categories table, the key search_index is missing and optimizer is not picking that. Even if I force to use the key, still no effect: mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE key(search_index,PRIMARY,rd_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: ALL possible_keys: PRIMARY,rd_index key: NULL key_len: NULL ref: NULL rows: 3121045 Extra: USING filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows IN SET (0.00 sec) mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE key(search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3121045 Extra: USING filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows IN SET (0.00 sec) As a result of the above bad index use, the query takes about 1m 20 secs with 5.1.22 where as the same one runs in about 0.0 secs as shown below.. with 5.1.22: mysql> SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10; … … 10 rows IN SET (1 min 14.55 sec) with 5.1.16: mysql> SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10; … … 10 rows IN SET (0.00 sec) Suggested fix: Optimizer to pick the right index...