Bug #33030 | MySQL 5.1 query optimizer regression | ||
---|---|---|---|
Submitted: | 6 Dec 2007 4:07 | Modified: | 25 Dec 2007 15:30 |
Reporter: | Venu Anuganti | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.1.16 and latest 5.1.22 | OS: | Any |
Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
Tags: | regression |
[6 Dec 2007 4:07]
Venu Anuganti
[6 Dec 2007 4:40]
Valeriy Kravchuk
Thank you for a problem report. Can you upload (as private file) dump of tables data to demonstrate the behaviour described?
[6 Dec 2007 4:53]
Venu Anuganti
You don't need a table data to run a explain and see the difference between the versions
[6 Dec 2007 4:59]
Valeriy Kravchuk
Yes, I saw a difference in EXPLAIN, on empty tables. But to call it a verified optimizer bug we need to prove that different plan leads to slower execution. On empty tables they are all equally good (or bad). I had also found a bit similar (but simpler) case where optimizer currently do NOT use index for ORDER BY ... LIMIT N. It is described in bug #28404. That bug is already fixed, but fix will be included in 5.1.23. So, is it possible for you to upload some data? If no, please, send the results of: SELECT COUNT(*) FROM urls u INNER JOIN urls_categories uc USING (url_id); SHOW TABLE STATUS LIKE urls; SHOW TABLE STATUS LIKE urls_categories;
[7 Dec 2007 16:54]
Valeriy Kravchuk
Verified just as described (on Solaris 10) with 5.1.22 on the data provided (see next private comment): mysql> show table status\G *************************** 1. row *************************** Name: urls Engine: InnoDB Version: 10 Row_format: Compact Rows: 1493703 Avg_row_length: 118 Data_length: 176881664 Max_data_length: 0 Index_length: 288243712 Data_free: 0 Auto_increment: 3001829 Create_time: 2007-12-07 06:26:27 Update_time: NULL Check_time: NULL Collation: ujis_japanese_ci Checksum: NULL Create_options: Comment: *************************** 2. row *************************** Name: urls_categories Engine: InnoDB Version: 10 Row_format: Compact Rows: 3121309 Avg_row_length: 46 Data_length: 146440192 Max_data_length: 0 Index_length: 306184192 Data_free: 0 Auto_increment: NULL Create_time: 2007-12-07 06:26:46 Update_time: NULL Check_time: NULL Collation: ujis_japanese_ci Checksum: NULL Create_options: Comment: 2 rows in set (0.09 sec) mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: u type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 1493703 Extra: Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: uc type: ref possible_keys: PRIMARY,rd_index key: PRIMARY key_len: 4 ref: test.u.url_id rows: 1 Extra: 2 rows in set (0.00 sec) Query is executed for 1 min 23.36 sec (to get 10 rows). FORCE INDEX does NOT work: mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE INDEX FOR ORDER BY (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10; +----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+ | 1 | SIMPLE | u | ALL | PRIMARY | NULL | NULL | NULL | 1493703 | Using temporary; Using filesort | | 1 | SIMPLE | uc | ref | PRIMARY,rd_index | PRIMARY | 4 | test.u.url_id | 1 | | +----+-------------+-------+------+------------------+---------+---------+---------------+---------+---------------------------------+ 2 rows in set (0.00 sec) mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc FORCE INDEX FOR JOIN (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10; +----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+ | 1 | SIMPLE | uc | ALL | NULL | NULL | NULL | NULL | 3121309 | Using filesort | | 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | test.uc.url_id | 1 | | +----+-------------+-------+--------+---------------+---------+---------+----------------+---------+----------------+ 2 rows in set (0.00 sec) But actually using this index is very beneficial, as can be demonstrated by the following, a bit modified query that makes this index "covering" (and has to use STRAIGHT_JOIN hint): mysql> EXPLAIN SELECT SQL_NO_CACHE u.*, uc.modified_on, uc.workflow, uc.category_id FROM urls_categories uc STRAIGHT_JOIN urls u WHERE uc.url_id = u.url_id ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: index possible_keys: PRIMARY,rd_index key: search_index key_len: 6 ref: NULL rows: 3121309 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows in set (0.00 sec) It runs amazingly fast (0.00 sec), and additional 10 table accesses to get other columns should NOT make any big difference, comparing to scanning the ENTIRE table and sorting.
[7 Dec 2007 17:03]
Valeriy Kravchuk
According to last Igor's comment at http://bugs.mysql.com/bug.php?id=28404: "Now we can use an index for ORDER BY only if 1. the index it's covering or 2. ORDER BY is used with LIMIT N and the cost of N random accesses is less than the cost of filesort. The latter is always true for small N and big tables." As we can see, item 1 "works", while item 2 is now ignored in case of join. Separate problem (#2) here is that the above comments are NOT presented at http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html or at http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html Problem #3 is that FORCE INDEX FOR ORDER BY is silently ignored.
[25 Dec 2007 15:30]
Evgeny Potemkin
Repeatable on 5.1.22. Already fixed in 5.1.23. mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: index possible_keys: PRIMARY,rd_index key: search_index key_len: 6 ref: NULL rows: 10 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows in set (0.00 sec) mysql> EXPLAIN SELECT SQL_NO_CACHE * FROM urls u INNER JOIN urls_categories uc F ORCE INDEX FOR JOIN (search_index) USING (url_id) ORDER BY uc.modified_on LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: uc type: index possible_keys: NULL key: search_index key_len: 6 ref: NULL rows: 10 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: u type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: test.uc.url_id rows: 1 Extra: 2 rows in set (0.00 sec) mysql> select version(); +-----------------+ | version() | +-----------------+ | 5.1.23-rc-debug | +-----------------+ 1 row in set (0.00 sec)