Bug #904 query jumped from 4 seconds to several minutes, join order changed in explain
Submitted: 23 Jul 2003 20:53 Modified: 7 Aug 2003 22:21
Reporter: Shane Allen Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: MyISAM storage engine Severity:S1 (Critical)
Version:4.0.14 OS:Linux (debian linux)
Assigned to: Igor Babaev CPU Architecture:Any

[23 Jul 2003 20:53] Shane Allen
Description:
4.0.13:
mysql> explain SELECT rcd.uid, SUM(ua.t_amount)/COUNT(l.listing_id) AS payments, COUNT(l.listing_id)/COUNT(ua.t_amount) AS listing_count FROM reg_common_data AS rcd LEFT JOIN user_account AS ua ON (ua.uid=rcd.uid) LEFT JOIN listing AS l ON (l.seller_uid=rcd.uid) LEFT JOIN featured_sellers AS fs ON (rcd.uid=fs.uid) WHERE rcd.suspended IS NULL AND ua.uid IS NOT NULL AND l.listing_id IS NOT NULL AND l.status = 'active' AND ua.t_type = 'credit' AND ua.payment_type IN ('mo', 'check', 'cc') AND ua.t_time > 20030623000000 AND fs.uid IS NULL GROUP BY uid HAVING listing_count > 0 ORDER BY payments DESC LIMIT 20;
+-------+--------+---------------+------------+---------+---------+-------+----------------------------------------------+
| table | type   | possible_keys | key        | key_len | ref     | rows  | Extra                                        |
+-------+--------+---------------+------------+---------+---------+-------+----------------------------------------------+
| rcd   | index  | NULL          | PRIMARY    |       4 | NULL    | 96722 | Using where; Using temporary; Using filesort |
| ua    | ref    | uid           | uid        |       4 | rcd.uid |    15 | Using where                                  |
| l     | ref    | seller_uid    | seller_uid |       4 | rcd.uid |     8 | Using where                                  |
| fs    | eq_ref | PRIMARY       | PRIMARY    |       4 | rcd.uid |     1 | Using where; Using index; Not exists         |
+-------+--------+---------------+------------+---------+---------+-------+----------------------------------------------+
4 rows in set (0.00 sec)     

mysql> SELECT rcd.uid, SUM(ua.t_amount)/COUNT(l.listing_id) AS payments, COUNT(l.listing_id)/COUNT(ua.t_amount) AS listing_count FROM reg_common_data AS rcd LEFT JOIN user_account AS ua ON (ua.uid=rcd.uid) LEFT JOIN listing AS l ON (l.seller_uid=rcd.uid) LEFT JOIN featured_sellers AS fs ON (rcd.uid=fs.uid) WHERE rcd.suspended IS NULL AND ua.uid IS NOT NULL AND l.listing_id IS NOT NULL AND l.status = 'active' AND ua.t_type = 'credit' AND ua.payment_type IN ('mo', 'check', 'cc') AND ua.t_time > 20030623000000 AND fs.uid IS NULL GROUP BY uid HAVING listing_count > 0 ORDER BY payments DESC LIMIT 20;

<snip results>

20 rows in set (4.71 sec)

4.0.14:
mysql> explain SELECT rcd.uid, SUM(ua.t_amount)/COUNT(l.listing_id) AS payments, COUNT(l.listing_id)/COUNT(ua.t_amount) AS listing_count FROM reg_common_data AS rcd LEFT JOIN user_account AS ua ON (ua.uid=rcd.uid) LEFT JOIN listing AS l ON (l.seller_uid=rcd.uid) LEFT JOIN featured_sellers AS fs ON (rcd.uid=fs.uid) WHERE rcd.suspended IS NULL AND ua.uid IS NOT NULL AND l.listing_id IS NOT NULL AND l.status = 'active' AND ua.t_type = 'credit' AND ua.payment_type IN ('mo', 'check', 'cc') AND ua.t_time > 20030623000000 AND fs.uid IS NULL GROUP BY uid HAVING listing_count > 0 ORDER BY payments DESC LIMIT 20;
+-------+--------+---------------------------+---------+---------+--------------+-------+----------------------------------------------+
| table | type   | possible_keys             | key     | key_len | ref          | rows  | Extra                                        |
+-------+--------+---------------------------+---------+---------+--------------+-------+----------------------------------------------+
| l     | ref    | PRIMARY,status,seller_uid | status  |       1 | const        | 67825 | Using where; Using temporary; Using filesort |
| rcd   | eq_ref | PRIMARY                   | PRIMARY |       4 | l.seller_uid |     1 | Using where                                  |
| ua    | ref    | uid                       | uid     |       4 | rcd.uid      |    16 | Using where                                  |
| fs    | eq_ref | PRIMARY                   | PRIMARY |       4 | rcd.uid      |     1 | Using where; Using index; Not exists         |
+-------+--------+---------------------------+---------+---------+--------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)

running the select statement results in a query that sits in the 'Copying to temp table' state for a lot of minutes, at which point I eventually kill it so I can continue working.

I have run optimize and analyze on all tables in the query, and tested after each optimize and again after the analyze statement, and it did not affect the explain.

Note: the 4.0.13 explain statement comes from our live databases, which accounts for the differences in row counts; if such a difference would potentially cause the difference in explain statements, then I can try re-installing 4.0.13 in our dev environment -- the problem, however, did rear its head after the upgrade, so I am inclined to think the differences are not sufficient to affect query speed; if anything, dev should be faster than production, no?

In case it helps, I have verified that the tables in dev and production describe identically.

How to repeat:
If desired, I can attach copies of dev databases to the ticket; they will come to about 100MB total.
[25 Jul 2003 7:52] Indrek Siitan
Yes, it would be really helpful if you could provide the data.

The bugs database doesn't support uploading files over 1MB in size,
so you need to either make the files available on your WWW/FTP site
and let us know of the location (mail to tfr@mysql.com) or upload
them to our FTP at:

  http://support.mysql.com/pub/mysql/secret

Thanks.
[28 Jul 2003 11:22] Shane Allen
I will be sending an email to the address provided earlier with the location of the download. FYI, I have been unable to contact support.mysql.com via FTP for the weekend; it is refusing connections, and making an http request (as directed in the previous comment on this ticket) generates a 404.
[29 Jul 2003 4:34] Indrek Siitan
Thanks for the data - I have been able to verify the behaviour.
[6 Aug 2003 18:12] Shane Allen
Do you guys by any chance have a timeline for finding and/or fixing this bug? We're waiting on a production rollout, because we don't know how many other places on our site this bug could affect.

If you don't yet have a timeline, please let me know, so that we can try to code around the bug (we need to get some of the 4.0.14 bugfixes rolled out). If you have any details at all with regards to what causes it, it would be helpful for us to know them should we need to identify other problem spots on our site.

Thanks
[7 Aug 2003 22:21] Igor Babaev
This is not a bug.  
Due to the latest changes in the optimizer an outer join can be converted to  
an inner join. It happens when the where condition effectively filters out  
NULL rows of the inner table. The regular search for the best execution plan  
is applied to the result of conversion. The optimizer choice is based on the  
statistics it can collect. Unfortunately it can not use statistics on  
non-indexed column, because it's not collected now. That's why even when we 
have a condition with a good selectivity (a restrictive condition) the 
optimizer doesn't take it into account at all. So, if the condition imposes 
restrictions for the columns of an inner table of the execution plan it is 
checked only when rows for outer tables are already fetched and in most cases 
uselessly.  
To resolve this problem it's recommended to create indexes for columns  
used in restrictive conditions.   
When I created an index for the column t_time in the table user_account I  
immediately got a good execution plan for the query:  
+-------+--------+---------------------------+------------+---------+---------+------+----------------------------------------------+  
| table | type   | possible_keys             | key        | key_len | ref      
| rows | Extra                                        |  
+-------+--------+---------------------------+------------+---------+---------+------+----------------------------------------------+  
| ua    | range  | uid,t_time                | t_time     |       9 | NULL     
| 1112 | Using where; Using temporary; Using filesort |  
| rcd   | eq_ref | PRIMARY                   | PRIMARY    |       3 | ua.uid   
|    1 | Using where                                  |  
| l     | ref    | PRIMARY,status,seller_uid | seller_uid |       4 | rcd.uid  
|   10 | Using where                                  |  
| fs    | eq_ref | PRIMARY                   | PRIMARY    |       4 | rcd.uid  
|    1 | Using where; Using index; Not exists         |  
+-------+--------+---------------------------+------------+---------+---------+------+----------------------------------------------+  
4 rows in set (0.00 sec)  
  
With this plan the execution took practically no time.