Bug #2565 | subselect poor performance (shockingly) | ||
---|---|---|---|
Submitted: | 29 Jan 2004 17:14 | Modified: | 18 Feb 2004 3:41 |
Reporter: | Carson Battlelite | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S1 (Critical) |
Version: | 4.1.1 and above | OS: | ANY PLATFORM |
Assigned to: | Oleksandr Byelkin | CPU Architecture: | Any |
[29 Jan 2004 17:14]
Carson Battlelite
[2 Feb 2004 15:35]
Carson Battlelite
Have tested this on Windows 2000 - same performance problems
[2 Feb 2004 16:10]
Carson Battlelite
generates three data files and an sql file to load the data
Attachment: buggy.pl (text/plain), 2.21 KiB.
[2 Feb 2004 16:10]
Carson Battlelite
I grabbed 4.1.1a-alpha-nt and had the same performance problems - generated from buggy.pl I will attempt to add buggy.pl to the files so that people don't have to cut and paste
[2 Feb 2004 16:18]
Carson Battlelite
I have found that rewriting the query not to use subqueries performs just as badly i.e. SELECT count(*) FROM sess s, inf i, act a WHERE s.id = i.session_id AND i.action_id = a.id and a.code = 'GOOD'
[18 Feb 2004 3:41]
Oleksandr Byelkin
Thank you for bugreport, it was really helpful to testing subqueries performance! Of course list of constants will be faster then using tables (especially because MySQL sort constants before serching). I think subqueries should be compared with JOINs in this case to be identical. If this query will be rewriten with JOINs it will be slowly then subquery. I have tested following 2 queries: SELECT count(*) FROM sess s, inf i WHERE s.id = i.session_id AND i.action_id IN (SELECT id FROM act WHERE code = 'GOOD'); results are (sec.) 7.43, 9.04, 8.33, 8.07 SELECT count(*) FROM sess s, inf i, act WHERE s.id = i.session_id AND i.action_id = act.id and act.code = 'GOOD' results are (sec.) 9.01, 9.58, 9.66, 10.24 i.e. subquery is faster then join in this case (because of special optimisation for subqueries like "value IN (SELECT key_field FROM single_table WHERE some_conditions)" because such subqueries are widely used. In some of 5.x version we plan to add optimisation for IN subqueries with hash tables and in this case it will be about same performance as constant list. But you can't compare constant list with subquery, because: 1. subqery you can use everywere, constant list only if you completely sure that query with constants fit in maximum length of command 2. for subqueries all tables locked at one moment (or it will be done alwais as one transaction), for variant with constant and autocommit mode you can get inconsistent results, because tables can be changed between 2 queries (it is not important for you application as I understand but it is still argument :)