Bug #21262 | Subquery hangs server with mysqld using 100% cpu - see also bug #18765 | ||
---|---|---|---|
Submitted: | 24 Jul 2006 23:35 | Modified: | 1 Sep 2006 18:35 |
Reporter: | Harry Clauson | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S2 (Serious) |
Version: | 5.0.22 | OS: | Linux (Linux) |
Assigned to: | Paul DuBois | CPU Architecture: | Any |
Tags: | subquery slow hang |
[24 Jul 2006 23:35]
Harry Clauson
[25 Jul 2006 22:21]
Sveta Smirnova
Thank you for the report. Verified on Linux using last BK sources as reporter described.
[31 Aug 2006 15:48]
Timour Katchaounov
Hi Harry, Is there any particular reason you first add a lot of (14000) rows, and then delete most of them leaving 4000 rows. How is that different from directly inserting just 4000 rows?
[31 Aug 2006 17:44]
Harry Clauson
You are correct in that it shouldn't make any difference - but due to this bug it does. I observe that the time the query "hangs" is longer when there are a lot of deleted rows, and shorter if there are less. This leads me to believe that the query is doing a physical scan of every row in the table (including deleted ones). However the query will independently run faster or slower depending on how many non-deleted rows there are, even with the same no. of deleted rows. If I run the test with half the no. of deleted rows, the query is twice as fast. But if I run the test with the same no. of deleted rows but half the non-deleted rows, the query also runs twice as fast. The two combine such that a large no. of non-deleted rows causes the query to run so long that it would appear to be an infinite loop - but it's not. What this seems to indicate is that the behavior of the query involves not just the no. of rows in total, but the product of the no. of rows and the no. of undeleted rows. This is as if it is doing something like one full table scan per undeleted row! I hope this helps you.
[1 Sep 2006 13:18]
Georgi Kodinov
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php Additional info: The query : select * from locx where loc= (select ( if(@s:=119, (select loc from locx where store=@s), null)) as loc); is slow because the subquery contained in it : select ( if(@s:=119, (select loc from locx where store=@s), null)) as loc executes once for each row of the outer context. This subquery is compiled as UNCACHEABLE SUBQUERY. This is evident from the EXPLAIN : explain select * from locx lo where loc= (select (if(@s:=119,(select loc from locx where store=@s),null)) as loc); -------------- explain select * from locx lo where loc= (select (if(@s:=119,(select loc from locx where store=@s),null)) as loc) -------------- +----+----------------------+-------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+----------------------+-------+------+---------------+------+---------+------+------+-------------+ | 1 | PRIMARY | lo | ALL | NULL | NULL | NULL | NULL | 400 | Using where | | 3 | UNCACHEABLE SUBQUERY | locx | ALL | NULL | NULL | NULL | NULL | 400 | Using where | +----+----------------------+-------+------+---------------+------+---------+------+------+-------------+ "UNCACHEABLE SUBQUERY" here means the subquery is re-evaluated for each row of the outer context. This differs from how correlated subqueries (see Section 13.2.8.7. of the manual : http://dev.mysql.com/doc/refman/5.0/en/correlated-subqueries.html) are executed e.g: mysql> explain select * from locx lo where loc= (select (if(@s:=119,(select loc from locx where store=lo.store),null)) as loc); -------------- explain select * from locx lo where loc= (select (if(@s:=119,(select loc from locx where store=lo.store),null)) as loc) -------------- +----+--------------------+-------+--------+---------------+---------+---------+---------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+--------+---------------+---------+---------+---------------+------+-------------+ | 1 | PRIMARY | lo | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | Using where | | 3 | DEPENDENT SUBQUERY | locx | eq_ref | store | store | 4 | test.lo.store | 1 | | +----+--------------------+-------+--------+---------------+---------+---------+---------------+------+-------------+ A "DEPENDENT SUBQUERY" is re-evaluated only once for each set of different values of the variables from its outer context. Cacheability of subqueries is subject to the restrictions detailed in section 5.14.1. of the manual (http://dev.mysql.com/doc/refman/5.0/en/query-cache-how.html). Quoting user variables is one such restriction.
[1 Sep 2006 18:35]
Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products. Adding note to EXPLAIN section of manual: DEPENDENT SUBQUERY evaluation differs from UNCACHEABLE SUBQUERY evaluation. For DEPENDENT SUBQUERY, the subquery is re-evaluated only once for each set of different values of the variables from its outer context. For UNCACHEABLE SUBQUERY, the subquery is re-evaluated for each row of the outer context. Cacheability of subqueries is subject to the restrictions detailed in (query-cache-how). For example, referring to user variables makes a subquery uncacheable.