Bug #27452 | Correlated subquery produces inefficient optimizer plan for INFORMATION_SCHEMA | ||
---|---|---|---|
Submitted: | 26 Mar 2007 17:35 | Modified: | 17 Nov 2008 19:38 |
Reporter: | Jay Pipes | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.0.40-BK, 5.0.24a-debian-log, 5.1.31-bzr, 6.0.7-alpha, 6.0.9-bzr | OS: | Linux |
Assigned to: | CPU Architecture: | Any | |
Tags: | I_S, performance, subqueries, subquery |
[26 Mar 2007 17:35]
Jay Pipes
[26 Mar 2007 19:12]
Valeriy Kravchuk
Thank you for a problem report. Yes, I verified that, even on latest 5.0.40-BK, query with correlated subquery runs substantially slower. But this is expected - correlated subquery is executed many times, while derived table is built only once. Until optimizer will automagically rewrite query with dependent subquery using derived table, nothing can be done. This will be eventually done, but not any time soon.
[16 Oct 2008 15:05]
Valeriy Kravchuk
Please, check if MySQL 6.0.6 solves this problem for you.
[17 Nov 2008 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[17 Nov 2008 15:22]
Jay Pipes
Problem not only still exists; it is a LOT worse. The execution time for the non-correlated subquery is ~1.5 seconds. The time for the correlated subquery is ~1 min. 22 seconds. This is repeatable. Below, the output of the EXPLAIN and the SELECTs themselves. You may notice that the output of the SELECT is actually different between the two results. This is not because the SELECTs produce different result sets, but because InnoDB provides different index statistics in each run, and therefore the selectivity % numbers vary slightly. Also note that there is an erroneous result in the Extra column of the first EXPLAIN output which says Impossible WHERE noticed, but there is still a result... Built new 6.0.7-alpha with compile-pentium-debug-max-no-ndb. Started db with: $> cd mysql-test $> ./mtr --start-and-exit Loaded databases from forge and planetmysql and a couple other things for testing and started client with: $> cd ../client $> ./mysql --user=root --port=9306 --socket=../mysql-test/var/tmp/master.sock < ~/all.db $> ./mysql --user=root --port=9306 --socket=../mysql-test/var/tmp/master.sock Output of the SQL commands and EXPLAINs is attached.
[17 Nov 2008 15:23]
Jay Pipes
Commands to reproduce and output showing problem
Attachment: bug27452.txt (text/plain), 13.45 KiB.
[17 Nov 2008 16:19]
Sergey Petrunya
Neither of the mentioned subqueries are handled by 6.0's subquery optimization. The FROM subquery should receive better handling after WL#3485 is done (this is being worked on by Evgen Potemkin, current target is 6.x) The subquery in select list should be executed better when WL#1117 is done (which we intend to do but noone is working on this ATM).
[17 Nov 2008 16:27]
Sergey Petrunya
Jay Pipes wrote: > Also note that there is an erroneous result in the Extra column of the first EXPLAIN > output which says Impossible WHERE noticed, but there is still a result... Confirm, the first EXPLAIN doesn't seem to be correct, we should analyze and fix this
[17 Nov 2008 16:31]
Sergey Petrunya
mysql> explain ... *************************** 1. row *************************** id: 1 select_type: PRIMARY table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Impossible WHERE noticed after reading const tables This seems to be repeatable on any database, including the default db created by mysql-test-run.
[17 Nov 2008 16:31]
Sergey Petrunya
I was using 6.0-bk. Valeriy, could you please check if this problem shows up on 5.x also?
[17 Nov 2008 16:35]
Sergey Petrunya
Regarding I_S, I don't see any apparent I_S processing problems. We should check if I_S tables are filled only one time when they are accessed from within a subquery.
[17 Nov 2008 19:38]
Valeriy Kravchuk
Verified with recent 6.0.9 from bzr. 5.1.31 from bzr is also affected: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.1.31-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> show tables; Empty set (0.00 sec) mysql> create table t1 (c1 int auto_increment primary key, c2 int, key(c2)); Query OK, 0 rows affected (0.02 sec) mysql> insert into t1 values(1,1); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 (c2) select rand() * 1000 from t1; Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 ... mysql> insert into t1 (c2) select rand() * 1000 from t1; Query OK, 128 rows affected (0.01 sec) Records: 128 Duplicates: 0 Warnings: 0 mysql> alter table t1 engine=InnoDB; Query OK, 256 rows affected (0.06 sec) Records: 256 Duplicates: 0 Warnings: 0 mysql> update t1 set c2=10; Query OK, 256 rows affected (0.07 sec) Rows matched: 256 Changed: 256 Warnings: 0 mysql> analyze table t1; +---------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +---------+---------+----------+----------+ | test.t1 | analyze | status | OK | +---------+---------+----------+----------+ 1 row in set (0.00 sec) mysql> SELECT t.TABLE_SCHEMA , t.TABLE_NAME , s.INDEX_NAME , s.COLUMN_NAME , s.SEQ_IN_INDEX , s2.max_columns AS `COLS_IN_INDEX` , s.CARDINALITY AS `CARD` , t.TABLE_ROWS AS `ROWS` , ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `SEL %` FROM INFORMATION_SCHEMA.STATISTICS s INNER JOIN INFORMATION_SCHEMA.TABLES t ON s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME INNER JOIN ( SELECT TABLE_SCHEMA , TABLE_NAME , INDEX_NAME , MAX(SEQ_IN_INDEX) AS max_columns FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA != 'mysql' GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME ) AS s2 ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME WHERE t.TABLE_SCHEMA != 'mysql' AND t.TABLE_ROWS > 10 AND s.CARDINALITY IS NOT NULL AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 ORDER BY `SEL %`, TABLE_SCHEMA, TABLE_NAME LIMIT 10; +--------------+------------+------------+-------------+--------------+---------------+------+------+-------+ | TABLE_SCHEMA | TABLE_NAME | INDEX_NAME | COLUMN_NAME | SEQ_IN_INDEX | COLS_IN_INDEX | CARD | ROWS | SEL % | +--------------+------------+------------+-------------+--------------+---------------+------+------+-------+ | test | t1 | c2 | c2 | 1 | 1 | 2 | 256 | 0.78 | +--------------+------------+------------+-------------+--------------+---------------+------+------+-------+ 1 row in set (0.18 sec) mysql> EXPLAIN SELECT t.TABLE_SCHEMA , t.TABLE_NAME , s.INDEX_NAME , s.COLUMN_NAME , s.SEQ_IN_INDEX , s2.max_columns AS `COLS_IN_INDEX` , s.CARDINALITY AS `CARD` , t.TABLE_ROWS AS `ROWS` , ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `SEL %` FROM INFORMATION_SCHEMA.STATISTICS s INNER JOIN INFORMATION_SCHEMA.TABLES t ON s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME INNER JOIN ( SELECT TABLE_SCHEMA , TABLE_NAME , INDEX_NAME , MAX(SEQ_IN_INDEX) AS max_columns FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA != 'mysql' GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME ) AS s2 ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME WHERE t.TABLE_SCHEMA != 'mysql' AND t.TABLE_ROWS > 10 AND s.CARDINALITY IS NOT NULL AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 ORDER BY `SEL %`, TABLE_SCHEMA, TABLE_NAME LIMIT 10\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Impossible WHERE noticed after reading const tables *************************** 2. row *************************** id: 2 select_type: DERIVED table: STATISTICS type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Using where; Open_frm_only; Scanned all databases; Using temporary; Using filesort 2 rows in set (0.01 sec)