| Bug #14253 | Optimizer uses partial indexes for subqueries, where it should not | ||
|---|---|---|---|
| Submitted: | 24 Oct 2005 9:26 | Modified: | 31 Dec 2009 12:44 |
| Reporter: | Domas Mituzas | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 5.0, 4.1, 6.0.14-bzr | OS: | Linux (Linux) |
| Assigned to: | CPU Architecture: | Any | |
[25 Oct 2005 16:03]
Domas Mituzas
SELECT a FROM t1 WHERE EXISTS (SELECT b FROM t2 WHERE a=b AND c IN (10,20))
Attachment: 14253.gz (application/x-gzip, text), 193.87 KiB.
[9 Sep 2006 21:23]
Igor Babaev
The same deficiency of the current optimizer can be demonstrated with just a join: mysql> EXPLAIN SELECT a FROM t1,t2 WHERE a=b AND c IN (10,20); +----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10002 | | | 1 | SIMPLE | t2 | ref | b | b | 5 | test.t1.a | 11 | Using where; Using index | +----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+ 2 rows in set (0.00 sec) We need here a combination of the 'ref' access method for the first component and the 'range' access method for the second component of the index. Yet this would require a new serious development.
[31 Dec 2009 12:44]
Valeriy Kravchuk
This problem is still repeatable with recent 6.0.14 from bzr:
77-52-7-73:6.0-codebase openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 6.0.14-alpha-debug Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> EXPLAIN SELECT a FROM t1,t2 WHERE a=b AND c IN (10,20);
+----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+
| 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 10002 | Using where |
| 1 | SIMPLE | t2 | ref | b | b | 5 | test.t1.a | 11 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-----------+-------+--------------------------+
2 rows in set (0.00 sec)
mysql> show create table t2\G
*************************** 1. row ***************************
Table: t2
Create Table: CREATE TABLE `t2` (
`b` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL,
KEY `b` (`b`,`c`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Description: Optimizer chooses to use partial indexes in subquery, if range conditions exist, that suits only some values. Overall performance is degraded multiple times because of this decision. How to repeat: SELECT 1 FROM t1 WHERE EXISTS ( SELECT 1 FROM t2 WHERE t1.m_id=t2.m_id AND t2.a = 2 AND b IN (1,2,3) ) *************************** 1. row *************************** id: 1 select_type: PRIMARY table: t1 type: index possible_keys: NULL key: pk key_len: 8 ref: NULL rows: 10000 Extra: Using where; Using index *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: t2 type: ref possible_keys: m1_a_b key: m1_a_b key_len: 6 ref: t1.m_id.m_id,const rows: 3 Extra: Using where; Using index 2 rows in set (0.00 sec) Though, if we supply ref for 'b', then we get full index used and much better performance: SELECT 1 FROM t1 WHERE EXISTS ( SELECT 1 FROM t2 WHERE t1.m_id=t2.m_id AND t2.a = 2 AND b IN (1) ) *************************** 1. row *************************** id: 1 select_type: PRIMARY table: t1 type: index possible_keys: NULL key: pk key_len: 8 ref: NULL rows: 10000 Extra: Using where; Using index *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: t2 type: ref possible_keys: m1_a_b key: m1_a_b key_len: 9 ref: t1.m_id.m_id,const,const rows: 3 Extra: Using where; Using index 2 rows in set (0.00 sec) Suggested fix: Use full index for range scans in subqueries...