Bug #7080 | Optimizer choose incorrect index | ||
---|---|---|---|
Submitted: | 7 Dec 2004 15:23 | Modified: | 29 Jan 2005 6:37 |
Reporter: | Victoria Reznichenko | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 4.0 | OS: | Linux (Linux) |
Assigned to: | CPU Architecture: | Any | |
Tags: | cost model |
[7 Dec 2004 15:23]
Victoria Reznichenko
[14 Dec 2004 17:15]
MySQL Verification Team
This is the output og the first query: mysql> explain select distinct table1.col4,table1.col3,table1.col5,table1.col1 -> from table1 where ((table1.col1 = 15992) and table1.col3 in(3951,3953,3956,3957,3958,3959,3960,3961,3962,3964,3965,3966,3967,3971,3974,3975,3977,3979,3981,3984,3987,3988,3990,3991,3992,3993,3994,3997,3999,4000,4001,4004,4005,4007,4009,4010,4011,4012,4013,4014,4015,4016,4017,4018,4019,4020,4021,4022,4023,4024,4025,4026,4027,4028)) and table1.col2 = 254 order by table1.col4; +--------+-------+----------------------------------+---------+---------+------+------+----------------------------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------+-------+----------------------------------+---------+---------+------+------+----------------------------------------------+ | table1 | range | PRIMARY,ind0,ind1,ind2,ind3,ind4 | PRIMARY | 10 | NULL | 54 | Using where; Using temporary; Using filesort | +--------+-------+----------------------------------+---------+---------+------+------+----------------------------------------------+ 1 row in set (0.01 sec) PRIMARY key is used. This query doesn't use PRIMARY, it uses ind3 key. mysql> explain select distinct table1.col4,table1.col3,table1.col5,table1.col1 -> from table1 where ((table1.col1 in (15992, 15991)) and table1.col3 in(3951,3953,3956,3957,3958,3959,3960,3961,3962,3964,3965,3966,3967,3971,3974,3975,3977,3979,3981,3984,3987,3988,3990,3991,3992,3993,3994,3997,3999,4000,4001,4004,4005,4007,4009,4010,4011,4012,4013,4014,4015,4016,4017,4018,4019,4020,4021,4022,4023,4024,4025,4026,4027,4028)) and table1.col2 = 254 order by table1.col4; +--------+------+----------------------------------+------+---------+-------+------+----------------------------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------+------+----------------------------------+------+---------+-------+------+----------------------------------------------+ | table1 | ref | PRIMARY,ind0,ind1,ind2,ind3,ind4 | ind3 | 2 | const | 1 | Using where; Using temporary; Using filesort | +--------+------+----------------------------------+------+---------+-------+------+----------------------------------------------+ 1 row in set (0.02 sec)
[14 Dec 2004 17:22]
MySQL Verification Team
script to populate the table
Attachment: insert.pl (application/octet-stream, text), 395 bytes.
[14 Dec 2004 17:23]
MySQL Verification Team
table1.sql has CREATE TABLE statement. insert.pl - perl script to populate table with test data.
[26 Dec 2004 8:53]
Igor Babaev
After having run the command ANALYZE table table1 and getting statistics on indexes, the optimizer returned the following plan: mysql> explain select distinct table1.col4,table1.col3,table1.col5,table1.col1 -> from table1 where ((table1.col1 in (15992, 15991)) and table1.col3 -> in(3951,3953,3956,3957,3958,3959,3960,3961,3962,3964,3965,3966,3967,3971,3974, -> 3975,3977,3979,3981,3984,3987,3988,3990,3991,3992,3993,3994,3997,3999,4000,4001, -> 4004,4005,4007,4009,4010,4011,4012,4013,4014,4015,4016,4017,4018,4019,4020,4021, -> 4022,4023,4024,4025,4026,4027,4028)) and table1.col2 = 254 order by table1.col4; +--------+------+----------------------------------+------+---------+-------+------+-----------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------+------+----------------------------------+------+---------+-------+------+-----------------------------+ | table1 | ref | PRIMARY,ind0,ind1,ind2,ind3,ind4 | ind2 | 2 | const | 382 | Using where; Using filesort | +--------+------+----------------------------------+------+---------+-------+------+-----------------------------+ The execution by this plan took me 0.09 sec. In fact it's worse than the execution by the plan using the primary index. So the choice of the optimizer is not the best, but it happens only because it's very hard to take into account the impact of using OS input/output caches. Currently the optimizer just ignores them. The the future virsions we are going to employ a more subtle cost model for the optimizer.
[29 Jan 2005 6:37]
Igor Babaev
The reported choice of the optimizer is due to some inconsistencies of the cost model currently employed by it. This model makes him prefer access by index ind2 to multiple lookups by primary keys. We are going to introduce the new cost model in version 5.1.