Bug #71189 | Manual does not provide enough details on how loose index scan really works | ||
---|---|---|---|
Submitted: | 20 Dec 2013 14:56 | Modified: | 20 Dec 2013 20:02 |
Reporter: | Valeriy Kravchuk | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S3 (Non-critical) |
Version: | 5.5, any | OS: | Any |
Assigned to: | Assigned Account | CPU Architecture: | Any |
Tags: | GROUP BY, loose index scan, manual, Optimizer |
[20 Dec 2013 14:56]
Valeriy Kravchuk
[20 Dec 2013 14:58]
Valeriy Kravchuk
tgb.sql dump to load
Attachment: tgb.sql (application/octet-stream, text), 112.05 KiB.
[20 Dec 2013 16:40]
Valeriy Kravchuk
More things to explain. Load attached file: C:\Program Files\MySQL\MySQL Server 5.6\bin>mysql -uroot -proot -P3312 test <p:\ percona\tgb.sql Warning: Using a password on the command line interface can be insecure. and then do this: C:\Program Files\MySQL\MySQL Server 5.6\bin>mysql -uroot -proot -P3312 test Warning: Using a password on the command line interface can be insecure. Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 19 Server version: 5.5.33 MySQL Community Server (GPL) Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> delete from tgb where id1 > 1000 and id1 <>6783; Query OK, 7118 rows affected (0.39 sec) mysql> select count(*) from tgb where id1=6783; +----------+ | count(*) | +----------+ | 204 | +----------+ 1 row in set (0.04 sec) mysql> select count(*) from tgb; +----------+ | count(*) | +----------+ | 1074 | +----------+ 1 row in set (0.02 sec) mysql> select count(distinct id1) from tgb; +---------------------+ | count(distinct id1) | +---------------------+ | 583 | +---------------------+ 1 row in set (0.07 sec) mysql> analyze table tgb; +----------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------+---------+----------+----------+ | test.tgb | analyze | status | OK | +----------+---------+----------+----------+ 1 row in set (0.00 sec) mysql> explain select sql_no_cache id1, max(id2) from tgb where id1=6783 and id2 <7000 group by id1; +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ | 1 | SIMPLE | tgb | range | PRIMARY | PRIMARY | 8 | NULL | 140 | Using where; Using index | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ 1 row in set (0.04 sec) So, here tight index scan is used, and it's expensive enough (check). Now: mysql> delete from tgb where id1 > 100 and id1 <>6783; Query OK, 784 rows affected (0.10 sec) mysql> select count(*) from tgb where id1=6783; +----------+ | count(*) | +----------+ | 204 | +----------+ 1 row in set (0.04 sec) mysql> select count(*) from tgb; +----------+ | count(*) | +----------+ | 290 | +----------+ 1 row in set (0.03 sec) mysql> select count(distinct id1) from tgb; +---------------------+ | count(distinct id1) | +---------------------+ | 63 | +---------------------+ 1 row in set (0.06 sec) The data we are selecting are the same, just table got smaller, with less different id1 values. Now: mysql> explain select sql_no_cache id1, max(id2) from tgb where id1=6783 and id2 <7000 group by id1; +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ | 1 | SIMPLE | tgb | range | PRIMARY | PRIMARY | 8 | NULL | 140 | Using where; Using index | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+--------------------------+ 1 row in set (0.00 sec) Still loose index scan is NOT used mysql> alter table tgb engine=InnoDB; Query OK, 290 rows affected (0.68 sec) Records: 290 Duplicates: 0 Warnings: 0 (or just ANALYZE TABLE to get some new statistics), and: mysql> explain select sql_no_cache id1, max(id2) from tgb where id1=6783 and id2 <7000 group by id1; +----+-------------+-------+-------+---------------+---------+---------+------+- -----+---------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+---------------------------------------+ | 1 | SIMPLE | tgb | range | PRIMARY | PRIMARY | 8 | NULL | 70 | Using where; Using index for group-by | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+---------------------------------------+ 1 row in set (0.04 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select sql_no_cache id1, max(id2) from tgb where id1=6783 and id2<7000 gr oup by id1; +------+----------+ | id1 | max(id2) | +------+----------+ | 6783 | 6918 | +------+----------+ 1 row in set (0.00 sec) mysql> show session status like 'Handler_%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 1 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_prepare | 0 | | Handler_read_first | 0 | | Handler_read_key | 4 | | Handler_read_last | 1 | | Handler_read_next | 0 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_update | 0 | | Handler_write | 0 | +----------------------------+-------+ 16 rows in set (0.04 sec) Now loose index scan is used and it's super efficient. Ideally manual should explain both Handler_% values above and reasons why the same loose index scan path was NOT used initially.
[20 Dec 2013 20:02]
Sveta Smirnova
Thank you for the reasonable documentation request.