Bug #41871 | Select + Index + Limit + offset on large MyIsam tables give very bad performance | ||
---|---|---|---|
Submitted: | 5 Jan 2009 15:47 | Modified: | 2 Oct 2009 7:27 |
Reporter: | Bouvet Arnaud | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: MyISAM storage engine | Severity: | S5 (Performance) |
Version: | 5.1.30 | OS: | Any (MS Windows, Mac OS X) |
Assigned to: | CPU Architecture: | Any | |
Tags: | select MyIsam index offset performance |
[5 Jan 2009 15:47]
Bouvet Arnaud
[6 Jan 2009 10:09]
Valeriy Kravchuk
Thank you for a problem report. Please, execute the following: show session status like 'Handler_read%'; before the original query and after it. Then do the same before the query with IGNORE INDEX and after it. Send the results.
[6 Jan 2009 10:39]
Bouvet Arnaud
Please find below results of the manipulation: mysql> show session status like 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 0 | | Handler_read_next | 0 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+-------+ 6 rows in set (0.02 sec) mysql> select * from cdr where cnx<6000000 limit 10000000,10; +----------+---------+---------+--------------------+-------------------+------+----------+ | autonum | cnx | pdp | imsi | imei | cell | cdr_hour | +----------+---------+---------+--------------------+-------------------+------+----------+ | 10000001 | 3000001 | 1000001 | NULL | NULL | 6713 | 12 | ... etc ... | 10000010 | 3000004 | 1000002 | NULL | NULL | 5187 | 12 | +----------+---------+---------+--------------------+-------------------+------+----------+ 10 rows in set (1 min 19.36 sec) mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 10000009 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+----------+ 6 rows in set (0.00 sec) --> With a the same connection mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 10000009 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+----------+ 6 rows in set (0.00 sec) mysql> select * from cdr ignore index (idx1) where cnx<6000000 limit 10000000,10; +----------+---------+---------+--------------------+-------------------+------+----------+ | autonum | cnx | pdp | imsi | imei | cell | cdr_hour | +----------+---------+---------+--------------------+-------------------+------+----------+ | 10000001 | 3000001 | 1000001 | NULL | NULL | 6713 | 12 | ... etc ... | 10000010 | 3000004 | 1000002 | NULL | NULL | 5187 | 12 | +----------+---------+---------+--------------------+-------------------+------+----------+ 10 rows in set (1.61 sec) mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 10000009 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 10000010 | +-----------------------+----------+ 6 rows in set (0.00 sec) --> With a new connection mysql> connect su_perf_oneindex; Connection id: 354 Current database: su_perf_oneindex mysql> show session status like 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 0 | | Handler_read_next | 0 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+-------+ 6 rows in set (0.00 sec) mysql> select * from cdr ignore index (idx1) where cnx<6000000 limit 10000000,10; +----------+---------+---------+--------------------+-------------------+------+----------+ | autonum | cnx | pdp | imsi | imei | cell | cdr_hour | +----------+---------+---------+--------------------+-------------------+------+----------+ | 10000001 | 3000001 | 1000001 | NULL | NULL | 6713 | 12 | ... etc ... | 10000010 | 3000004 | 1000002 | NULL | NULL | 5187 | 12 | +----------+---------+---------+--------------------+-------------------+------+----------+ 10 rows in set (1.59 sec) mysql> mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 0 | | Handler_read_next | 0 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 10000010 | +-----------------------+----------+ 6 rows in set (0.00 sec) mysql>
[6 Jan 2009 11:18]
Valeriy Kravchuk
Please, send also the results of: select count(*) from cdr; But even without them we can see that both statements had to read 1000010 rows. In the first case it was read of index to find the first record that satisfied condition and then scanning next 1000009 index records, with the need to read the entire row from data file (you use SELECT * !) each time, and each time from a new position. So, you had, besides reading indexes from key buffer, to read individual data row in some specific location, 1000010 times. In the second care you just read data rows one by one, and it happened so that all first 1000010 rows satisfied condition. Sequential read of data file is much faster. Hence the result. Optimizer just does NOT know that every row it will read sequentially will satisfy condition. I see no bug here.
[14 Jan 2009 9:24]
Bouvet Arnaud
Thanks for your response. The cdr table contains: mysql> select count(*) from cdr; +----------+ | count(*) | +----------+ | 20000000 | +----------+ 1 row in set (0.00 sec) I understand your explanation. And I can understand that the read of index file and then datas in table takes more time than the read of only datas in table. But I don't understand a
[14 Jan 2009 9:35]
Bouvet Arnaud
Sorry, I've submitted my answer too fast. I don't understand a such difference for performances if I use or not index with offset. For example, if I retrieve only one row after offset, index file is only read once. And then MySQL makes a sequential read, no? See the results of this example: mysql> select * from cdr where cnx<6000000 limit 10000000,1; +----------+---------+---------+--------------------+-------------------+------+----------+ | autonum | cnx | pdp | imsi | imei | cell | cdr_hour | +----------+---------+---------+--------------------+-------------------+------+----------+ | 10000001 | 3000001 | 1000001 | NUll| NUll| 6713 | 12 | +----------+---------+---------+--------------------+-------------------+------+----------+ 1 row in set (2 min 24.43 sec) mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_next | 10000000 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+----------+ 6 rows in set (0.05 sec) mysql> select * from cdr ignore index (idx1) where cnx<6000000 limit 10000000,1; +----------+---------+---------+--------------------+-------------------+------+----------+ | autonum | cnx | pdp | imsi | imei | cell | cdr_hour | +----------+---------+---------+--------------------+-------------------+------+----------+ | 10000001 | 3000001 | 1000001 | NUll| NUll| 6713 | 12 | +----------+---------+---------+--------------------+-------------------+------+----------+ 1 row in set (1.70 sec) mysql> show session status like 'Handler_read%'; +-----------------------+----------+ | Variable_name | Value | +-----------------------+----------+ | Handler_read_first | 0 | | Handler_read_key | 0 | | Handler_read_next | 0 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 10000001 | +-----------------------+----------+ 6 rows in set (0.00 sec) How can I configure the MySQL server to decrease this difference?
[28 Sep 2009 19:55]
Sveta Smirnova
Thank you for the feedback. Verified as described.
[2 Oct 2009 7:27]
Sveta Smirnova
Steps to repeat. 1. Load dump from hidden comment. 2. Run query select * from cdr where cnx<6000000 limit 10000000,1; 3. Run query select * from cdr ignore index (idx1) where cnx<6000000 limit 10000000,1;
[14 Aug 2018 16:25]
Benjamin Tepolt
Good to see all the lively discussion here! Thank you for posting, bouvet Arnaud. I just wanted to add that this small problem has been addressed in dozens and dozens of blogs since 2009, and that it is almost a verifiable, cottage industry. Just look at this stackoverflow question with 100+ upvotes, https://stackoverflow.com/questions/4481388/why-does-mysql-higher-limit-offset-slow-the-qu... , "Why does MYSQL higher LIMIT offset slow the query down?" And yet, everyone has a solution, which are always identical. Change: SELECT * FROM .... LIMIT $X, $Y;. And instead use a subquery: SELECT * FROM ... WHERE id IN (SELECT id FROM ... LIMIT $X, $Y);. The simplicity of this fix gives me the impression that it ought to be simply built into the engine itself.