Bug #111367 | OFFSET 99 LIMIT 1 on a compound PK takes ages to execute. | |
---|---|---|---|
Submitted: | 12 Jun 2023 15:36 | Modified: | 4 Aug 2023 15:09 |
Reporter: | Jean-François Gagné | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 8.0 | OS: | Any |
Assigned to: | | CPU Architecture: | Any
[12 Jun 2023 15:36]
Jean-François Gagné
[12 Jun 2023 15:38]
Jean-François Gagné
Updated Severity from S2 / Serious to S5 / Performance.
[12 Jun 2023 15:38]
Jean-François Gagné
Optimizer trace for the slow query.
Attachment: mysql_8.0.33_optimizer_trace.txt (text/plain), 655 bytes.
[12 Jun 2023 15:39]
Jean-François Gagné
Again updating Severity from S2 / Serious to S5 / Performance (the previous change did not work, probably an error on my side).
[12 Jun 2023 16:06]
Jean-François Gagné
I made a mistake in uploading the 1st optimizer trace file, so I am uploading a new one.
[12 Jun 2023 16:06]
Jean-François Gagné
Optimizer trace for the slow query (new file after error with the 1st).
Attachment: mysql_8.0.33_optimizer_trace2.txt (text/plain), 31.26 KiB.
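(Side note, not part of the original attachment: a typical way to capture such a trace for a single statement on 8.0, in case someone wants to reproduce it locally, is sketched below. The statement is the one from this report; the rest is standard optimizer-trace usage and was not specified in the comment above.)

SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576;  -- enlarge if the trace reports missing bytes
EXPLAIN SELECT portal_id, id FROM file_metadata
  WHERE (portal_id > 2459647) OR (portal_id = 2459647 AND id > 1459)
  ORDER BY portal_id ASC, id ASC LIMIT 1 OFFSET 99;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE\G
SET optimizer_trace = 'enabled=off';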
[12 Jun 2023 18:07]
Jean-François Gagné
I reproduced with 5.7.42.

mysql> select /*+ MAX_EXECUTION_TIME(5000) */ portal_id, id from file_metadata where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
ERROR 3024 (HY000): Query execution was interrupted, maximum statement execution time exceeded

mysql> select portal_id, id from file_metadata ignore index(file_updated,file_deleted,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility,file_md5) where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
portal_id: 2459647
       id: 1589
1 row in set (0.00 sec)

mysql> show global variables like 'optimizer_switch'\G
*************************** 1. row ***************************
Variable_name: optimizer_switch
        Value: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,prefer_ordering_index=on
1 row in set (0.00 sec)

mysql> explain select portal_id, id from file_metadata where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: index
possible_keys: PRIMARY,file_updated,file_deleted,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility
          key: PRIMARY
      key_len: 12
          ref: NULL
         rows: 100
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> explain select portal_id, id from file_metadata ignore index(file_updated,file_deleted,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility,file_md5) where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 12
          ref: NULL
         rows: 472568938
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)
[13 Jun 2023 12:09]
MySQL Verification Team
Bonjour Mr. Jean-Francois, Thank you so much for your bug report. However, we must be able to repeat it. The only thing missing is the data for the table. You can upload (gzipped, bzipped or zipped) mysqldump output from that table to this bug. You can do it by clicking on the "Files" tab. If a file is too large, follow the instructions in that tab and let us know the file name that you have uploaded to our SFTP site. Thanks in advance. As soon as we repeat your entire test case, we shall change the status of this report.
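(A minimal sketch of producing such a compressed dump, not part of the comment above; the schema name appdb is hypothetical and the options are only one common choice:)

mysqldump --single-transaction --quick appdb file_metadata | gzip > file_metadata.sql.gz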
[1 Aug 2023 19:11]
Jean-François Gagné
Adding 8.0.34 as affected. More interesting details below.

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.34    |
+-----------+
1 row in set (0.00 sec)

The data shape might be a little weird: there are 304.121.139 rows for portal_id = 2459647, 39.685.307 rows for portal_id < 2459647, and 263.763.208 rows for portal_id > 2459647. In the output below, dots are added for readability.

mysql> select count(*) from file_metadata where portal_id < 2459647; select count(*) from file_metadata where portal_id = 2459647; select count(*) from file_metadata where portal_id > 2459647;
+----------+
| count(*) |
+----------+
| 39.685.307 |
+----------+
1 row in set (46.13 sec)

+-----------+
| count(*)  |
+-----------+
| 304.121.139 |
+-----------+
1 row in set (3 min 19.44 sec)

+-----------+
| count(*)  |
+-----------+
| 263.763.208 |
+-----------+
1 row in set (3 min 58.73 sec)

As suggested by someone on MySQL Community Slack, I split the query in two, removing the OR, and both queries give the expected explain and run fast:

mysql> explain select portal_id, id from file_metadata where ( (portal_id > 2459647) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: range
possible_keys: PRIMARY,file_updated,folder_name_deleted,file_deleted,cloud_key,name_extension_deleted,file_md5,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 236284469
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.01 sec)

mysql> explain select portal_id, id from file_metadata where ( ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: range
possible_keys: PRIMARY,file_updated,folder_name_deleted,file_deleted,cloud_key,name_extension_deleted,file_md5,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility
          key: PRIMARY
      key_len: 12
          ref: NULL
         rows: 236284469
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.03 sec)

mysql> select portal_id, id from file_metadata where ( (portal_id > 2459647) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
portal_id: 2459963
       id: 27669456507
1 row in set (0.02 sec)

mysql> select portal_id, id from file_metadata where ( ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
portal_id: 2459647
       id: 1589
1 row in set (0.00 sec)

Meanwhile, the original query and its tuple version still show a full scan in the explain, and still run slowly:

mysql> explain select portal_id, id from file_metadata where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: index
possible_keys: PRIMARY,file_updated,file_deleted,file_name,type_name_deleted_updated,archive_hidden,frequent_search,frequent_search2,files_by_type_and_visibility
          key: PRIMARY
      key_len: 12
          ref: NULL
         rows: 100
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> explain select portal_id, id from file_metadata where (portal_id, id) > (2459647, 1459) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: file_metadata
   partitions: NULL
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 12
          ref: NULL
         rows: 100
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> select portal_id, id from file_metadata where ( (portal_id > 2459647) or ( ((portal_id = 2459647)) AND (id > 1459) ) ) order by portal_id asc, id asc limit 1 offset 99\G select portal_id, id from file_metadata where (portal_id, id) > (2459647, 1459) order by portal_id asc, id asc limit 1 offset 99\G
*************************** 1. row ***************************
portal_id: 2459647
       id: 1589
1 row in set (24 min 39.77 sec)

*************************** 1. row ***************************
portal_id: 2459647
       id: 1589
1 row in set (23 min 36.93 sec)
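(Side note, not part of the original comment: since the two split queries above are fast, one possible application-side workaround is to combine them with UNION ALL and re-apply the ORDER BY / LIMIT on the outside. This is only a sketch based on the queries above, not something verified on this dataset.)

select portal_id, id from (
  (select portal_id, id from file_metadata
     where portal_id = 2459647 and id > 1459
     order by portal_id, id limit 100)
  union all
  (select portal_id, id from file_metadata
     where portal_id > 2459647
     order by portal_id, id limit 100)
) u order by portal_id, id limit 1 offset 99;

Each branch only needs the first 100 rows of its primary-key range, so the outer sort, offset and limit work on at most 200 rows.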
[2 Aug 2023 12:28]
MySQL Verification Team
Bonjour, Thank you, Mr. Gagné, for your feedback.
[4 Aug 2023 15:01]
Jean-François Gagné
I was able to reproduce this on an easily constructed dataset. The instructions below are for 8.0.34, but I get similar results with 8.1.0 and 5.7.43, so I am adding these to the affected versions.

I am also changing the severity from S5 / Performance to S2 / Serious. My reasoning is that this behavior is a serious bug, as it prevents iterating / paginating on an index (or a primary key in this case). Not being able to do this is a serious impairment, and it might even qualify as Critical in the case where such iteration is needed to retrieve data from a multi-GB/TB table (not being able to retrieve data from a db is a critical problem).

The crux of the reproduction is in the function below. It is dense, so a few comments explain what it does.

# This function takes 3 arguments: (k)ey, (c)ount and (v)alue.
# The function clears and creates a schema / database (to allow reruns).
# Then, it creates a table with a compound primary key (p,i) and a third column (v).
# The next bits fill the table:
#   the p column takes values 0 to c,
#   for all but the v value of p, there is a single row with i = 1,
#   for p = v, there are c rows with values of i from 1 to c.
# So this table contains 2c rows: c rows with p = v, and c rows with other values of p.
# The table is filled in such a way that every INSERT puts 1000 rows in the table, to avoid
# this being too slow (| paste | awk).
# Once the table is filled, and depending on the value of the 1st argument, we create a key
# on (p,v). As shown below, the bug is only present when there is a key.
# Then, we grow each row by adding char columns, and do an ALTER FORCE to inflate the table
# (without the ALTER FORCE, instant add column keeps the size of the table small).
# After inflating the table, we run an explain on a query paginating in p = v and extract
# the type of scan: we expect a range scan, and if we get index, then we have our bug.
# The many "| pv -t" are a way to see the progress of a potentially long operation
# (filling the table, adding the key, and ALTER FORCE).
function t() {
  local k=$1 c=$2 v=$3
  ./use <<< "drop database if exists test_jfg; create database test_jfg"
  ./use test_jfg <<< "
    create table t(
      p int not null, i bigint not null, v tinyint not null default 0,
      PRIMARY KEY (p,i))"
  { seq -f "(%.0f,1)" 0 $(($v-1)); seq -f "($v,%.0f)" 1 $c; seq -f "(%.0f,1)" $(($v+1)) $c; } |
    paste -s -d "$(printf ',%.0s' {1..1000})\n" |
    awk '{print "INSERT INTO t(p,i) VALUES" $0 ";"}' |
    ./use test_jfg | pv -t
  test $k -eq 0 ||
    ./use test_jfg <<< "alter table t add key (p,v)" | pv -t
  { echo "ALTER TABLE t ADD COLUMN c0 CHAR(240) DEFAULT ''";
    seq -f " ADD COLUMN c%.0f CHAR(240) DEFAULT ''" 1 15; } |
    paste -s -d "," | ./use test_jfg
  ./use test_jfg <<< "ALTER TABLE t FORCE" | pv -t
  local q="select p, i from t where (p = $v and i > 1) or p > $v order by p, i limit 1 offset 99"
  ./use test_jfg <<< "explain $q\G" | awk '$1 == "type:"{print $2}'
}

The reproduction is below. Many thanks for looking into this,

Jean-François Gagné

# Create a sandbox.
dbdeployer deploy single mysql_8.0.34

# With the below, we get an index scan, which shows that we reproduce the bug (more obvious below).
t 1 3000 1000
0:00:00
0:00:00
0:00:05
index

# Without a key, we do not reproduce the bug (we get a quick range scan).
t 0 3000 1000
0:00:00
0:00:06
range

# With a different value of v, we do not get the bug either.
t 1 3000 2500
0:00:00
0:00:00
0:00:06
range

# The above reproduces the bug with only 6000 rows in the table, which does not make a query very slow,
# so let's put 20M rows in the table: this obviously takes more time to load, but we get our bug with a full table scan.
t 1 $((10*1000*1000)) $((5*1000*1000))
0:02:49
0:00:46
1:26:55
index

# Now that we have a lot of data in the table, let's run a paginated query on p = 5M and on the next value of p.
# We see the bad plan, and the fact that the query doing a full scan calls into the storage engine far too often and takes a lot of time.
for i in 5000000 5000001; do
  q="select p, i from t where (p = $i and i > 1) or p > $i order by p, i limit 1 offset 99"
  echo; echo $q
  ./use test_jfg -e "explain $q"
  ./use test_jfg -e "flush status; $q; show status like 'Handler_read_%'"
  time ./use test_jfg -e "$q" > /dev/null
done

select p, i from t where (p = 5000000 and i > 1) or p > 5000000 order by p, i limit 1 offset 99
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | t     | NULL       | index | PRIMARY,p     | PRIMARY | 12      | NULL |  100 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
+---------+-----+
| p       | i   |
+---------+-----+
| 5000000 | 101 |
+---------+-----+
+-----------------------+---------+
| Variable_name         | Value   |
+-----------------------+---------+
| Handler_read_first    | 1       |
| Handler_read_key      | 1       |
| Handler_read_last     | 0       |
| Handler_read_next     | 5000100 |
| Handler_read_prev     | 0       |
| Handler_read_rnd      | 0       |
| Handler_read_rnd_next | 0       |
+-----------------------+---------+

real	0m9.998s
user	0m0.004s
sys	0m0.002s

select p, i from t where (p = 5000001 and i > 1) or p > 5000001 order by p, i limit 1 offset 99
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows    | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+--------------------------+
|  1 | SIMPLE      | t     | NULL       | range | PRIMARY,p     | PRIMARY | 12      | NULL | 6667596 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+---------+----------+--------------------------+
+---------+---+
| p       | i |
+---------+---+
| 5000101 | 1 |
+---------+---+
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 2     |
| Handler_read_last     | 0     |
| Handler_read_next     | 99    |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+

real	0m0.006s
user	0m0.005s
sys	0m0.000s
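(Side note, not part of the original comment: since the reproduction only shows the bad plan when the secondary key on (p,v) exists, and since the 12 Jun 5.7.42 test above was fast with IGNORE INDEX on the secondary keys, a possible cross-check on the repro table is to ignore that key. MySQL normally names an index created with "add key (p,v)" after its first column, i.e. p, but verify the actual name with SHOW CREATE TABLE. This is only a sketch and was not run as part of this report.)

-- confirm the secondary index name (expected: p)
show create table t\G
-- if ignoring the key behaves like the k=0 run above, type should come back as range
explain select p, i from t ignore index (p) where (p = 5000000 and i > 1) or p > 5000000 order by p, i limit 1 offset 99\G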
[4 Aug 2023 15:09]
MySQL Verification Team
Hi Mr. Gagné, We were able to reproduce your report. This is now a verified bug report. It is affecting both 8.0 and 8.1.