| Bug #83470 | Reverse scan on a partitioned table does ICP check incorrectly, causing slowdown | ||
|---|---|---|---|
| Submitted: | 20 Oct 2016 18:24 | Modified: | 16 May 2017 2:53 |
| Reporter: | Sergey Petrunya | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Partitions | Severity: | S3 (Non-critical) |
| Version: | 5.7.16-debug-log | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | ICP | ||
[20 Oct 2016 18:46]
Sergey Petrunya
Reproducible on 5.7.16, too.
[21 Oct 2016 5:47]
MySQL Verification Team
Hello Sergey, Thank you for the report and test case. Thanks, Umesh
[21 Oct 2016 17:25]
Sergey Petrunya
The failure scenario is:
Put a breakpoint in QUICK_SELECT_DESC::get_next
Observe execution to reach this call:
file->set_end_range(&min_range, handler::RANGE_SCAN_DESC);
Put a hardware watchpoint on file->end_range
and see:
Hardware watchpoint 4: *$a
Old value = (key_range *) 0x7fff9401b4e8
New value = (key_range *) 0x0
(gdb) wher
#0 Partition_helper::ph_index_read_map (this=0x7fff9401b6a0, buf=0x7fff9401b820 "\377\200", key=0x7fff9401f678 "\231\217N", keypart_map=1, find_flag=HA_READ_BEFORE_KEY) at /home/psergey/dev-git/mysql-5.7/sql/partitioning/partition_handler.cc:2499
#1 0x0000000001ab3c8f in ha_innopart::index_read_map (this=0x7fff9401b3b0, buf=0x7fff9401b820 "\377\200", key=0x7fff9401f678 "\231\217N", keypart_map=1, find_flag=HA_READ_BEFORE_KEY) at /home/psergey/dev-git/mysql-5.7/storage/innobase/handler/ha_innopart.h:1299
#2 0x0000000000f7a13f in handler::ha_index_read_map (this=0x7fff9401b3b0, buf=0x7fff9401b820 "\377\200", key=0x7fff9401f678 "\231\217N", keypart_map=1, find_flag=HA_READ_BEFORE_KEY) at /home/psergey/dev-git/mysql-5.7/sql/handler.cc:2998
#3 0x00000000017e50c8 in QUICK_SELECT_DESC::get_next (this=0x7fff94031da0) at /home/psergey/dev-git/mysql-5.7/sql/opt_range.cc:11495
So, Partition_helper that clears end_range, and then ICP check walks off the
end of the range.
[21 Oct 2016 17:49]
Sergey Petrunya
Looking at how handler::end_range is set or cleared.
It is set when reading ranges:
- read_range_first() sets it
- then read_range_first() and read_range_next() use it.
- ICP code also uses it
It is set by QUICK_SELECT_DESC:
- QUICK_SELECT_DESC::get_next sets it
- then it calls index_read_map() or index_last(), followed by index_prev()
- ICP code inside these calls use it.
It is set by Partition_helper::ph_read_range_first
- it sets m_handler->end_key
- Then the code in Partition_helper::handle_unordered_scan_next_partition
uses it.
Where is handler::end_range cleared?
- it is done in handler::ha_index_end(), handler::ha_index_init() and
other [de]initialization calls.
- within an index scan, it is not cleared:
- if the scan uses read_range_XXX functions, each read_range_first
call will set end_range accordingly.
- if the scan is a reverse scan done by QUICK_SELECT_DESC, then
QUICK_SELECT_DESC code will call set_end_range() for each new range.
[21 Oct 2016 17:50]
Sergey Petrunya
Considering the above, there seems to be absolutely no reason to have m_handler->end_range= NULL; assignment inside these functions: - Partition_helper::ph_index_read_last_map - Partition_helper::ph_index_first - Partition_helper::ph_index_read_map
[25 Oct 2016 14:27]
Sergey Petrunya
Suggested patch
Attachment: bug83470.diff (text/x-patch), 4.83 KiB.
[25 Oct 2016 14:29]
Sergey Petrunya
Please find the patch attached. I'm offering it under BSD licence.
[21 Dec 2016 11:00]
MySQL Verification Team
see also http://bugs.mysql.com/bug.php?id=84107
[16 May 2017 2:53]
Jon Stephens
Documented fix in the 5.7.19 changelog, as follows:
When a query performed a reverse-ordered range scan (to satisfy
ORDER BY ... DESC) using index condition pushdown on a
partitioned InnoDB table, it could take an unreasonably long
time to finish. This was due to the fact that the condition
pushdown check failed to get the bounds of the range, so that
the scan continued to read index tuples until it reached the
first value in the index.
Closed.

Description: When a query - reads a partitioned InnoDB table - does a reverse-ordered range scan (to satisfy ORDER BY DESC) - uses Index Condition Pushdown it may take an unreasonably long time to finish. The reason is that ICP check fails to check for the range bound. The scan walks out of the range being scanned and continues to read index tuples until it reaches the first value in the index. How to repeat: create table ten(a int); insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); create table t1 ( date_col datetime not null, col1 int, filler varchar(255), key(date_col, col1) ) partition by range columns(date_col) ( partition p0 values less than ('2014-01-01 00:00:00'), partition p1 values less than ('2015-01-01 00:00:00'), partition p2 values less than ('2016-01-01 00:00:00') ); insert into t1 select date_add('2013-01-01', interval (A.a+10*B.a) day), 123456, repeat('filler-data', 5) from ten A, ten B, ten C; insert into t1 select date_add('2014-01-01', interval (A.a+10*B.a) day), 123456, repeat('filler-data', 5) from ten A, ten B, ten C; ## This will return 10 rows select * from t1 force index(date_col) where date_col >= date('2013-04-06') AND date_col < date('2013-04-06' + interval 1 day); ## ## Now, add an index condition that counts how many times it was checked ## set @cnt=1; select * from t1 force index(date_col) where col1 + 10000000 > (@cnt:=@cnt+1) and date_col >= date('2013-04-06') AND date_col < date('2013-04-06' + interval 1 day) ORDER BY date_col DESC; select @cnt; +------+ | @cnt | +------+ | 961 | +------+ And we see that index condition was checked 961 times, even if we just read 10 rows. EXPLAIN: MySQL [j15]> explain select * from t1 force index(date_col) where col1 + 10000000 > (@cnt:=@cnt+1) and date_col >= date('2013-04-06') AND date_col < date('2013-04-06' + interval 1 day) ORDER BY date_col DESC; +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | t1 | p0 | range | date_col | date_col | 5 | NULL | 10 | 100.00 | Using index condition | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ 1 row in set, 1 warning (0.00 sec)