| Bug #52660 | Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT | ||
|---|---|---|---|
| Submitted: | 7 Apr 2010 13:22 | Modified: | 23 Nov 2010 3:00 |
| Reporter: | Olav Sandstå | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 6.0.14 (mysql-6.0-codebase) | OS: | Any |
| Assigned to: | Olav Sandstå | CPU Architecture: | Any |
| Tags: | index_condition_pushdown, optimizer_switch | ||
[9 Apr 2010 13:21]
Valeriy Kravchuk
Verified just as described with recent mysql-6.0-codebase on 32-biut Linux: ... mysql> SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2; Empty set (34.91 sec) mysql> set session optimizer_switch='index_condition_pushdown=OFF'; Query OK, 0 rows affected (0.09 sec) mysql> SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2; Empty set (0.04 sec) mysql> select version(); +--------------------+ | version() | +--------------------+ | 6.0.14-alpha-debug | +--------------------+ 1 row in set (0.02 sec)
[8 Oct 2010 12:30]
Olav Sandstå
The explain for the above query gives the following:
EXPLAIN
SELECT c1,c3 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t3 range c1 c1 12 NULL 2 Using index condition; Using where
This query will be executed as a range scan using the index (c1(6),i1)
with upper and lower limit as:
c1 >= 'c-1004=w' and c1 <= 'c-1006=w'
It will also use ICP and push down the condition "i1 > 2".
Since ICP is in use, the storage engine will also evalute the range
condition as part of the ICP evaluation. So for each index entry
evaluate the following is evaluated:
1. Range condition: c1 <= 'c-1006=w'
2. ICP condition: i1 > 2
What is special here is that c1 is of type TINYTEXT. TINYTEXT is a
Blob-type in MyISAM. Blob fields are given some special handling when
reading index entries. For reading index entries the following
function (in mi_key.c) is used:
static int _mi_put_key_in_record(register MI_INFO *info, uint keynr,
my_bool unpack_blobs, uchar *record)
The unpack_blobs argument decides whether blob fields should be read
or skipped. When ICP is used this function is called like this:
int mi_check_index_cond(register MI_INFO *info, uint keynr, uchar *record)
{
if (_mi_put_key_in_record(info, keynr, FALSE, record))
{
mi_print_error(info->s, HA_ERR_CRASHED);
my_errno=HA_ERR_CRASHED;
return -1;
}
return info->index_cond_func(info->index_cond_func_arg);
}
so in this case we do not read the TINYTEXT field from the index. But
we will still evaluate the range condition. This evaluation will be
done against "whatever" the record buffer contains for this field. In
this case it will contain the values which c1 had when doing the
initial index position when searching for the start of the range. So
all range evaluations will do the following comparision:
'c-1004=w' <= 'c-1006=w'
ie. evaluate to TRUE. Since the ICP condition is FALSE for all
records in the database the ICP code will fail to detect that we are
reading out of the seach range. As a result all index entries for the
entire table will be read instead of just a few.
[16 Oct 2010 10:04]
Olav Sandstå
The change to not read BLOB fields from an index entry during executing of pushed index condition in MyISAM was introduced in a bug fix for Bug#42048 "Discrepancy between MyISAM and Maria's Index condition pushdown implementation". The change set has revision id sergefp@mysql.com-20090125165907-e3veaevraj2i924g and has the following comment: "Add a parameter to _m{i,a}_put_key_in_record() which tells it not to unpack blobs because unpacking blobs from index condition check function modifies MyISAM/Maria's internal state in a way that's not expected by the rest of its code."
[19 Oct 2010 12:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/121134 3264 Olav.Sandstaa@sun.com 2010-10-19 Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT When an index condition is pushed down to MyISAM the content of (most of) the index entry will be read into the record buffer. This will be used for evaluating the index condition and the range check in case of a range scan. When reading the content of the index entry all fields will be read with the exception of BLOB fields. This is not a problem for evaluating the index condition itself since the server will not push down an index condition that contains a BLOB field. But it is a problem for the range evaluation that is performed as part of the ICP implementation in MyISAM since this might be done using a "non-read" BLOB value. A simplified version of the query given in the bug report: SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2; it will be done as a range scan and following happens : 1. MyISAM locates the first record satisfying the lower range and this record will be read into the record buffer. So after this the content of the record buffer for the BLOB field will have the value: "c-1004=w" 2. When the server requests the next record in the range MyISAM will due to ICP being used read in the index entry (but not including the BLOB field), so when the end of range check is done it will evaluate the upper range value (c1 <= 'c-1006=w') against the value of the first record read (ie. it will evaluate to true) and since the ICP condition (i1 > 1) evaluates to false it will skip this record and go to the next record. This creates a loop that instead of stopping at the upper range value will loop until it has read the complete index (or until the ICP condition evaluates to true). So in the example in the bug report about 10 million index entries will be read and evaluated instead of just 3 or 4. The fix for this is to let MyISAM reject pushed index conditions whenever the index used for ICP contains a BLOB field. Note that this change disables ICP from being used by MyISAM in some cases where it actually was safe to use it. The pushed index condition will be reject in all cases where the index contains a BLOB field also when the BLOB field is not used in any range evaluation. The reason for this is at the time the server pushes the index condition MyISAM does not know whether the actually data access will be a range scan with an upper range value or not. The patch contains a simplified version of the original test case. This test case only contains 100 records which is too little to show any performance regression. The important point with the test case is that without the fix the explain will show "Using index condition" while with the fix it will say "Using where". @ mysql-test/include/icp_tests.inc Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/null_key_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/null_key_icp.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_icp_mrr.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ storage/myisam/ha_myisam.cc Change the ICP implementation for MyISAM to reject any pushed index conditions if the index contains a BLOB field. The reason for this change is that MyISAM does not read in BLOB fields from the index entry as part of the ICP evaluation. If it is a range scan where the BLOB field is used as part of the upper range limit then the range evaluation will be done against a previously read BLOB value and evaluate to true. As a consequence MyISAM will fail to detect that it has reached the end of a range interval and continue to read the complete index.
[1 Nov 2010 14:25]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/122437 3272 Olav.Sandstaa@sun.com 2010-11-01 Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT When an index condition is pushed down to MyISAM the content of (most of) the index entry will be read into the record buffer. This will be used for evaluating the index condition and the range check in case of a range scan. When reading the content of the index entry all fields will be read with the exception of BLOB fields. This is not a problem for evaluating the index condition itself since the server will not push down an index condition that contains a BLOB field. But it is a problem for the range evaluation that is performed as part of the ICP implementation in MyISAM since this might be done using a "non-read" BLOB value. A simplified version of the query given in the bug report: SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2; it will be done as a range scan and following happens : 1. MyISAM locates the first record satisfying the lower range and this record will be read into the record buffer. So after this the content of the record buffer for the BLOB field will have the value: "c-1004=w" 2. When the server requests the next record in the range MyISAM will due to ICP being used read in the index entry (but not including the BLOB field), so when the end of range check is done it will evaluate the upper range value (c1 <= 'c-1006=w') against the value of the first record read (ie. it will evaluate to true) and since the ICP condition (i1 > 1) evaluates to false it will skip this record and go to the next record. This creates a loop that instead of stopping at the upper range value will loop until it has read the complete index (or until the ICP condition evaluates to true). So in the example in the bug report about 10 million index entries will be read and evaluated instead of just 3 or 4. The fix for this is to let MyISAM reject pushed index conditions whenever the index used for ICP contains a BLOB field. Note that this change disables ICP from being used by MyISAM in some cases where it actually was safe to use it. The pushed index condition will be reject in all cases where the index contains a BLOB field also when the BLOB field is not used in any range evaluation. The reason for this is at the time the server pushes the index condition MyISAM does not know whether the actually data access will be a range scan with an upper range value or not. The patch contains a simplified version of the original test case. This test case only contains 100 records which is too little to show any performance regression. The important point with the test case is that without the fix the explain will show "Using index condition" while with the fix it will say "Using where". @ mysql-test/include/icp_tests.inc Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/null_key_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/null_key_icp.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_icp_mrr.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ sql/handler.h Add documentation for the handler::idx_cond_push() method. @ storage/myisam/ha_myisam.cc Change the ICP implementation for MyISAM to reject any pushed index conditions if the index contains a BLOB field. The reason for this change is that MyISAM does not read in BLOB fields from the index entry as part of the ICP evaluation. If it is a range scan where the BLOB field is used as part of the upper range limit then the range evaluation will be done against a previously read BLOB value and evaluate to true. As a consequence MyISAM will fail to detect that it has reached the end of a range interval and continue to read the complete index.
[2 Nov 2010 18:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/122580 3273 Olav.Sandstaa@sun.com 2010-11-02 Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT When an index condition is pushed down to MyISAM the content of (most of) the index entry will be read into the record buffer. This will be used for evaluating the index condition and the range check in case of a range scan. When reading the content of the index entry all fields will be read with the exception of BLOB fields. This is not a problem for evaluating the index condition itself since the server will not push down an index condition that contains a BLOB field. But it is a problem for the range evaluation that is performed as part of the ICP implementation in MyISAM since this might be done using a "non-read" BLOB value. A simplified version of the query given in the bug report: SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2; it will be done as a range scan and following happens : 1. MyISAM locates the first record satisfying the lower range and this record will be read into the record buffer. So after this the content of the record buffer for the BLOB field will have the value: "c-1004=w" 2. When the server requests the next record in the range MyISAM will due to ICP being used read in the index entry (but not including the BLOB field), so when the end of range check is done it will evaluate the upper range value (c1 <= 'c-1006=w') against the value of the first record read (ie. it will evaluate to true) and since the ICP condition (i1 > 1) evaluates to false it will skip this record and go to the next record. This creates a loop that instead of stopping at the upper range value will loop until it has read the complete index (or until the ICP condition evaluates to true). So in the example in the bug report about 10 million index entries will be read and evaluated instead of just 3 or 4. The fix for this is to let MyISAM reject pushed index conditions whenever the index used for ICP contains a BLOB field. Note that this change disables ICP from being used by MyISAM in some cases where it actually was safe to use it. The pushed index condition will be reject in all cases where the index contains a BLOB field also when the BLOB field is not used in any range evaluation. The reason for this is at the time the server pushes the index condition MyISAM does not know whether the actually data access will be a range scan with an upper range value or not. The patch contains a simplified version of the original test case. This test case only contains 100 records which is too little to show any performance regression. The important point with the test case is that without the fix the explain will show "Using index condition" while with the fix it will say "Using where". @ mysql-test/include/icp_tests.inc Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/innodb_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_all.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/myisam_icp_none.result Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT. @ mysql-test/r/null_key_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/null_key_icp.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_all.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ mysql-test/r/order_by_icp_mrr.result Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT". With this fix ICP will no longer be used for MyISAM if the index for the pushed index condition contains a BLOB field. @ sql/handler.h Add documentation for the handler::idx_cond_push() method. @ storage/myisam/ha_myisam.cc Change the ICP implementation for MyISAM to reject any pushed index conditions if the index contains a BLOB field. The reason for this change is that MyISAM does not read in BLOB fields from the index entry as part of the ICP evaluation. If it is a range scan where the BLOB field is used as part of the upper range limit then the range evaluation will be done against a previously read BLOB value and evaluate to true. As a consequence MyISAM will fail to detect that it has reached the end of a range interval and continue to read the complete index.
[2 Nov 2010 20:04]
Olav Sandstå
Patch pushed to mysql-next-mr-opt-backporting with revision id: olav.sandstaa@sun.com-20101102184747-qfuntqwj021imy9r.
[13 Nov 2010 16:17]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (merge vers: 5.6.99-m5) (pib:21)
[13 Nov 2010 16:30]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[23 Nov 2010 3:00]
Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.

Description: Index condition pushdown can in some cases lead to increased response times for range queries against MyISAM tables if the index used for the range query contains a TEXT field. In these cases MyISAM will scan the entire index instead of the interval given by the range. Example: The following query (see table definition further down): SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2; where c1 is of type TINYTEXT will be executed as a range query with a start and end of the range to be "c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w'". If ICP is enabled the condition "i1 > 2" will be pushed down. Executing this query with ICP disable will result in MyISAM reading the entries given by the range interval. With ICP enable MyISAM might in some cases read the entire index. For this particular query this can increase the execution time from about 0 seconds to 10 seconds for a table containing 10 million entries (see details in the "how to repeat" section). How to repeat: # Create a table (t3) containing 10 million records CREATE TABLE t1 (a INT); INSERT INTO t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9); CREATE TABLE t2 (a INT); INSERT INTO t2 SELECT A.a + 10*(B.a + 10*(C.a + 10*(D.a + 10*(E.a + 10*(F.a + 10*G.a))))) FROM t1 A, t1 B, t1 C, t1 D, t1 E, t1 F, t1 G; CREATE TABLE t3 ( c1 TINYTEXT NOT NULL, c2 CHAR(30) NOT NULL, c3 CHAR(200) NOT NULL, i1 INT NOT NULL, PRIMARY KEY (c2), KEY (c1(6),i1) ); INSERT INTO t3 SELECT CONCAT('c-',10000000+t2.a,'=w'), CONCAT('c-',10000000+ t2.a,'=w'), 'filler', 1 FROM t2; # Run the following query and look at the time to complete it: SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2; # Disable Index Condition Pushdown: set session optimizer_switch='index_condition_pushdown=OFF'; # Run the same query again and look at the time to complete it: SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2; On my machine this query takes about 10 seconds to complete when ICP is enabled and about 0 seconds to complete when ICP is disabled