Bug #52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT
Submitted: 7 Apr 2010 13:22 Modified: 23 Nov 2010 3:00
Reporter: Olav Sandstå Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:6.0.14 (mysql-6.0-codebase) OS:Any
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: index_condition_pushdown, optimizer_switch

[7 Apr 2010 13:22] Olav Sandstå
Description:
Index condition pushdown can in some cases lead to increased response times for range queries against MyISAM tables if the index used for the range query contains a TEXT field. In these cases MyISAM will scan the entire index instead of the interval given by the range. 

Example: The following query (see table definition further down):

SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2;

where c1 is of type TINYTEXT will be executed as a range query with a start and end of the range to be "c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w'". If ICP is enabled the condition "i1 > 2" will be pushed down.

Executing this query with ICP disable will result in MyISAM reading the entries given by the range interval. With ICP enable MyISAM might in some cases read the entire index. For this particular query this can increase the execution time from about 0 seconds to 10 seconds for a table containing 10 million entries (see details in the "how to repeat" section).

How to repeat:
# Create a table (t3) containing 10 million records

CREATE TABLE t1 (a INT);
INSERT INTO t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE TABLE t2 (a INT);
INSERT INTO t2 SELECT A.a + 10*(B.a + 10*(C.a + 10*(D.a + 10*(E.a + 10*(F.a + 10*G.a))))) FROM t1 A, t1 B, t1 C, t1 D, t1 E, t1 F, t1 G;

CREATE TABLE t3 (
  c1 TINYTEXT NOT NULL,
  c2 CHAR(30) NOT NULL,
  c3 CHAR(200) NOT NULL,
  i1 INT NOT NULL,
  PRIMARY KEY (c2),
  KEY (c1(6),i1)
);

INSERT INTO t3 
  SELECT CONCAT('c-',10000000+t2.a,'=w'), CONCAT('c-',10000000+ t2.a,'=w'), 'filler', 1
  FROM t2;

# Run the following query and look at the time to complete it:

SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2;

# Disable Index Condition Pushdown:

set session optimizer_switch='index_condition_pushdown=OFF';

# Run the same query again and look at the time to complete it:

SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2;

On my machine this query takes about 10 seconds to complete when ICP is enabled and about 0 seconds to complete when ICP is disabled
[9 Apr 2010 13:21] Valeriy Kravchuk
Verified just as described with recent mysql-6.0-codebase on 32-biut Linux:

...
mysql> SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2;
Empty set (34.91 sec)

mysql> set session optimizer_switch='index_condition_pushdown=OFF';
Query OK, 0 rows affected (0.09 sec)

mysql> SELECT c1,c3 FROM t3 WHERE c1 >= 'c-10000004=w' and c1 <= 'c-10000006=w' and i1 > 2;
Empty set (0.04 sec)

mysql> select version();
+--------------------+
| version()          |
+--------------------+
| 6.0.14-alpha-debug |
+--------------------+
1 row in set (0.02 sec)
[8 Oct 2010 12:30] Olav Sandstå
The explain for the above query gives the following:

EXPLAIN
SELECT c1,c3 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t3	range	c1	c1	12	NULL	2	Using index condition; Using where

This query will be executed as a range scan using the index (c1(6),i1)
with upper and lower limit as:

  c1 >= 'c-1004=w' and c1 <= 'c-1006=w'

It will also use ICP and push down the condition "i1 > 2". 

Since ICP is in use, the storage engine will also evalute the range
condition as part of the ICP evaluation. So for each index entry
evaluate the following is evaluated:

 1. Range condition: c1 <= 'c-1006=w'
 2. ICP condition: i1 > 2

What is special here is that c1 is of type TINYTEXT. TINYTEXT is a
Blob-type in MyISAM. Blob fields are given some special handling when
reading index entries. For reading index entries the following
function (in mi_key.c) is used:

  static int _mi_put_key_in_record(register MI_INFO *info, uint keynr, 
                                   my_bool unpack_blobs, uchar *record)

The unpack_blobs argument decides whether blob fields should be read
or skipped. When ICP is used this function is called like this:

int mi_check_index_cond(register MI_INFO *info, uint keynr, uchar *record)
{
  if (_mi_put_key_in_record(info, keynr, FALSE, record))
  {
    mi_print_error(info->s, HA_ERR_CRASHED);
    my_errno=HA_ERR_CRASHED;
    return -1;
  }
  return info->index_cond_func(info->index_cond_func_arg);
}

so in this case we do not read the TINYTEXT field from the index. But
we will still evaluate the range condition. This evaluation will be
done against "whatever" the record buffer contains for this field. In
this case it will contain the values which c1 had when doing the
initial index position when searching for the start of the range. So
all range evaluations will do the following comparision:

  'c-1004=w' <= 'c-1006=w'

ie. evaluate to TRUE. Since the ICP condition is FALSE for all
records in the database the ICP code will fail to detect that we are
reading out of the seach range. As a result all index entries for the
entire table will be read instead of just a few.
[16 Oct 2010 10:04] Olav Sandstå
The change to not read BLOB fields from an index entry during executing of pushed index condition in MyISAM was introduced in a bug fix for Bug#42048  	"Discrepancy between MyISAM and Maria's Index condition pushdown implementation".

The change set has revision id sergefp@mysql.com-20090125165907-e3veaevraj2i924g
and has the following comment: 

"Add a parameter to _m{i,a}_put_key_in_record() which tells it not to
unpack blobs because unpacking blobs from index condition check function
modifies MyISAM/Maria's internal state in a way that's not expected by the
rest of its code."
[19 Oct 2010 12:56] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/121134

3264 Olav.Sandstaa@sun.com	2010-10-19
      Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT
      
      When an index condition is pushed down to MyISAM the content of (most
      of) the index entry will be read into the record buffer. This will be
      used for evaluating the index condition and the range check in case of
      a range scan. When reading the content of the index entry all fields
      will be read with the exception of BLOB fields. This is not a problem
      for evaluating the index condition itself since the server will not
      push down an index condition that contains a BLOB field. But it is a
      problem for the range evaluation that is performed as part of the ICP
      implementation in MyISAM since this might be done using a "non-read"
      BLOB value.
      
      A simplified version of the query given in the bug report:
      
      SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2;
      
      it will be done as a range scan and following happens :
      
      1. MyISAM locates the first record satisfying the lower range and this
      record will be read into the record buffer. So after this the
      content of the record buffer for the BLOB field will have the value:
      
         "c-1004=w"
      
      2. When the server requests the next record in the range MyISAM will
      due to ICP being used read in the index entry (but not including the
      BLOB field), so when the end of range check is done it will evaluate
      the upper range value (c1 <= 'c-1006=w') against the value of the
      first record read (ie. it will evaluate to true) and since the ICP
      condition (i1 > 1) evaluates to false it will skip this record and go
      to the next record. This creates a loop that instead of stopping at
      the upper range value will loop until it has read the complete index
      (or until the ICP condition evaluates to true).
      
      So in the example in the bug report about 10 million index entries
      will be read and evaluated instead of just 3 or 4.
      
      The fix for this is to let MyISAM reject pushed index conditions
      whenever the index used for ICP contains a BLOB field.
      
      Note that this change disables ICP from being used by MyISAM in some
      cases where it actually was safe to use it. The pushed index condition
      will be reject in all cases where the index contains a BLOB field also
      when the BLOB field is not used in any range evaluation. The reason for
      this is at the time the server pushes the index condition MyISAM does
      not know whether the actually data access will be a range scan with an
      upper range value or not.
      
      The patch contains a simplified version of the original test
      case. This test case only contains 100 records which is too little to
      show any performance regression. The important point with the test
      case is that without the fix the explain will show "Using index
      condition" while with the fix it will say "Using where".
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range 
        queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/null_key_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/null_key_icp.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_icp_mrr.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ storage/myisam/ha_myisam.cc
        Change the ICP implementation for MyISAM to reject any pushed 
        index conditions if the index contains a BLOB field.
        
        The reason for this change is that MyISAM does not read in
        BLOB fields from the index entry as part of the ICP evaluation.
        If it is a range scan where the BLOB field is used as part of the
        upper range limit then the range evaluation will be done against
        a previously read BLOB value and evaluate to true. As a consequence
        MyISAM will fail to detect that it has reached the end of a range
        interval and continue to read the complete index.
[1 Nov 2010 14:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/122437

3272 Olav.Sandstaa@sun.com	2010-11-01
      Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT
      
      When an index condition is pushed down to MyISAM the content of (most
      of) the index entry will be read into the record buffer. This will be
      used for evaluating the index condition and the range check in case of
      a range scan. When reading the content of the index entry all fields
      will be read with the exception of BLOB fields. This is not a problem
      for evaluating the index condition itself since the server will not
      push down an index condition that contains a BLOB field. But it is a
      problem for the range evaluation that is performed as part of the ICP
      implementation in MyISAM since this might be done using a "non-read"
      BLOB value.
      
      A simplified version of the query given in the bug report:
      
      SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2;
      
      it will be done as a range scan and following happens :
      
      1. MyISAM locates the first record satisfying the lower range and this
      record will be read into the record buffer. So after this the
      content of the record buffer for the BLOB field will have the value:
      
         "c-1004=w"
      
      2. When the server requests the next record in the range MyISAM will
      due to ICP being used read in the index entry (but not including the
      BLOB field), so when the end of range check is done it will evaluate
      the upper range value (c1 <= 'c-1006=w') against the value of the
      first record read (ie. it will evaluate to true) and since the ICP
      condition (i1 > 1) evaluates to false it will skip this record and go
      to the next record. This creates a loop that instead of stopping at
      the upper range value will loop until it has read the complete index
      (or until the ICP condition evaluates to true).
      
      So in the example in the bug report about 10 million index entries
      will be read and evaluated instead of just 3 or 4.
      
      The fix for this is to let MyISAM reject pushed index conditions
      whenever the index used for ICP contains a BLOB field.
      
      Note that this change disables ICP from being used by MyISAM in some
      cases where it actually was safe to use it. The pushed index condition
      will be reject in all cases where the index contains a BLOB field also
      when the BLOB field is not used in any range evaluation. The reason for
      this is at the time the server pushes the index condition MyISAM does
      not know whether the actually data access will be a range scan with an
      upper range value or not.
      
      The patch contains a simplified version of the original test
      case. This test case only contains 100 records which is too little to
      show any performance regression. The important point with the test
      case is that without the fix the explain will show "Using index
      condition" while with the fix it will say "Using where".
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range 
        queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/null_key_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/null_key_icp.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_icp_mrr.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ sql/handler.h
        Add documentation for the handler::idx_cond_push() method.
     @ storage/myisam/ha_myisam.cc
        Change the ICP implementation for MyISAM to reject any pushed 
        index conditions if the index contains a BLOB field.
        
        The reason for this change is that MyISAM does not read in
        BLOB fields from the index entry as part of the ICP evaluation.
        If it is a range scan where the BLOB field is used as part of the
        upper range limit then the range evaluation will be done against
        a previously read BLOB value and evaluate to true. As a consequence
        MyISAM will fail to detect that it has reached the end of a range
        interval and continue to read the complete index.
[2 Nov 2010 18:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/122580

3273 Olav.Sandstaa@sun.com	2010-11-02
      Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT
      
      When an index condition is pushed down to MyISAM the content of (most
      of) the index entry will be read into the record buffer. This will be
      used for evaluating the index condition and the range check in case of
      a range scan. When reading the content of the index entry all fields
      will be read with the exception of BLOB fields. This is not a problem
      for evaluating the index condition itself since the server will not
      push down an index condition that contains a BLOB field. But it is a
      problem for the range evaluation that is performed as part of the ICP
      implementation in MyISAM since this might be done using a "non-read"
      BLOB value.
      
      A simplified version of the query given in the bug report:
      
      SELECT c1 FROM t3 WHERE c1 >= 'c-1004=w' and c1 <= 'c-1006=w' and i1 > 2;
      
      it will be done as a range scan and following happens :
      
      1. MyISAM locates the first record satisfying the lower range and this
      record will be read into the record buffer. So after this the
      content of the record buffer for the BLOB field will have the value:
      
         "c-1004=w"
      
      2. When the server requests the next record in the range MyISAM will
      due to ICP being used read in the index entry (but not including the
      BLOB field), so when the end of range check is done it will evaluate
      the upper range value (c1 <= 'c-1006=w') against the value of the
      first record read (ie. it will evaluate to true) and since the ICP
      condition (i1 > 1) evaluates to false it will skip this record and go
      to the next record. This creates a loop that instead of stopping at
      the upper range value will loop until it has read the complete index
      (or until the ICP condition evaluates to true).
      
      So in the example in the bug report about 10 million index entries
      will be read and evaluated instead of just 3 or 4.
      
      The fix for this is to let MyISAM reject pushed index conditions
      whenever the index used for ICP contains a BLOB field.
      
      Note that this change disables ICP from being used by MyISAM in some
      cases where it actually was safe to use it. The pushed index condition
      will be reject in all cases where the index contains a BLOB field also
      when the BLOB field is not used in any range evaluation. The reason for
      this is at the time the server pushes the index condition MyISAM does
      not know whether the actually data access will be a range scan with an
      upper range value or not.
      
      The patch contains a simplified version of the original test
      case. This test case only contains 100 records which is too little to
      show any performance regression. The important point with the test
      case is that without the fix the explain will show "Using index
      condition" while with the fix it will say "Using where".
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#52660 Perf. regr. using ICP for MyISAM on range 
        queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/innodb_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_all.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/myisam_icp_none.result
        Result file for test for Bug#52660 Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT.
     @ mysql-test/r/null_key_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/null_key_icp.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_all.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ mysql-test/r/order_by_icp_mrr.result
        Change in explain output due to fix for Bug#52660 "Perf. regr. using ICP for MyISAM 
        on range queries on an index containing TEXT". With this fix ICP will no longer be
        used for MyISAM if the index for the pushed index condition contains a BLOB field.
     @ sql/handler.h
        Add documentation for the handler::idx_cond_push() method.
     @ storage/myisam/ha_myisam.cc
        Change the ICP implementation for MyISAM to reject any pushed 
        index conditions if the index contains a BLOB field.
        
        The reason for this change is that MyISAM does not read in
        BLOB fields from the index entry as part of the ICP evaluation.
        If it is a range scan where the BLOB field is used as part of the
        upper range limit then the range evaluation will be done against
        a previously read BLOB value and evaluate to true. As a consequence
        MyISAM will fail to detect that it has reached the end of a range
        interval and continue to read the complete index.
[2 Nov 2010 20:04] Olav Sandstå
Patch pushed to mysql-next-mr-opt-backporting with revision id:
olav.sandstaa@sun.com-20101102184747-qfuntqwj021imy9r.
[13 Nov 2010 16:17] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (merge vers: 5.6.99-m5) (pib:21)
[13 Nov 2010 16:30] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)
[23 Nov 2010 3:00] Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.