Bug #59186 Wrong results of join when ICP is enabled
Submitted: 27 Dec 2010 15:09 Modified: 3 Mar 2011 1:32
Reporter: Valeriy Kravchuk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.6.2-m5 OS:Any
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: index_condition_pushdown, optimizer_switch, regression
Triage: Triaged: D2 (Serious)

[27 Dec 2010 15:09] Valeriy Kravchuk
Description:
If the following query 

SELECT f2 FROM t2
JOIN t1 FORCE KEY (PRIMARY) ON t1.f1 = t2.f4
WHERE t2.f3
AND t2.f3 <= 4
AND t1.f3 IN ( 2 , 1 )
OR t1.f3 > 1
AND t1.f3 < 5
AND t2.f3 BETWEEN 6 AND 6 ;

is executed with index_condition_pushdown=on (the default value for the option) in current mysql-trunk, it returns one row whereas all other plans for this query return zero rows:

macbook-pro:trunk openxs$ bin/mysql -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.2-m5-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE t1 (f1 varchar(10), f3 int(11), PRIMARY KEY (f3)) ;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT IGNORE INTO t1 VALUES ('y',1),('or',5);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE t2 (f3 int(11), f2 varchar(1024), f4 varchar(10), PRIMARY KEY (f3)) ;
Query OK, 0 rows affected (0.15 sec)

mysql> INSERT IGNORE INTO t2 VALUES (6,'RPOYT','y'),(10,'JINQE','m');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT f2 FROM t2 JOIN t1 ON t1.f1 = t2.f4 WHERE t2.f3 AND t2.f3 <= 4 AND t1.f3 IN ( 2 , 1 ) OR t1.f3 > 1 AND t1.f3 < 5 AND t2.f3 BETWEEN 6 AND 6 ;
+-------+
| f2    |
+-------+
| RPOYT |
+-------+
1 row in set (0.02 sec)

mysql> explain SELECT f2 FROM t2 JOIN t1 ON t1.f1 = t2.f4 WHERE t2.f3 AND t2.f3 <= 4 AND t1.f3 IN ( 2 , 1 ) OR t1.f3 > 1 AND t1.f3 < 5 AND t2.f3 BETWEEN 6 AND 6 ;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                                                                            |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------------------------------------------+
|  1 | SIMPLE      | t1    | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using index condition                                                            |
|  1 | SIMPLE      | t2    | range | PRIMARY       | PRIMARY | 4       | NULL |    2 | Using index condition; Using where; Using join buffer (BNL, incremental buffers) |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> show variables like 'optimizer%';
...
| optimizer_switch           | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on |
...

mysql> set session optimizer_switch='index_condition_pushdown=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain SELECT f2 FROM t2 JOIN t1 ON t1.f1 = t2.f4 WHERE t2.f3 AND t2.f3 <= 4 AND t1.f3 IN ( 2 , 1 ) OR t1.f3 > 1 AND t1.f3 < 5 AND t2.f3 BETWEEN 6 AND 6 ;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                                                     |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | t1    | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where                                               |
|  1 | SIMPLE      | t2    | range | PRIMARY       | PRIMARY | 4       | NULL |    2 | Using where; Using join buffer (BNL, incremental buffers) |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-----------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT f2 FROM t2 JOIN t1 ON t1.f1 = t2.f4 WHERE t2.f3 AND t2.f3 <= 4 AND t1.f3 IN ( 2 , 1 ) OR t1.f3 > 1 AND t1.f3 < 5 AND t2.f3 BETWEEN 6 AND 6 ;
Empty set (0.01 sec)

As index condition pushdown is ON by default, we can call this a regression over previous 5.x versions.

This bug was originally reported as MariaDB bug at https://bugs.launchpad.net/maria/+bug/694092

How to repeat:
CREATE TABLE t1 (f1 varchar(10), f3 int(11), PRIMARY KEY (f3)) ;
INSERT IGNORE INTO t1 VALUES ('y',1),('or',5);

CREATE TABLE t2 (f3 int(11), f2 varchar(1024), f4 varchar(10), PRIMARY KEY (f3)) ;
INSERT IGNORE INTO t2 VALUES (6,'RPOYT','y'),(10,'JINQE','m');

SELECT f2 FROM t2 JOIN t1 FORCE KEY (PRIMARY) ON t1.f1 = t2.f4 WHERE t2.f3 AND t2.f3 <= 4 AND t1.f3 IN ( 2 , 1 ) OR t1.f3 > 1 AND t1.f3 < 5 AND t2.f3 BETWEEN 6 AND 6 ;

Suggested fix:
Check https://bugs.launchpad.net/maria/+bug/694092
[27 Dec 2010 15:11] Valeriy Kravchuk
Verified with current mysql-trunk from bzr on Mac OS X. Workaround is to switch ICP off for this kind of queries.
[12 Jan 2011 9:43] Olav Sandstå
The index conditions that are pushed down to the storage engine are:

For table "t1" on the primary key ("f3"):

  ((`test`.`t1`.`f3` in (2,1)) or ((`test`.`t1`.`f3` > 1) and (`test`.`t1`.`f3` < 5)))

For table "t2" on the primary key ("f3"):

  ((`test`.`t2`.`f3` and (`test`.`t2`.`f3` <= 4)) or (`test`.`t2`.`f3` between 6 and 6))
[20 Jan 2011 10:33] Olav Sandstå
Simplified version of test case:
================================

CREATE TABLE t1 (
  pk INTEGER NOT NULL,
  c1 VARCHAR(3) NOT NULL,
  PRIMARY KEY (pk)
);

INSERT INTO t1 VALUES (1,'y'),(0,'or');

CREATE TABLE t2 (
  pk INTEGER NOT NULL,
  c1 VARCHAR(3) NOT NULL,
  c2 VARCHAR(6) NOT NULL,
  PRIMARY KEY (pk)
);

INSERT INTO t2 VALUES (6,'y','RPOYT'),(10,'m','JINQE');

SELECT c2 FROM t1 JOIN t2 ON t1.c1 = t2.c1
WHERE (t2.pk <= 4 AND t1.pk IN (2,1)) OR
      (t1.pk > 1 AND t2.pk BETWEEN 6 AND 6);

DROP TABLE t1, t2;
[20 Jan 2011 12:17] Olav Sandstå
Investigation of why this query produces the wrong result:
==========================================================

This problem does only appear when index condition pushdown is
enabled. According to explain, index condition pushdown is used for
both tables.

1. Index condition pushdown for table "t1" on the primary key:
==============================================================

At the end of the optimization phase table "t1" has the following
select condition:

  ((t1.pk in (2,1)) or (t1.pk > 1))

Since ICP is enabled the optimizer will attempt to push down as much
as possible of this condition to the storage engine. Most of the logic
for determining which part of the condition that can be pushed down is
found in make_cond_for_index(). This function will recursively
traverse the condition tree and for each node determine which parts of
the tree that can be pushed down to the storage engine. In this case
it will evaluate the following sub trees:

  (t1.pk in (2,1)
  (t1.pk > 1)
  ((t1.pk in (2,1)) or (t1.pk > 1))

In this case all of these are found to be safe to push down to the
storage engine and the complete condition is then pushed down. One
important detail to note here is that during this evaluation all nodes
that are considered safe to push down are "marked" using the following
member of Item:

  cond->marker= ICP_COND_USES_INDEX_ONLY;

(so in this case all the above nodes in the tree get this "marker").

After the condition has been pushed down to the storage engine (and
accepted by the storage engine) we compute the "remainder" for the
condition, ie. the part of the complete condition that must be
evaluated by the server. This "remainder" condition is created by the
function make_cond_remainder() which traverses the complete condition
and builds a new condition tree that only contains the parts that have
not been pushed down to the storage engine. To avoid that this
function has to do the entire evaluation for which parts that can be
pushed down it checks the cond->marker member. If this has the value
ICP_COND_USES_INDEX_ONLY it can skip this part of the tree. So in this
case it only has to look at the root node of the original condition and
concludes that everything is pushed down and that there is nothing
left that the server needs to evaluate.

2. Index condition pushdown for table "t2" on the primary key:
==============================================================

At the end of the optimization phase table "t2" has the following
select condition:

  ((t2.c1 = t1.c1) and (((t2.pk <= 4) and (t1.pk in (2,1))) or ((t1.pk > 1) and (t2.pk between 6 and 6))))

make_cond_for_index() produces the following condition that can be
pushed down to the storage engine:

  ((t2.pk <= 4) or (t2.pk between 6 and 6))

So far everything is "seems" to be correct. After this condition is
pushed down we find the "remainder". In this case
make_cond_remainder() computes the following:

  t2.c1 = t1.c1

as the "remainder". This is assigned as "t2"'s select condition and
will be evaluated by the server. This is NOT correct as there are now
parts of the original select condition that do not get evaluated
neither by the storage engine nor the server. This causes the extra
row to appear in the result from the query.

So why do make_cond_remainder() fail in this case? It will traverse
the original select condition and include nodes that have not been
marked with ICP_COND_USES_INDEX_ONLY. When make_cond_for_index() was
running only the two nodes that got pushed down to the storage
engine should have been marked:

  (t2.pk <= 4)
  (t2.pk between 6 and 6))

but when make_cond_remainder is running the entire sub tree:

  ((t2.pk <= 4) and (t1.pk in (2,1))) or ((t1.pk > 1) and (t2.pk between 6 and 6)))

is already marked with ICP_COND_USES_INDEX_ONLY (and thus not included
in the "remainder"). And looking at the leaf nodes, all of them are
marked with ICP_COND_USES_INDEX_ONLY:

  (t2.pk <= 4), (t1.pk in (2,1)), (t1.pk > 1), (t2.pk between 6 and 6)

And the cause for this is: 

The two leaf nodes that should not have been marked with
ICP_COND_USES_INDEX_ONLY are shared between the select condition for
"t1" and "t2" and is stored in the same Item object. These two were
marked with ICP_COND_USES_INDEX_ONLY when running
make_cond_for_index() on table "t1". The marker value is not cleared
before or during running make_cond_for_index() on "t2" and thus these
will still have the ICP_COND_USES_INDEX_ONLY marker when running
make_cond_remainder() for table "t2". This causes them to not be
included in the select condition to be evaluated by the server for
table "t2".
[20 Jan 2011 22:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129298

3539 Olav Sandstaa	2011-01-20
      Fix for Bug#59186 Wrong results of join when ICP is enabled
      
      When index condition pushdown was used an extra row appeared in the
      result set due to parts of the select condition for the second table
      in the JOIN was not evaluated. 
      
      This was caused when computing the "remainder" for the second table's
      select condition after pushing down parts of it to the storage
      engine. If parts of the select condition was common for both tables in
      the JOIN and the common part was pushed down for the first table the
      common part could have the marker field set to
      ICP_COND_USES_INDEX_ONLY during evaluation of make_cond_for_index()
      for the first table. If the common part of the select condition was
      not pushed down for the second table it would still be marked with
      ICP_COND_USES_INDEX_ONLY when computing the remainder for the select
      condition for the second table. This would cause that this part of the
      select condition neither was pushed down to the storage engine nor
      included in the select condition to be evaluated by the server.
      
      The fix for this is to extend make_cond_for_index() so that it clears
      the marker field for the parts of the item tree that it decides should
      not be pushed down to the storage engine. This will prevent that
      common items in select conditions for different tables that have been
      marked with ICP_COND_USES_INDEX_ONLY when evaluating one table does
      not keep this value when computing the "remainder" for a following
      table in a JOIN.
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#59186 Wrong results of join when ICP is enabled.
     @ mysql-test/r/innodb_icp.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/innodb_icp_none.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/myisam_icp.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/myisam_icp_none.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ sql/sql_select.cc
        Reset the condition's marker field in make_cond_for_index() when it has
        determined that this part of the select condition should not be
        included in the condition to be pushed down to the storage engine. 
        
        The reason this must be cleared is that if the condition
        is a common part of for the select condition of two tables in
        a JOIN operation then the marker field might have gotten the value
        set to ICP_COND_USES_INDEX_ONLY when evaluating the select
        condition for the first table. If this is the case we need
        to reset it to avoid that this part of the select condition
        is wrongly concluded to not be needed in the select condition
        to be evaluated by the server after pushing parts of the
        condition down to the storage engine.
[28 Jan 2011 11:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129844

3565 Olav Sandstaa	2011-01-28
      Fix for Bug#59186 Wrong results of join when ICP is enabled
      
      When index condition pushdown was used an extra row appeared in the
      result set due to parts of the select condition for the second table
      in the JOIN was not evaluated. 
      
      This was caused when computing the "remainder" for the second table's
      select condition after pushing down parts of it to the storage
      engine. If parts of the select condition was common for both tables in
      the JOIN and the common part was pushed down for the first table the
      common part could have the marker field set to
      ICP_COND_USES_INDEX_ONLY during evaluation of make_cond_for_index()
      for the first table. If the common part of the select condition was
      not pushed down for the second table it would still be marked with
      ICP_COND_USES_INDEX_ONLY when computing the remainder for the select
      condition for the second table. This would cause that this part of the
      select condition neither was pushed down to the storage engine nor
      included in the select condition to be evaluated by the server.
      
      The fix for this is to extend make_cond_for_index() so that it clears
      the marker field for the parts of the item tree that it decides should
      not be pushed down to the storage engine. This will prevent that
      common items in select conditions for different tables that have been
      marked with ICP_COND_USES_INDEX_ONLY when evaluating one table does
      not keep this value when computing the "remainder" for a following
      table in a JOIN.
      
      The patch also contains some re-write and clean-up code to 
      make_cond_for_index():
      -remove unnessary cast for return of NULL.
      -ensure that make_cond_for_index() is always called with a cond value.
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#59186 Wrong results of join when ICP is enabled.
     @ mysql-test/r/innodb_icp.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/innodb_icp_none.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/myisam_icp.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ mysql-test/r/myisam_icp_none.result
        Result file for the test for Bug#59186 Wrong results of join when 
        ICP is enabled.
     @ sql/sql_select.cc
        Reset the condition's marker field in make_cond_for_index() when it has
        determined that this part of the select condition should not be
        included in the condition to be pushed down to the storage engine. 
        
        The reason this must be cleared is that if the condition
        is a common part of for the select condition of two tables in
        a JOIN operation then the marker field might have gotten the value
        set to ICP_COND_USES_INDEX_ONLY when evaluating the select
        condition for the first table. If this is the case we need
        to reset it to avoid that this part of the select condition
        is wrongly concluded to not be needed in the select condition
        to be evaluated by the server after pushing parts of the
        condition down to the storage engine.
[28 Jan 2011 12:58] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:olav.sandstaa@oracle.com-20110128125704-qoynw29whkg93549) (version source revid:olav.sandstaa@oracle.com-20110128125704-qoynw29whkg93549) (merge vers: 5.6.2) (pib:24)
[3 Mar 2011 1:32] Paul Dubois
Noted in 5.6.2 changelog.

With index condition pushdown enabled, a join could produce an extra
row due to parts of the select condition for the second table in the
join not being evaluated. 

CHANGESET - http://lists.mysql.com/commits/129844