Bug #58838 Wrong results with HAVING + LIMIT without GROUP BY when ICP is enabled
Submitted: 9 Dec 2010 12:05 Modified: 3 Mar 2011 2:27
Reporter: John Embretsen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.1-m5 OS:Any
Assigned to: Olav Sandstå CPU Architecture:Any
Tags: index_condition_pushdown, optimizer_switch, wl5639

[9 Dec 2010 12:05] John Embretsen
Description:
Using a server with the "index_condition_pushdown" (ICP) feature available (current mysql-trunk (5.6.1) and above), the server returns wrong results when ICP is enabled, with queries against MyISAM involving HAVING and LIMIT, when the column in the WHERE condition contains NULL.

Example where "LIMIT 1" and "LIMIT 2" give wrong/strange results*:

SELECT * FROM t1 ORDER BY pk;
pk      col_int_key
1       37
2       8
3       -25
4       NULL
5       55

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk;
pk
3

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 0;
pk

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 1;
pk
1

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 2;
pk
1
2

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 5;
pk
3

SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 100;
pk
3

Note that if the data in column col_int_key does not contain a NULL value, the correct result (single row, 3) is returned. If GROUP BY is added, the correct result is also returned. If using "WHERE ... AND pk = 3" instead of HAVING, the correct result is returned.

If switching off ICP (SET SESSION optimizer_switch='index_condition_pushdown=off'), the correct result is returned.

*) Using HAVING without GROUP BY and/or aggregates is questionable, and it may be debatable what the correct results should be (MySQL manual says "Do not use HAVING for items that should be in the WHERE clause", http://dev.mysql.com/doc/refman/5.5/en/select.html). However, MySQL does allow it by default. In the above it is assumed that the HAVING clause is converted/added to the WHERE clause in the query.

See also similar Bug#46077.

How to repeat:
CREATE TABLE t1 (
  pk INT NOT NULL AUTO_INCREMENT,
  col_int_key INT DEFAULT NULL,
  PRIMARY KEY (pk),
  KEY col_int_key (col_int_key)
) ENGINE=MyISAM;

INSERT INTO t1 VALUES (1,37);
INSERT INTO t1 VALUES (2,8);
INSERT INTO t1 VALUES (3,-25);
INSERT INTO t1 VALUES (4,NULL);
INSERT INTO t1 VALUES (5,55);

SELECT * FROM t1 ORDER BY pk;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 0;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 1;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 2;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 5;
SELECT pk FROM t1 WHERE col_int_key <> 1 HAVING pk = 3 ORDER BY pk LIMIT 100;
[30 Dec 2010 15:18] Olav Sandstå
Explain output for the "LIMIT 1" query:

explain SELECT pk FROM t1 WHERE c1 <> 1 HAVING pk = 3 ORDER BY pk LIMIT 1;
id      select_type     table   type    possible_keys   key     key_len ref    rows     Extra
1       SIMPLE  t1      range   col_int_key     col_int_key     5       NULL   4
        Using index condition; Using filesort
[30 Dec 2010 15:58] Olav Sandstå
Here is an overview of what causes these queries to produce a wrong
result when index condition pushdown is enabled:

1. During the optimization phase the optimizer decides to run this a
   range query using the secondary index col_int_key (see above explain
   output).

2. Based on this the following part of the where condition is pushed down to
   the handler on the col_int_key index:

    `test`.`t1`.`c1` <> 1

   (and thus also removed from the where condition that the server
   will evaluate).

3. During execution phase (in JOIN::exec()) we determines that the
   HAVING part of the query can be included in the where
   condition. See the following code:

      // Some tables may have been const
      curr_join->tmp_having->update_used_tables();
      JOIN_TAB *curr_table= &curr_join->join_tab[curr_join->const_tables];
      table_map used_tables= (curr_join->const_table_map |
			      curr_table->table->map);

      Item* sort_table_cond= make_cond_for_table(curr_join->tmp_having,
						 used_tables,
						 used_tables, 0);
      if (sort_table_cond)
      {
	if (!curr_table->select)
	  if (!(curr_table->select= new SQL_SELECT))
	    DBUG_VOID_RETURN;
	if (!curr_table->select->cond)
	  curr_table->select->cond= sort_table_cond;
	else
	{
	  if (!(curr_table->select->cond=
		new Item_cond_and(curr_table->select->cond,
				  sort_table_cond)))
	    DBUG_VOID_RETURN;
	  curr_table->select->cond->fix_fields(thd, 0);
	}
        curr_table->set_select_cond(curr_table->select->cond, __LINE__);
	curr_table->select_cond->top_level_item();

   This code will add the following to the select condition for this query:

      `test`.`t1`.`pk` = 3

4. During execution phase we call test_if_skip_sort_order() to see if
   there are more optimal ways for retrieving data than using the
   col_in_key index. Since we have pushed part of the where condition
   down to the handler we use the orignal select condition stored in
   pre_idx_push_select_cond instead of the current select condition
   when doing this evaluation. The cost evaluation in
   test_if_skip_sort_order() determines (due to the LIMIT 1 part of
   the query) that it would be better to use the primary key for
   retrieving data instead of the col_int_key. Since we here change
   index we need to make sure that the part of the original where
   condition previously pushed down to the handler will be evaluated
   by the server. To do this we let the select condition for the table
   be the condition stored in pre_idx_push_select_cond variable.

   The pre_idx_push_select_cond variable was given its value in step 2
   above. It does not contain the having condition that was included
   in step 3 above.

   So the effect of test_if_skip_sort_order() is that the following
   change takes place for value of the table's select condition:

   At enter: `test`.`t1`.`pk` = 3
   At exit:  `test`.`t1`.`c1` <> 1

   As a result of changing index we have lost the HAVING condition from
   the select condition that will be evaluated by the server. This causes
   the wrong result to be returned from the LIMIT queries.
[30 Dec 2010 16:01] Olav Sandstå
See also Bug#52605 that contains a similar situation relating to wrong result after having switched index during executions for LIMIT queries.
[3 Jan 2011 14:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127796

3451 Olav Sandstaa	2011-01-03
      Fix for Bug#58838 Wrong results with HAVING + LIMIT without GROUP BY 
                        when ICP is enabled.
      
      The wrong result was caused due to "loosing" the HAVING condition
      when changing to use a different index than originally selected by the
      optimizer. 
      
      During the optimization phase it was decided to use a secondary index
      for retrieving data from the table. Since ICP was enabled the WHERE
      condition was pushed down to the storage engine. A copy of the WHERE
      condition was kept in the JOINTAB's pre_idx_push_select_cond
      member. In the start of the execution phase it was determined that the
      HAVING condition could be included in the JOINTAB's select
      condition. Then in test_if_skip_sort_order() we (since part of the
      select condition has been pushed down to the storage engine) replace
      the JOINTAB's select condition with the original select condition
      (found in the pre_idx_push_select_cond member). Since
      test_if_skip_sort_order() determines that is will be less costly to
      use the primary key for doing the retrieval of data (due to LIMIT) it
      changes to use the primary key. If it had not changed index then
      test_if_skip_sort_order() would have restored the select condition to
      what it was when it was called. But due to changing index the part of
      the select condition that has been pushed down must also be included
      in the select condition. To achieve this the condition in
      pre_idx_push_select_cond is kept as the JOINTAB's select
      condition. This does not include the HAVING condition and as a result
      the HAVING condition is no longer part of the JOINTAB's select
      condition.
      
      The fix for this problem: When adding the HAVING condition to the
      JOINTAB's select condition also add it to the pre_idx_push_select_cond
      (if this exists).
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp_all.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp_none.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp_all.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp_none.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ sql/sql_select.cc
        When including the HAVING condition as part of the JOINTAB's
        select condition it should also be added to the original
        select condition (stored in pre_idx_push_select_cond) if 
        parts of the select condition has been pushed down to the storage
        engine. This is necessary since the pre_idx_push_select_cond
        is used in test_if_skip_sort_order() and might replace the
        select condition in the cases where test_if_skip_sort_order()
        decides to change to use a different index for retrieving data.
[28 Jan 2011 10:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129835

3563 Olav Sandstaa	2011-01-28
      Fix for Bug#58838 Wrong results with HAVING + LIMIT without GROUP BY 
                        when ICP is enabled.
      
      The wrong result was caused due to "loosing" the HAVING condition
      when changing to use a different index than originally selected by the
      optimizer. 
      
      During the optimization phase it was decided to use a secondary index
      for retrieving data from the table. Since ICP was enabled the WHERE
      condition was pushed down to the storage engine. A copy of the WHERE
      condition was kept in the JOINTAB's pre_idx_push_select_cond
      member. In the start of the execution phase it was determined that the
      HAVING condition could be included in the JOINTAB's select
      condition. Then in test_if_skip_sort_order() we (since part of the
      select condition has been pushed down to the storage engine) replace
      the JOINTAB's select condition with the original select condition
      (found in the pre_idx_push_select_cond member). Since
      test_if_skip_sort_order() determines that is will be less costly to
      use the primary key for doing the retrieval of data (due to LIMIT) it
      changes to use the primary key. If it had not changed index then
      test_if_skip_sort_order() would have restored the select condition to
      what it was when it was called. But due to changing index the part of
      the select condition that has been pushed down must also be included
      in the select condition. To achieve this the condition in
      pre_idx_push_select_cond is kept as the JOINTAB's select
      condition. This does not include the HAVING condition and as a result
      the HAVING condition is no longer part of the JOINTAB's select
      condition.
      
      The fix for this problem: When adding the HAVING condition to the
      JOINTAB's select condition also add it to the pre_idx_push_select_cond
      (if this exists).
     @ mysql-test/include/icp_tests.inc
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp_all.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/innodb_icp_none.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp_all.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ mysql-test/r/myisam_icp_none.result
        Test case for Bug#58838 "Wrong results with HAVING + LIMIT without
        GROUP BY when ICP is enabled".
     @ sql/sql_select.cc
        When including the HAVING condition as part of the JOINTAB's
        select condition it should also be added to the original
        select condition (stored in pre_idx_push_select_cond) if 
        parts of the select condition has been pushed down to the storage
        engine. This is necessary since the pre_idx_push_select_cond
        is used in test_if_skip_sort_order() and might replace the
        select condition in the cases where test_if_skip_sort_order()
        decides to change to use a different index for retrieving data.
[28 Jan 2011 10:23] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:olav.sandstaa@oracle.com-20110128102156-bbdh0eufqr9y48u7) (version source revid:olav.sandstaa@oracle.com-20110128102156-bbdh0eufqr9y48u7) (merge vers: 5.6.2) (pib:24)
[3 Mar 2011 2:27] Paul DuBois
Noted in 5.6.2 changelog.

With index condition pushdown enabled, incorrect results were
returned for queries on MyISAM tables involving HAVING and LIMIT,
when the column in the WHERE condition contained NULL.

CHANGESET - http://lists.mysql.com/commits/129835