Bug #42846 wrong result returned for range scan when using covering index
Submitted: 13 Feb 2009 18:45 Modified: 18 Dec 2009 13:21
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:4.1.24, 5.0.74, 5.1.31, 6.0.9 OS:Any
Assigned to: Martin Hansson CPU Architecture:Any
Triage: Triaged: D2 (Serious)

[13 Feb 2009 18:45] Shane Bester
Description:
Extra row is returned in the first query:

mysql> select sql_no_cache * from `t2` where
    -> (c1 <> 'b' and c2 > '4') or
    -> (c2 <= '5' and c1 <> 2) or
    -> (c1 = 5 and c2 <> '2');
<cut>
5 rows in set (0.00 sec)

mysql>
mysql> select sql_no_cache * from `t2`;
<cut>
4 rows in set (0.00 sec)

The row ('g','9') is duplicated by accident in the select using index for range scan. 

How to repeat:
drop table if exists `t2`;
create table `t2`(`c1` varchar(20),`c2` int,key (`c2`,`c1`)) engine=innodb;
insert into `t2` values ('ddd','1'), ('c','2'), ('d','4'), ('g','9');

select sql_no_cache * from `t2` where (c1 <> 'b' and c2 > '4') or (c2 <= '5' and c1 <> 2) or (c1 = 5 and c2 <> '2');

select sql_no_cache * from `t2`;

Suggested fix:
i dunno, but 'select * from t2 where ...' should never return more rows than 'select * from t2'!
[13 Feb 2009 19:00] Shane Bester
another case that returns too many rows for the first select:

drop table if exists `t1`;
create table `t1`(`c1` varchar(20),`c2` char(20),key(`c2`,`c1`))engine=innodb ;
insert into `t1`(`c1`,`c2`) values ('a', '0'),('b', '1'),('c', '2'),('d','4'),('g','9'),('y','43'),('ddd','1');

select * from `t1` where  
(c1 <> '0' and c2 <> 'c')  or 
(c1 = '3' and c2 <= '5')  or  
(c2 <> 'e' and c2 > '3');

select * from `t1`;
[13 Feb 2009 19:04] Valeriy Kravchuk
Verified just as described:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.31-enterprise-gpl-pro-debug MySQL Enterprise Server - Pro Ed
ition Debug (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> use test
Database changed
mysql> drop table if exists `t2`;
Query OK, 0 rows affected (0.16 sec)

mysql> create table `t2`(`c1` varchar(20),`c2` int,key (`c2`,`c1`)) engine=innod
b;
Query OK, 0 rows affected (0.14 sec)

mysql> insert into `t2` values ('ddd','1'), ('c','2'), ('d','4'), ('g','9');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select sql_no_cache * from `t2` where (c1 <> 'b' and c2 > '4') or (c2 <=
'5' and c1 <> 2)
    -> or (c1 = 5 and c2 <> '2');
+------+------+
| c1   | c2   |
+------+------+
| ddd  |    1 |
| c    |    2 |
| d    |    4 |
| g    |    9 |
| g    |    9 |
+------+------+
5 rows in set (0.03 sec)

mysql> select sql_no_cache * from `t2`;
+------+------+
| c1   | c2   |
+------+------+
| ddd  |    1 |
| c    |    2 |
| d    |    4 |
| g    |    9 |
+------+------+
4 rows in set (0.00 sec)

mysql> explain select sql_no_cache * from `t2` where (c1 <> 'b' and c2 > '4') or
 (c2 <= '5' and c1 <> 2) or (c1 = 5 and c2 <> '2')\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: range
possible_keys: c2
          key: c2
      key_len: 5
          ref: NULL
         rows: 3
        Extra: Using where; Using index
1 row in set (0.03 sec)

Looks like range scan for the covering index is broken.
[13 Feb 2009 19:15] Shane Bester
testcase for myisam:

drop table if exists `t1`;
create table `t1`(`c1` char(9),`c2` char(9),key(`c2`,`c1`))engine=myisam;
insert into `t1`(`c1`,`c2`) values ('a', '0'),('b', '1'),
('c', '2'),('g','9'),('ddd','1');

select * from `t1` where  
(c2 >= '3' AND c1 = 'b')  OR  
(c1 <> 'b' AND c2 > 'a')  OR 
(c1 >= 'a' AND c2 >= '5') OR
(c2 > '2' AND c2 <> '2');

workaround is to use "ignore index" hint.
[18 Feb 2009 9:08] Shane Bester
testcase for maria (6.0.9)

drop table if exists `t1`;
create table `t1` (  `a` char(3),`b` int,key `i`(`a`))engine=maria default charset=latin1;
insert into `t1` values ('o',2),('o',2);
select * from `t1` where  (`b` <> 1 and `a` < 'a');
select * from `t1` ignore index(`i`) where (`b` <> 1 and `a` < 'a');
[13 May 2009 15:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/73950

2895 Martin Hansson	2009-05-13
      Bug#42846: wrong result returned for range scan when using
      covering index
      
      When an open-ended range was merged with a closed range, the 
      closed range copied the (non-existent) endpoint from the open
      range, thereby violating the range optimizer invariant that 
      ranges must be disjoint. This led to duplicate answers.
      Fixed by not copying endpoints off open-ended ranges.
     @ mysql-test/r/range.result
        Bug#42846: Result
     @ mysql-test/t/range.test
        Bug#42846: Test case
     @ sql/opt_range.cc
        Bug#42846: Fix
[27 May 2009 6:40] Martin Hansson
This is actually two bugs in the range optimizer module (SEL_ARG data
structure).

Two important points regarding the SEL_ARG structure:

1) It corresponds to a boolean expression (AND, OR) over ranges (e.g. 1 <
field <= 10)

2) Whenever an OR expression appears, the sets matched by its two branches
must not overlap, or a tuple in the overlapping area will appear twice in the
result.

3) It is written in "keypart order", meaning that in each disjunct
(OR-branch), keyparts appear in the same order as they do in the index.

A legal SEL_ARG expression:

 1 < keypart1 < 5 AND ( 2 < keypart2 < 4 OR keypart 2 < 5 ) OR keypart1 > 6

( I use i < x < j as a shorthand for i < x AND x < j)

The function key_or() creates a new OR node, adding its two arguments as
branches, and maintains the delicate SEL_ARG invariant. This involves a lot of
details that are irrelevant to this bug, such as tree balancing and memory
management, so I set them aside.

Problem #1
----------

When OR'ing together expressions such as

(1)   3 <= kp1 AND kp1 < 5 OR 
(2)   5 < kp1 AND kp2 = 3 OR
   - OR -
(3)   3 <= kp1;

The algorithm will detect that the expression (3) has the same lower endpoint
as expression (1) and attempts to unify them, but the result is incorrectly

3 <= kp1 OR 5 < kp1 AND kp2 = 3.

It misses that (3) has no higher endpoint.

Problem #2
----------

When OR'ing together expressions such as

(1)   2 < kp1 < 10 AND kp2 = 2 OR 
(2)   10 <= kp1 < 20 AND kp2 = 3
   - OR -
(3)   1 < kp1 AND kp2 = 2

With problem 1 fixed, the algorithm will detect that the ranges (3) and (1)
overlap and that (3) is open ended. Hence it copies only the lower endpoint: 1
< kp1. Since (3) is open ended, we need to look at (2)'s lower endpoint, 10 <=
kp1, and copy that into (1). This creates the (still invalid) SEL_ARG
expression

(1b)  1 < kp1 < 10 AND kp2 = 2 OR
(2)   10 <= kp1 < 20 AND kp2 = 3

What should be done?
In this case, (3) should have been be split into two ranges:

(3a)   1 < kp1 < 10 AND kp2 = 3     (same as 1b above)

(3b)   20 <= kp1 AND kp2 = 3

And (2) has to be modified to

(2')   10 <= kp1 < 20 AND ( kp2 = 3 OR kp2 = 2 )

Yielding the final result

(3a)   1 < kp1 < 10 AND kp2 = 2 OR 
(2')   10 <= kp1 < 20 AND ( kp2 = 3 OR kp2 = 2 ) OR
(3b)   20 <= kp1 AND kp2 = 3

The problem is symmetric, meaning the following. Above the ranges that trigger
the bug have a lower endpoint and are open-ended. But the same happens for
ranges open in the lower end. Also note that open-endedness is not
required. We trigger the bug as soon as ranges overlap.

I propose to fix Problem #1 first, as it is much simpler. When the patch is
approved I'll file a second bug for Problem #2.
[2 Jun 2009 12:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75460

2895 Martin Hansson	2009-06-02
      Bug#42846: wrong result returned for range scan when using 
      covering index
      When two range predicates were combined under an OR predicate,
      the algorithm tried to merge overlapping ranges into one. But the
      case when a range overlapped several other ranges was not handled.
      This lead to 1) ranges overlapping, which gave repeated results 
      and 2) a range that overlapped several other ranges was cut off. 
      Fixed by
      1) Making sure that a ranges got an upper bound equal 
      to the next range with a greater minimum.
      2) Removing a coutinue statement
     @ mysql-test/r/group_min_max.result
        Bug#42846: Changed query plans
     @ mysql-test/r/range.result
        Bug#42846: Test result.
     @ mysql-test/t/range.test
        Bug#42846: Test case.
     @ sql/opt_range.cc
        Bug#42846: The fix. 
        Part1: Previously, both endpoints from key2 were copied, which is not safe. Since ranges are processed in ascending order of minimum endpoints, it is safe to copy the minimum endpoint from key2 but not the maximum. The maximum may only be copied if there is no other range or the other range's minimum is greater than key2's maximum.
        
        Part 2: Do not simply move to the next key part, the range in key2 may have to be split up, creating more ranges.
[2 Jun 2009 13:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/75468

2895 Martin Hansson	2009-06-02
      Bug#42846: wrong result returned for range scan when using 
      covering index
      When two range predicates were combined under an OR predicate,
      the algorithm tried to merge overlapping ranges into one. But the
      case when a range overlapped several other ranges was not handled.
      This lead to 1) ranges overlapping, which gave repeated results 
      and 2) a range that overlapped several other ranges was cut off. 
      Fixed by
      1) Making sure that a ranges got an upper bound equal 
      to the next range with a greater minimum.
      2) Removing a continue statement
     @ mysql-test/r/group_min_max.result
        Bug#42846: Changed query plans
     @ mysql-test/r/range.result
        Bug#42846: Test result.
     @ mysql-test/t/range.test
        Bug#42846: Test case.
     @ sql/opt_range.cc
        Bug#42846: The fix. 
        Part1: Previously, both endpoints from key2 were copied, which is not safe. Since ranges are processed in ascending order of minimum endpoints, it is safe to copy the minimum endpoint from key2 but not the maximum. The maximum may only be copied if there is no other range or the other range's minimum is greater than key2's maximum.
[5 Oct 2009 8:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85704

3153 Martin Hansson	2009-10-05
      Bug#42846: wrong result returned for range scan when using
      covering index
      
      When two range predicates were combined under an OR
      predicate, the algorithm tried to merge overlapping ranges
      into one. But the case when a range overlapped several other
      ranges was not handled. This lead to
      
      1) ranges overlapping, which gave repeated results and 
      2) a range that overlapped several other ranges was cut off.  
      
      Fixed by 
      
      1) Making sure that a range got an upper bound equal to the
      next range with a greater minimum.
      2) Removing a continue statement
     @ mysql-test/r/group_min_max.result
        Bug#42846: Changed query plans
     @ mysql-test/r/range.result
        Bug#42846: Test result.
     @ mysql-test/t/range.test
        Bug#42846: Test case.
     @ sql/opt_range.cc
        Bug#42846: The fix. 
        
        Part1: Previously, both endpoints from key2 were copied,
        which is not safe. Since ranges are processed in ascending
        order of minimum endpoints, it is safe to copy the minimum
        endpoint from key2 but not the maximum. The maximum may only
        be copied if there is no other range or the other range's
        minimum is greater than key2's maximum.
[5 Oct 2009 8:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85705

3153 Martin Hansson	2009-10-05
      Bug#42846: wrong result returned for range scan when using
      covering index
      
      When two range predicates were combined under an OR
      predicate, the algorithm tried to merge overlapping ranges
      into one. But the case when a range overlapped several other
      ranges was not handled. This lead to
      
      1) ranges overlapping, which gave repeated results and 
      2) a range that overlapped several other ranges was cut off.  
      
      Fixed by 
      
      1) Making sure that a range got an upper bound equal to the
      next range with a greater minimum.
      2) Removing a continue statement
     @ mysql-test/r/group_min_max.result
        Bug#42846: Changed query plans
     @ mysql-test/r/range.result
        Bug#42846: Test result.
     @ mysql-test/t/range.test
        Bug#42846: Test case.
     @ sql/opt_range.cc
        Bug#42846: The fix. 
        
        Part1: Previously, both endpoints from key2 were copied,
        which is not safe. Since ranges are processed in ascending
        order of minimum endpoints, it is safe to copy the minimum
        endpoint from key2 but not the maximum. The maximum may only
        be copied if there is no other range or the other range's
        minimum is greater than key2's maximum.
[5 Oct 2009 8:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85709

3154 Martin Hansson	2009-10-05 [merge]
      Merge of Bug#42846
[9 Oct 2009 9:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86310

3156 Martin Hansson	2009-10-09
      Bug#42846: wrong result returned for range scan when using
      covering index
            
      When two range predicates were combined under an OR
      predicate, the algorithm tried to merge overlapping ranges
      into one. But the case when a range overlapped several other
      ranges was not handled. This lead to
      
      1) ranges overlapping, which gave repeated results and 
      2) a range that overlapped several other ranges was cut off.  
      
      Fixed by 
      
      1) Making sure that a range got an upper bound equal to the
      next range with a greater minimum.
      2) Removing a continue statement
     @ mysql-test/r/group_min_max.result
        Bug#42846: Changed query plans
     @ mysql-test/r/range.result
        Bug#42846: Test result.
     @ mysql-test/t/range.test
        Bug#42846: Test case.
     @ sql/opt_range.cc
        Bug#42846: The fix. 
        
        Part1: Previously, both endpoints from key2 were copied,
        which is not safe. Since ranges are processed in ascending
        order of minimum endpoints, it is safe to copy the minimum
        endpoint from key2 but not the maximum. The maximum may only
        be copied if there is no other range or the other range's
        minimum is greater than key2's maximum.
[9 Oct 2009 11:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86337

3642 Martin Hansson	2009-10-09 [merge]
      Merge of Bug#42846 to -pe
[9 Oct 2009 15:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86412

3157 Martin Hansson	2009-10-09 [merge]
      Merge of bug#42846
[14 Oct 2009 14:39] Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091014143611-cphb0enjlx6lpat1) (version source revid:satya.bn@sun.com-20091013071829-zc4c3go44j6re592) (merge vers: 5.1.40) (pib:13)
[15 Oct 2009 23:53] Paul Dubois
Noted in 5.1.41 changelog.

Incorrect handling of range predicates combined with OR operators
could yield incorrect results.

Setting report to NDI pending 5.5.x+.
[22 Oct 2009 6:33] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 7:05] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091019131708-bc6pv55x6287a0wc) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 19:57] Paul Dubois
Noted in 5.5.0, 6.0.14 changelogs.
[18 Dec 2009 10:41] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:56] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:11] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:25] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[18 Dec 2009 13:21] MC Brown
Already noted in earlier changelogs.