Bug #41610 key_infix_len can be overwritten causing some group by queries to return no rows
Submitted: 19 Dec 2008 1:59 Modified: 18 Mar 2009 14:56
Reporter: Eric Bergen (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.74, 5.1.30, 6.0.8 OS:Any
Assigned to: Georgi Kodinov CPU Architecture:Any
Tags: Contribution, regression

[19 Dec 2008 1:59] Eric Bergen
Description:
select distinct foo from table where bar=N can return no rows found where select foo from table where bar=N returns rows.

In opt_range.cc:get_best_group_min_max() the variable key_infix_len is passed by ref to get_constant_key_infix(). It's initialized to 0 in get_constant_key_infix. When choosing an index if two indexes are tested a, and b and a is the index that gets chosen key_infix_len can be set to zero by the test for index b. This causes the resulting query to return zero results.

How to repeat:
DROP TABLE IF EXISTS `distinct_fail`;
CREATE TABLE `distinct_fail` (
  `a` int NOT NULL,
  `b` int NOT NULL,
  `c` int NOT NULL,
  `d` int NOT NULL,
  KEY `foo` (`c`,`d`,`a`,`b`),
  KEY `bar` (`c`,`a`,`b`,`d`)
) ENGINE=InnoDB;

INSERT INTO `distinct_fail` VALUES 
  (0,0,0,3),
  (0,0,0,4),
  (0,0,0,4),
  (0,0,0,5),
  (0,0,1,0),
  (0,0,1,0),
  (0,0,1,1),
  (0,0,1,1),
  (0,0,1,2),
  (0,0,1,3),
  (0,0,1,4),
  (0,0,1,5),
  (0,0,2,1),
  (0,0,2,2),
  (0,0,2,2),
  (0,0,2,3),
  (0,0,2,3),
  (0,0,2,4),
  (0,0,2,4),
  (0,0,3,0),
  (0,0,3,0),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,1),
  (0,0,3,3),
  (0,0,3,3),
  (0,0,3,3),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,4),
  (0,0,3,5),
  (0,0,4,0),
  (0,0,4,0),
  (0,0,4,0),
  (0,0,4,1),
  (0,0,4,1),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,3),
  (0,0,4,4),
  (0,0,4,4),
  (0,0,5,0),
  (0,0,5,0),
  (0,0,5,1),
  (0,0,5,3);

analyze table distinct_fail;
select 'This next query should return rows' as message;
select c from distinct_fail where d=4;

select 'If the bug exists this next query will be an empty set' as message;
select distinct c from distinct_fail where d=4;

select 'This is what the result should be' as message;
select distinct c from distinct_fail where d=4 order by a;

Suggested fix:
I have a hacky patch that saves the last known good key_infix_len and uses it if it's available. I think the code should be refactored not to pass the key_infix_len by reference. The patch is available here:
 
http://ebergen.net/patches/distinct_hacky_fix.patch
[19 Dec 2008 4:15] Valeriy Kravchuk
Thank you for the bug report. Verified just as described:

C:\Program Files\MySQL\MySQL Server 5.1\bin>mysql -uroot -proot -P3308 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.0.74-enterprise-gpl-nt MySQL Enterprise Server - Pro Edition (
GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> DROP TABLE IF EXISTS `distinct_fail`;
Query OK, 0 rows affected, 1 warning (0.05 sec)

mysql> CREATE TABLE `distinct_fail` (
    ->   `a` int NOT NULL,
    ->   `b` int NOT NULL,
    ->   `c` int NOT NULL,
    ->   `d` int NOT NULL,
    ->   KEY `foo` (`c`,`d`,`a`,`b`),
    ->   KEY `bar` (`c`,`a`,`b`,`d`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.42 sec)

mysql>
mysql> INSERT INTO `distinct_fail` VALUES
    ->   (0,0,0,3),
    ->   (0,0,0,4),
    ->   (0,0,0,4),
    ->   (0,0,0,5),
    ->   (0,0,1,0),
    ->   (0,0,1,0),
    ->   (0,0,1,1),
    ->   (0,0,1,1),
    ->   (0,0,1,2),
    ->   (0,0,1,3),
    ->   (0,0,1,4),
    ->   (0,0,1,5),
    ->   (0,0,2,1),
    ->   (0,0,2,2),
    ->   (0,0,2,2),
    ->   (0,0,2,3),
    ->   (0,0,2,3),
    ->   (0,0,2,4),
    ->   (0,0,2,4),
    ->   (0,0,3,0),
    ->   (0,0,3,0),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,3),
    ->   (0,0,3,3),
    ->   (0,0,3,3),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,5),
    ->   (0,0,4,0),
    ->   (0,0,4,0),
    ->   (0,0,4,0),
    ->   (0,0,4,1),
    ->   (0,0,4,1),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,4),
    ->   (0,0,4,4),
    ->   (0,0,5,0),
    ->   (0,0,5,0),
    ->   (0,0,5,1),
    ->   (0,0,5,3);
Query OK, 50 rows affected (0.06 sec)
Records: 50  Duplicates: 0  Warnings: 0

mysql>
mysql> analyze table distinct_fail;
+--------------------+---------+----------+----------+
| Table              | Op      | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| test.distinct_fail | analyze | status   | OK       |
+--------------------+---------+----------+----------+
1 row in set (0.05 sec)

mysql> select 'This next query should return rows' as message;
+------------------------------------+
| message                            |
+------------------------------------+
| This next query should return rows |
+------------------------------------+
1 row in set (0.00 sec)

mysql> select c from distinct_fail where d=4;
+---+
| c |
+---+
| 0 |
| 0 |
| 1 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 4 |
+---+
12 rows in set (0.06 sec)

mysql>
mysql> select 'If the bug exists this next query will be an empty set' as messag
e;
+--------------------------------------------------------+
| message                                                |
+--------------------------------------------------------+
| If the bug exists this next query will be an empty set |
+--------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select distinct c from distinct_fail where d=4;
Empty set (0.02 sec)

mysql>
mysql> select 'This is what the result should be' as message;
+-----------------------------------+
| message                           |
+-----------------------------------+
| This is what the result should be |
+-----------------------------------+
1 row in set (0.00 sec)

mysql> select distinct c from distinct_fail where d=4 order by a;
+---+
| c |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 0 |
+---+
5 rows in set (0.03 sec)
[19 Dec 2008 4:26] Valeriy Kravchuk
5.1.30 and 6.0.8 are also affected. While 4.1.22, for example, works correctly:

C:\Program Files\MySQL\MySQL Server 5.1\bin>mysql -uroot -proot -P3306 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 4.1.22-community-nt-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> DROP TABLE IF EXISTS `distinct_fail`;
Query OK, 0 rows affected, 1 warning (0.05 sec)

mysql> CREATE TABLE `distinct_fail` (
    ->   `a` int NOT NULL,
    ->   `b` int NOT NULL,
    ->   `c` int NOT NULL,
    ->   `d` int NOT NULL,
    ->   KEY `foo` (`c`,`d`,`a`,`b`),
    ->   KEY `bar` (`c`,`a`,`b`,`d`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (1.53 sec)

mysql>
mysql> INSERT INTO `distinct_fail` VALUES
    ->   (0,0,0,3),
    ->   (0,0,0,4),
    ->   (0,0,0,4),
    ->   (0,0,0,5),
    ->   (0,0,1,0),
    ->   (0,0,1,0),
    ->   (0,0,1,1),
    ->   (0,0,1,1),
    ->   (0,0,1,2),
    ->   (0,0,1,3),
    ->   (0,0,1,4),
    ->   (0,0,1,5),
    ->   (0,0,2,1),
    ->   (0,0,2,2),
    ->   (0,0,2,2),
    ->   (0,0,2,3),
    ->   (0,0,2,3),
    ->   (0,0,2,4),
    ->   (0,0,2,4),
    ->   (0,0,3,0),
    ->   (0,0,3,0),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,1),
    ->   (0,0,3,3),
    ->   (0,0,3,3),
    ->   (0,0,3,3),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,4),
    ->   (0,0,3,5),
    ->   (0,0,4,0),
    ->   (0,0,4,0),
    ->   (0,0,4,0),
    ->   (0,0,4,1),
    ->   (0,0,4,1),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,3),
    ->   (0,0,4,4),
    ->   (0,0,4,4),
    ->   (0,0,5,0),
    ->   (0,0,5,0),
    ->   (0,0,5,1),
    ->   (0,0,5,3);
Query OK, 50 rows affected (0.11 sec)
Records: 50  Duplicates: 0  Warnings: 0

mysql> analyze table distinct_fail;
+--------------------+---------+----------+----------+
| Table              | Op      | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| test.distinct_fail | analyze | status   | OK       |
+--------------------+---------+----------+----------+
1 row in set (0.03 sec)

mysql> select distinct c from distinct_fail where d=4;
+---+
| c |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
+---+
5 rows in set (0.00 sec)

So, this is a regression bug.

The following plans are used for wrong and "shoud be" cases:

mysql> explain select distinct c from distinct_fail where d=4\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: distinct_fail
         type: range
possible_keys: NULL
          key: foo
      key_len: 4
          ref: NULL
         rows: 13
        Extra: Using where; Using index for group-by
1 row in set (0.01 sec)

mysql> explain select distinct c from distinct_fail where d=4 order by a\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: distinct_fail
         type: index
possible_keys: NULL
          key: foo
      key_len: 16
          ref: NULL
         rows: 50
        Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.00 sec)
[15 Jan 2009 2:12] Eric Bergen
4.1.22 works correctly because it didn't include the group by optimizations that trigger the bug.
[13 Feb 2009 14:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/66221

2744 Georgi Kodinov	2009-02-13
      Bug #41610: key_infix_len can be overwritten causing some group by queries to return no rows
      
      The algorithm of determining the best key for loose index scan is doing a loop over the 
      available indexes and selects the one that has the best cost. 
      It retrieves the parameters of the current index into a set of variables.
      If the cost of using the current index is lower than the best cost so far it 
      copies these variables into another set of variables that contain the information for the
      best index so far.
      After having checked all the indexes it uses these variables (outside of the index 
      loop) to create the table read plan object instance.
      The was a single omission : the key_infix/key_infix_len variables were used outside of the 
      loop without being preserved in the loop for the best index so far.
      This causes these variables to get overwritten by the next index(es) checked.
      Fixed by adding variables to hold the data for the current index, passing the new variables 
      to the function that assigns values to them and copying the new variables into the existing ones
      when selecting a new current best index.
      To avoid further such problems moved the declarations of the variables used to keep information
      about the current index inside the loop's compound statement.
      Also fixed a wrong test output in group_min_max.test
      modified:
        mysql-test/r/group_min_max.result
        mysql-test/t/group_min_max.test
        sql/opt_range.cc
[13 Feb 2009 14:48] Georgi Kodinov
Eric, thanks for your analysis. It has helped a lot.
[26 Feb 2009 13:59] Timour Katchaounov
Review comments:

1. Please change key_infix_len to cur_key_infix_len in the line below:

    key_infix_parts= key_infix_len ?
                     (first_non_infix_part - first_non_group_part) : 0;

2. The variable declarations/initialization you moved into the for loop,
makes those variables be reinitialized on every pass, which is not needed.
Please remove all unnecessary initializations, and test with ValGrind.
* check if
  uint cur_param_idx=MAX_KEY;
  really needs to be initialized, and remove the initialization if not needed.
* move the declaration/initialization
  uint pk= param->table->s->primary_key;
  before the loop where it was, and make the variable a const.
* it doesn't seem necessary to enclose just this variable in an extra block
  as we discussed.

3. Remove the assertion below, which seems to be always TRUE:
   DBUG_ASSERT(tree != 0 || cur_param_idx == MAX_KEY);
[27 Feb 2009 13:25] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/67853

2764 Georgi Kodinov	2009-02-27
      Bug #41610: key_infix_len can be overwritten causing some group by queries to
      return no rows
      
      The algorithm of determining the best key for loose index scan is doing a loop
      over the available indexes and selects the one that has the best cost.
      It retrieves the parameters of the current index into a set of variables.
      If the cost of using the current index is lower than the best cost so far it 
      copies these variables into another set of variables that contain the 
      information for the best index so far.
      After having checked all the indexes it uses these variables (outside of the 
      index loop) to create the table read plan object instance.
      The was a single omission : the key_infix/key_infix_len variables were used 
      outside of the loop without being preserved in the loop for the best index 
      so far.
      This causes these variables to get overwritten by the next index(es) checked.
      Fixed by adding variables to hold the data for the current index, passing 
      the new variables to the function that assigns values to them and copying 
      the new variables into the existing ones when selecting a new current best 
      index.
      To avoid further such problems moved the declarations of the variables used 
      to keep information about the current index inside the loop's compound 
      statement.
     @ mysql-test/r/group_min_max.result
        Bug #41610: test case
     @ mysql-test/t/group_min_max.test
        Bug #41610: test case
     @ sql/opt_range.cc
        Bug #41610: copy the infix data for the current best index
[9 Mar 2009 14:13] Bugs System
Pushed into 5.0.79 (revid:joro@sun.com-20090309135922-a0di9ebkxoj4d4wv) (version source revid:staale.smedseng@sun.com-20090227160758-td4jot2la75f9zy1) (merge vers: 5.0.79) (pib:6)
[13 Mar 2009 1:30] Paul DuBois
Noted in 5.0.79 changelog.

Queries that used the loose index scan access method could return no rows.

Setting report to NDI pending push into 5.1.x/6.0.x.
[13 Mar 2009 19:06] Bugs System
Pushed into 5.1.33 (revid:joro@sun.com-20090313111355-7bsi1hgkvrg8pdds) (version source revid:staale.smedseng@sun.com-20090227160332-3k1kc0rao6y07cbp) (merge vers: 5.1.33) (pib:6)
[15 Mar 2009 2:48] Paul DuBois
Noted in 5.1.33 changelog.

Setting to NDI pending push into 6.0.x.
[18 Mar 2009 13:19] Bugs System
Pushed into 6.0.11-alpha (revid:joro@sun.com-20090318122208-1b5kvg6zeb4hxwp9) (version source revid:staale.smedseng@sun.com-20090227155937-ly92xe32djh3o6xo) (merge vers: 6.0.10-alpha) (pib:6)
[18 Mar 2009 14:56] Paul DuBois
Noted in 6.0.11 changelog.
[9 May 2009 16:44] Bugs System
Pushed into 5.1.34-ndb-6.2.18 (revid:jonas@mysql.com-20090508185236-p9b3as7qyauybefl) (version source revid:jonas@mysql.com-20090508100057-30ote4xggi4nq14v) (merge vers: 5.1.33-ndb-6.2.18) (pib:6)
[9 May 2009 17:41] Bugs System
Pushed into 5.1.34-ndb-6.3.25 (revid:jonas@mysql.com-20090509063138-1u3q3v09wnn2txyt) (version source revid:jonas@mysql.com-20090508175813-s6yele2z3oh6o99z) (merge vers: 5.1.33-ndb-6.3.25) (pib:6)
[9 May 2009 18:39] Bugs System
Pushed into 5.1.34-ndb-7.0.6 (revid:jonas@mysql.com-20090509154927-im9a7g846c6u1hzc) (version source revid:jonas@mysql.com-20090509073226-09bljakh9eppogec) (merge vers: 5.1.33-ndb-7.0.6) (pib:6)