Bug #16203 GROUP BY on column used in where returns false empty result
Submitted: 4 Jan 2006 20:42 Modified: 21 Feb 2006 22:21
Reporter: Uri H Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.0.19-BK, 5.0.18 OS:Linux (Linux, Windows)
Assigned to: Igor Babaev CPU Architecture:Any

[4 Jan 2006 20:42] Uri H
Description:
When I use a table with two columns that are also the primary index, and run a DISTINCT query with a LIKE clause, no results are returned, although I'd normally expect a result.

How to repeat:
The following script recreates the problem under 3 machines I've tested:

--- Setting everything up...
DROP TABLE IF EXISTS `sample01`;
CREATE TABLE `sample01` (
  `colone` varchar(5),
  `coltwo` int(11),
  PRIMARY KEY  (`colone`,`coltwo`)
) ENGINE=MyISAM;

INSERT INTO `sample01` (`colone`, `coltwo`) VALUES ('AA',1),('AA',2),('AA',3),('AA',4),('BB',5);

OPTIMIZE TABLE `sample01`;
------------------------------------------------------------------

--- This query works fine. 1 line is returned.
SELECT DISTINCT `colone` FROM `sample01` WHERE (`colone` LIKE 'A%')\p;
                
--- The next query doesn't work for some reason. No results are returned, while I'd expect also 1 line.
SELECT DISTINCT `colone` FROM `sample01` WHERE (`colone` LIKE 'B%')\p;

--- Cleaning up
DROP TABLE `sample01`

(Note: The same thing happens when trying group by:
--- This query works fine. 1 line is returned.
SELECT `colone` FROM `sample01` WHERE (`colone` LIKE 'A%') GROUP BY `colone`\p;
                
--- The next query doesn't work for some reason. No results are returned, while I'd expect also 1 line.
SELECT `colone` FROM `sample01` WHERE (`colone` LIKE 'B%') GROUP BY `colone`\p;
)
[5 Jan 2006 16:21] Valeriy Kravchuk
Thank you for a bug report. Verified just as described on 5.0.19-BK (ChangeSet@1.2010.1.2, 2006-01-04 21:39:39+03:00) on Linux:

mysql> CREATE TABLE `sample01` (
    ->   `colone` varchar(5),
    ->   `coltwo` int(11),
    ->   PRIMARY KEY  (`colone`,`coltwo`)
    -> ) ENGINE=MyISAM;

Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO `sample01` (`colone`, `coltwo`) VALUES
    -> ('AA',1),('AA',2),('AA',3),('AA',4),('BB',5);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> OPTIMIZE TABLE `sample01`;
+---------------+----------+----------+----------+
| Table         | Op       | Msg_type | Msg_text |
+---------------+----------+----------+----------+
| test.sample01 | optimize | status   | OK       |
+---------------+----------+----------+----------+
1 row in set (0.03 sec)

mysql> SELECT DISTINCT `colone` FROM `sample01` WHERE (`colone` LIKE 'A%');
+--------+
| colone |
+--------+
| AA     |
+--------+
1 row in set (0.08 sec)

mysql> SELECT DISTINCT `colone` FROM `sample01` WHERE (`colone` LIKE 'B%');
Empty set (0.00 sec)

mysql> select * from sample01;
+--------+--------+
| colone | coltwo |
+--------+--------+
| AA     |      1 |
| AA     |      2 |
| AA     |      3 |
| AA     |      4 |
| BB     |      5 |
+--------+--------+
5 rows in set (0.00 sec)

mysql> SELECT `colone` FROM `sample01` WHERE `colone` LIKE 'B%' group by `colone`;
Empty set (0.00 sec)

mysql> SELECT `colone` FROM `sample01` WHERE `colone` LIKE 'A%' group by `colone`;
+--------+
| colone |
+--------+
| AA     |
+--------+
1 row in set (0.01 sec)

mysql> SELECT `colone` FROM `sample01` WHERE `colone` LIKE 'B%';
+--------+
| colone |
+--------+
| BB     |
+--------+
1 row in set (0.00 sec)

mysql> SELECT `colone` FROM `sample01` WHERE `colone` LIKE 'A%';
+--------+
| colone |
+--------+
| AA     |
| AA     |
| AA     |
| AA     |
+--------+
4 rows in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.19    |
+-----------+
1 row in set (0.00 sec)
[13 Jan 2006 20:27] Gunnar von Boehn
To clarify, the problem is not the LIKE alone but the combination
of DISTINCT and usage of the distinct column in the where part.

Another selects which fail is:
SELECT DISTINCT(colone) FROM sample01 WHERE colone = 'BB';

The problem is new in 5.0, I could not repeat it in the MySQL 4.1 tree

Kind regards
Gunnar von Boehn
[13 Jan 2006 20:41] Gunnar von Boehn
Updated Synopsis as the problem is the Group by / Distinct.
Expl: A DISTINCT() on a single table is translated to GROUP BY.

Regards
Gunnar von Boehn
[5 Feb 2006 6:09] Igor Babaev
mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.19-debug |
+--------------+

mysql> CREATE TABLE t1 (a varchar(5), b int(11), PRIMARY KEY (a,b));
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t1 VALUES ('AA',1), ('AA',2), ('AA',3), ('BB',1), ('AA',4);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> OPTIMIZE TABLE t1;
+---------+----------+----------+----------+
| Table   | Op       | Msg_type | Msg_text |
+---------+----------+----------+----------+
| test.t1 | optimize | status   | OK       |
+---------+----------+----------+----------+
1 row in set (0.00 sec)

mysql> SELECT a FROM t1 WHERE a='AA' GROUP BY a;
+----+
| a  |
+----+
| AA |
+----+
1 row in set (0.00 sec)

mysql> SELECT a FROM t1 WHERE b='BB' GROUP BY a;
Empty set (0.00 sec)

We have a wrong result for the last qyery due to a bug in the function cost_group_min_max that
causes an incorrect detection of an empty range for the primary key (a,b).

The following EXPLAIN commands indirectly display this problem:

mysql> EXPLAIN SELECT a FROM t1 WHERE a='AA' GROUP BY a;
+----+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra                    |
+----+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
|  1 | SIMPLE      | t1    | ref  | PRIMARY       | PRIMARY | 7       | const |    3 | Using where; Using index |
+----+-------------+-------+------+---------------+---------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT a FROM t1 WHERE a='BB' GROUP BY a;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | const | PRIMARY       | NULL | NULL    | NULL |    1 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
[5 Feb 2006 6:28] Igor Babaev
In the previous report one should read

mysql> SELECT a FROM t1 WHERE a='BB' GROUP BY a;
Empty set (0.00 sec)

instead of

mysql> SELECT a FROM t1 WHERE b='BB' GROUP BY a;
Empty set (0.00 sec)
[6 Feb 2006 19:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/2210
[10 Feb 2006 5:26] Igor Babaev
ChangeSet
  1.2032 06/02/06 11:35:13 igor@rurik.mysql.com +3 -0
  Fixed bug #16203.
  If check_quick_select returns non-empty range then the function cost_group_min_max
  cannot return 0 as an estimate of the number of retrieved records.
  Yet the function erroneously returned 0 as the estimate in some situations.

  sql/opt_range.cc
    1.204 06/02/06 11:35:06 igor@rurik.mysql.com +1 -0
    Fixed bug #16203.
    If check_quick_select returns non-empty range then the function cost_group_min_max
    cannot return 0 as an estimate of the number of retrieved records.
    Yet the function erroneously returned 0 as the estimate in some situations. 

  mysql-test/t/group_min_max.test
    1.18 06/02/06 11:35:06 igor@rurik.mysql.com +21 -0
    Added a test case for bug #16203.

  mysql-test/r/group_min_max.result
    1.20 06/02/06 11:35:05 igor@rurik.mysql.com +27 -0
    Added a test case for bug #16203.

The fix will appear in 5.0.19 and 5.1.7
[21 Feb 2006 22:21] Mike Hillyer
Documented in 5.0.19 and 5.1.7 changelogs:
      
      <listitem>
        <para>
          Using <literal>GROUP BY</literal> on column used in
          <literal>WHERE</literal> clause could cause empty set to be
          returned. (Bug #16203)
        </para>
      </listitem>