Bug #47650 using group by with rollup without indexes returns incorrect results with where
Submitted: 25 Sep 18:53 Modified: 10 Dec 17:42
Reporter: Derrek Leute
Status: In progress
Category:Server: DML Severity:S3 (Non-critical)
Version:5.0.X, 5.1.39, next-mr OS:Any (Mac OS X Snow leopard, Linux)
Assigned to: Martin Hansson Target Version:5.1+
Tags: rollup, GROUP, count, where
Triage: Triaged: D2 (Serious)

[25 Sep 18:53] Derrek Leute
Description:

The below assumes all innodb tables. Have not tested with other table types.

Using the following query:

SELECT e.employer_id AS employer_id, COUNT(distinct u.user_id) AS cnt FROM e_employer e
LEFT JOIN e_user u ON e.employer_id = u.employer_id WHERE e.employer_id = 10006 GROUP BY
e.employer_id with rollup LIMIT 100;

+-------------+------+
| employer_id |  cnt |
+-------------+------+
|       10006 | 9967 |
|        NULL | 9967 |
+-------------+------+

9967 is the total number of rows in u. The result should have been 71.

This problem only occurs when where = specifies a single row in e. "where in
(10006,10000)" works as expected with or without indexes.

Adding an index on u.employer_id solved the problem. However, it doesn't seem like
indexes should be required to make group by with rollup work correctly.

I noticed this because I have developed a graphical reporting tool for any data
structure. Not every table is indexed at all, let alone properly.

It should also be noted that other ways to get the data i.e. count(*) where id = 10006 do
work as expected with or without indexes.

How to repeat:

Tables A, B, and C.

Table A (At least two records)

id (primary key)

Table B (populated with records linking to all rows of table A)

id (primary key autoincrement)
a_id (id from table a *not* indexed.) 

Table C (Populated with references to all rows of table A)

id (primary key)
a_id (id from table A. indexed)

The following is pseudo code. See the description above for code that actually reproduced
the issue.

SELECT A.id AS a_id, COUNT(distinct B.id) AS cnt FROM A A LEFT JOIN B B ON A.id = B.a_id
WHERE A.id = 1 GROUP BY a.id with rollup LIMIT 100;

results for that join with table b = count of total records in table b.

however doing the same thing with table c (with the index) results in the intended
result.

Suggested fix:
No idea.

Workaround is to add indexes for all joins. However, this seems like it should still work
without indexes even if it is slower.
[25 Sep 19:35] Miguel Solorzano
Thank you for the bug report. Could you please provide the exactly server version which
are you getting the bad results, complete test case create table and insert data
statements, the result you got and the expected ones. Thanks in advance.
[25 Sep 20:07] Derrek Leute
Below is a structure and data dump for this problem. It should be noted that the tables
are MyISAM. So it isn't an innodb issue. Complete version numbers in the mysql dump
below. This happened on 5.0 and 5.1 for me.

The difference between the two queries below is table b does not have an index on a_id
and table c does. See the mysqldump5 below for details.

SELECT A.id AS a_id, COUNT(distinct B.id) AS cnt FROM A A LEFT JOIN B B ON A.id = B.a_id
WHERE A.id = 1 GROUP BY a.id with rollup LIMIT 100;

+------+-----+
| a_id | cnt |
+------+-----+
|    1 |   8 |
| NULL |   8 |
+------+-----+

SELECT A.id AS a_id, COUNT(distinct C.id) AS cnt FROM A A LEFT JOIN C C ON A.id = C.a_id
WHERE A.id = 1 GROUP BY a.id with rollup LIMIT 100;

+------+-----+
| a_id | cnt |
+------+-----+
|    1 |   3 |
| NULL |   3 |
+------+-----+

-- MySQL dump 10.13  Distrib 5.1.39, for apple-darwin10.0.0 (i386)
--
-- Host: localhost    Database: bugreport
-- ------------------------------------------------------
-- Server version	5.1.39

--
-- Table structure for table `a`
--

DROP TABLE IF EXISTS `a`;
CREATE TABLE `a` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;

--
-- Dumping data for table `a`
--

LOCK TABLES `a` WRITE;
INSERT INTO `a` VALUES (1),(2),(3),(4),(5),(6),(7),(8);
UNLOCK TABLES;

--
-- Table structure for table `b`
--

DROP TABLE IF EXISTS `b`;
CREATE TABLE `b` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `a_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;

--
-- Dumping data for table `b`
--

LOCK TABLES `b` WRITE;
INSERT INTO `b` VALUES (1,1),(2,1),(3,1),(4,2),(5,2),(6,2),(7,3),(8,3);
UNLOCK TABLES;

--
-- Table structure for table `c`
--

DROP TABLE IF EXISTS `c`;
CREATE TABLE `c` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `a_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ctable` (`a_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;

--
-- Dumping data for table `c`
--

LOCK TABLES `c` WRITE;
INSERT INTO `c` VALUES (1,1),(2,1),(3,1),(4,2),(5,2),(6,2),(7,3),(8,3);
UNLOCK TABLES;
[26 Sep 22:02] Sveta Smirnova
Thank you for the feedback.

Verified as described.
[10 Nov 15:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/89960

3191 Martin Hansson	2009-11-10
      Bug#47650: using group by with rollup without indexes
                 returns incorrect results with where
      
      A result set for a ROLLUP query outer-joining tables t1 and
      t2 turned t1 into a const table after applying the join
      condition, which in turn caused GROUP BY to be optimized
      away. A filtered table scan was then used on t2 in the
      join. The const access and lack of GROUP BY caused the
      optimizer to use filesort without temporary table to access
      t2. But the join condition, translated into a WHERE is lost
      in the optimization step. Filesort does no handle filtered
      table scans.
      
      Fixed by always sorting with a temporary table if a query
      contains ROLLUP and there is no key to access the joined
      table.
     @ mysql-test/r/func_gconcat.result
        Bug#47650: Expected change in test result
     @ mysql-test/r/group_by.result
        Bug#47650: Test result
     @ mysql-test/r/olap.result
        Bug#47650: Expected change in test result
     @ mysql-test/t/group_by.test
        Bug#47650: Test case
     @ sql/sql_select.cc
        Bug#47650: Fix and organizing surrounding code according to
        coding standard
[16 Nov 12:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90498

3200 Martin Hansson	2009-11-16
      Bug#47650: using group by with rollup without indexes returns
          incorrect results with where : worked on fix.
      
      An ugly hack that serves to prove a point. If this ugly hack 
      turns out to do the job, we could base the fix on it once we
      implement it properly.