Bug #68424 Queries with nested select distinct queries sometimes return incorrect results
Submitted: 19 Feb 2013 6:59 Modified: 19 Apr 2013 4:47
Reporter: Bryan Turner Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.6.10 OS:Linux (Xubuntu 12.10 x64)
Assigned to: CPU Architecture:Any
Tags: regression

[19 Feb 2013 6:59] Bryan Turner
Description:
When running a query with a nested select distinct query, duplicate values are returned. Running the nested select distinct query shows the correct rows. Plugging those rows directly into the outer query shows the correct rows. Running the whole thing using a nested query returns duplicate rows.

I first noticed this with Hibernate queries returning unexpected results, but I was able to replicate the issue using the mysql command line tool directly.

How to repeat:
The following is an excerpt of a dump taken from a schema which exhibits this behaviour:

-- MySQL dump 10.13  Distrib 5.6.10, for linux-glibc2.5 (x86_64)
--
-- Host: localhost    Database: oops
-- ------------------------------------------------------
-- Server version       5.6.10

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `oops_user`
--

DROP TABLE IF EXISTS `oops_user`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `oops_user` (
  `id` int(11) NOT NULL,
  `name` varchar(255) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uk_user_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `oops_user`
--

LOCK TABLES `oops_user` WRITE;
/*!40000 ALTER TABLE `oops_user` DISABLE KEYS */;
INSERT INTO `oops_user` VALUES (2,'admin'),(3,'project admin for test1'),(4,'project admin for test2'),(7,'project reader for test1'),(8,'project reader for test2'),(5,'project writer for test1'),(6,'project writer for test2'),(1,'sysadmin'),(9,'xlicensed');
/*!40000 ALTER TABLE `oops_user` ENABLE KEYS */;
UNLOCK TABLES;

--
-- Table structure for table `oops_weight`
--

DROP TABLE IF EXISTS `oops_weight`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `oops_weight` (
  `perm_id` int(11) NOT NULL,
  `perm_weight` int(11) NOT NULL,
  PRIMARY KEY (`perm_id`),
  UNIQUE KEY `perm_weight` (`perm_weight`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `oops_weight`
--

LOCK TABLES `oops_weight` WRITE;
/*!40000 ALTER TABLE `oops_weight` DISABLE KEYS */;
INSERT INTO `oops_weight` VALUES (9,0),(0,1000),(2,2000),(1,3000),(3,4000),(8,5000),(4,6000),(5,7000),(6,9000),(7,10000);
/*!40000 ALTER TABLE `oops_weight` ENABLE KEYS */;
UNLOCK TABLES;

--
-- Table structure for table `oops_perm`
--

DROP TABLE IF EXISTS `oops_perm`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `oops_perm` (
  `id` int(11) NOT NULL,
  `perm_id` int(11) NOT NULL,
  `project_id` int(11) DEFAULT NULL,
  `group_name` varchar(255) COLLATE utf8_bin DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_perm_project` (`project_id`),
  KEY `fk_perm_user` (`user_id`),
  KEY `granted_perm_weight_fk` (`perm_id`),
  CONSTRAINT `granted_perm_weight_fk` FOREIGN KEY (`perm_id`) REFERENCES `oops_weight` (`perm_id`),
  CONSTRAINT `fk_perm_project` FOREIGN KEY (`project_id`) REFERENCES `project` (`id`),
  CONSTRAINT `fk_perm_user` FOREIGN KEY (`user_id`) REFERENCES `oops_user` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `oops_perm`
--

LOCK TABLES `oops_perm` WRITE;
/*!40000 ALTER TABLE `oops_perm` DISABLE KEYS */;
INSERT INTO `oops_perm` VALUES (1,7,NULL,NULL,1),(2,6,NULL,NULL,2),(3,4,1,NULL,3),(4,4,2,NULL,4),(5,3,1,NULL,5),(6,3,2,NULL,6),(7,2,1,NULL,7),(8,2,2,NULL,8),(9,3,2,NULL,NULL),(10,2,3,NULL,NULL),(12,4,2,'developers',NULL),(14,9,NULL,NULL,9),(15,9,NULL,'licensed_group',NULL),(16,6,NULL,NULL,1),(17,9,NULL,NULL,1),(18,9,NULL,'example-users',NULL),(19,9,NULL,'example-admins',NULL),(20,6,NULL,'example-admins',NULL),(21,7,NULL,'example-admins',NULL),(22,2,3,'group-with-global-and-resource-perms',NULL),(23,9,NULL,'group-with-global-and-resource-perms',NULL),(24,2,3,NULL,9),(25,9,NULL,NULL,9);
/*!40000 ALTER TABLE `oops_perm` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2013-02-18 22:31:40

I created a fresh, empty database using the following:

  create database oops character set utf8 collate utf8_bin;

I imported this excerpt back again using this command:

  mysql oops < tables.dmp

With that data imported, the following queries demonstrate the issue:

1. The nested query:

mysql> select distinct user_id from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and group_name is null and user_id is not null;
+---------+
| user_id |
+---------+
|       1 |
|       2 |
|       9 |
+---------+
3 rows in set (0.00 sec)

2. The outer query, with the nested values provided directly:

mysql> select id, name from oops_user where id in (2, 1, 9) order by name;
+----+-----------+
| id | name      |
+----+-----------+
|  2 | admin     |
|  1 | sysadmin  |
|  9 | xlicensed |
+----+-----------+
3 rows in set (0.01 sec)

3. The combined query

mysql> select id, name from oops_user where id in (select distinct user_id from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and group_name is null and user_id is not null) order by name;
+----+-----------+
| id | name      |
+----+-----------+
|  2 | admin     |
|  1 | sysadmin  |
|  1 | sysadmin  |
|  9 | xlicensed |
|  9 | xlicensed |
+----+-----------+
5 rows in set (0.00 sec)
[19 Feb 2013 6:59] Bryan Turner
Table dump as a standalone file

Attachment: tables.dmp (application/octet-stream, text), 4.56 KiB.

[19 Feb 2013 10:04] MySQL Verification Team
Hello Bryan,

Thank you for the report.

Verified as described on reported version.

Thanks,
Umesh
[20 Feb 2013 9:39] Roy Lyseng
Hi Bryan,

We will fix this problem as soon as possible.

Please try this workaround while you wait:

set optimizer_switch='loosescan=off';
[21 Feb 2013 4:33] Bryan Turner
mysql> set optimizer_switch='loosescan=off';
Query OK, 0 rows affected (0.00 sec)

Still the same query results, though.
[21 Feb 2013 14:51] Guilhem Bichot
Hello Bryan. Thanks for helping us! One more question: could you please run the three queries below in sequence, and post their output?

set optimizer_switch='loosescan=off';

explain select id, name from oops_user where id in (select distinct user_id
from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and
group_name is null and user_id is not null) order by name;

select id, name from oops_user where id in (select distinct user_id
from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and
group_name is null and user_id is not null) order by name;
[25 Feb 2013 4:00] Bryan Turner
Foremost, it appears I need to retract my previous statement that the workaround did not help. It appears that it does, but only for the mysql command line session where it was run. (I had tried to verify it previously by running that command from mysql and then connecting to it remotely; that use case still does not work).

With that said, here's the requested output:

mysql> set optimizer_switch='loosescan=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select id, name from oops_user where id in (select distinct user_id
    -> from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and
    -> group_name is null and user_id is not null) order by name;
+----+-------------+-----------+-------+-----------------------------------------------------+--------------+---------+-------------------+------+------------------------------------+
| id | select_type | table     | type  | possible_keys                                       | key          | key_len | ref               | rows | Extra                              |
+----+-------------+-----------+-------+-----------------------------------------------------+--------------+---------+-------------------+------+------------------------------------+
|  1 | SIMPLE      | oops_user | index | PRIMARY                                             | uk_user_name | 767     | NULL              |    9 | Using where; Using index           |
|  1 | SIMPLE      | oops_perm | ref   | fk_perm_project,fk_perm_user,granted_perm_weight_fk | fk_perm_user | 5       | oops.oops_user.id |    1 | Using where; FirstMatch(oops_user) |
+----+-------------+-----------+-------+-----------------------------------------------------+--------------+---------+-------------------+------+------------------------------------+
2 rows in set (0.01 sec)

mysql> select id, name from oops_user where id in (select distinct user_id
    -> from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and
    -> group_name is null and user_id is not null) order by name;
+----+-----------+
| id | name      |
+----+-----------+
|  2 | admin     |
|  1 | sysadmin  |
|  9 | xlicensed |
+----+-----------+
3 rows in set (0.00 sec)

As soon as I restart mysql, though:

mysql> select id, name from oops_user where id in (select distinct user_id
    -> from oops_perm where perm_id in (9, 5, 6, 7) and project_id is null and
    -> group_name is null and user_id is not null) order by name;
+----+-----------+
| id | name      |
+----+-----------+
|  2 | admin     |
|  1 | sysadmin  |
|  1 | sysadmin  |
|  9 | xlicensed |
|  9 | xlicensed |
+----+-----------+
5 rows in set (0.00 sec)

Is there a way to apply this setting more permanently? Then I could check it with the tests that helped me discover it.
[25 Feb 2013 8:17] Guilhem Bichot
Hello Bryan. Thanks for having double-checked! Your results are consistent with what I had found, I had tracked it to be a bug in the "semi-join LooseScan" algorithm.
Indeed, when you do
set optimizer_switch='loosescan=off';
it is equivalent to
set session optimizer_switch='loosescan=off';
which affects only the current session. If you want to affect all sessions, you have to possibilities:
1) set global optimizer_switch='loosescan=off';
this will affect all new sessions until the server is restarted.
2) start the server with --optimizer_switch=loosescan=off (or add that to your MySQL server's configuration file, without '--')
then this will affect all sessions. For your testing, this may be the simplest alternative.
More details about SET: http://dev.mysql.com/doc/refman/5.6/en/mysqld-server.html
[5 Mar 2013 12:49] Erlend Dahl
A fix for this will be part of the upcoming 5.6.11 and 5.7.1 releases.
[19 Apr 2013 4:47] Bryan Turner
Verified in the new 5.6.11 release. Thanks for finding and fixing this issue!