Bug #86165 Loose index scans returning wrong result
Submitted: 3 May 2017 3:58 Modified: 27 Sep 2017 2:28
Reporter: Andrew nicols Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.6.36, 5.7.18, 8.0.1 OS:Any (Replicated with official docker images)
Assigned to: CPU Architecture:Any
Tags: regression

[3 May 2017 3:58] Andrew nicols
Description:
When performing a COUNT(DISTINCT) on a column that causes a loose index scan with a group-by, the incorrect result is returned.

Any of the following will mean that the correct result is returned:
* swapping the order of the unique key around (idnumber, then relatedid)
* disabling use_index_extensions
* including the first row in the table in the IN or dropping earlier rows until the first row in the table is included

This matches closely with what was described in #83005 and I have been unable to find any other open issues which describe this.

How to repeat:
Sample test case:

-- Create the test dataset.
CREATE DATABASE testing;
CONNECT testing;
DROP TABLE IF EXISTS `example`;
CREATE TABLE `example` (
  `id` bigint(10) NOT NULL AUTO_INCREMENT,
  `idnumber` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
  `relatedid` bigint(10) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';

INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);

-- And performing the following query:
SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);

The expected result is always 1.

-- Default setup with no changes.
mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

-- Incorrect result.

mysql> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
| id | select_type | table   | type  | possible_keys       | key         | key_len | ref  | rows | Extra                                            |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
|  1 | SIMPLE      | example | range | PRIMARY,example_idx | example_idx | 8       | NULL |    5 | Using where; Using index for group-by (scanning) |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
1 row in set (0.00 sec)

-- Include the first row in the IN:
mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181000);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

-- Correct result.

mysql> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181000);
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
| id | select_type | table   | type  | possible_keys       | key         | key_len | ref  | rows | Extra                                            |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
|  1 | SIMPLE      | example | range | PRIMARY,example_idx | example_idx | 8       | NULL |    5 | Using where; Using index for group-by (scanning) |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
1 row in set (0.00 sec)

-- Now disable use_index_extensions:

mysql> SET optimizer_switch='use_index_extensions=off';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

-- Correct result.
mysql> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
| id | select_type | table   | type  | possible_keys       | key         | key_len | ref  | rows | Extra                    |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | example | index | PRIMARY,example_idx | example_idx | 311     | NULL |    4 | Using where; Using index |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

-- Re-enable use_index_extensions and swap order of keys:
mysql> ALTER TABLE example DROP KEY example_idx;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE example ADD CONSTRAINT example_idx UNIQUE (idnumber,relatedid);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

-- Correct result.

mysql> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
| id | select_type | table   | type  | possible_keys       | key         | key_len | ref  | rows | Extra                    |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | example | index | PRIMARY,example_idx | example_idx | 311     | NULL |    4 | Using where; Using index |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

-- Swap the key back.

mysql> ALTER TABLE example DROP INDEX example_idx;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> ALTER TABLE example ADD CONSTRAINT example_idx UNIQUE (relatedid, idnumber);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

-- Now delete the current first row.
mysql> DELETE FROM example WHERE id = 181000;
Query OK, 1 row affected (0.00 sec)

mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181004, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

Verified that this is still an issue on:
* 5.6.36
* 5.7.18
[3 May 2017 4:02] Andrew nicols
Note, I'm also seeing this when I remove the WHERE:

mysql> SELECT COUNT(DISTINCT(relatedid)) FROM example ;
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

mysql> explain SELECT COUNT(DISTINCT(relatedid)) FROM example ;
+----+-------------+---------+-------+---------------+-------------+---------+------+------+-------------------------------------+
| id | select_type | table   | type  | possible_keys | key         | key_len | ref  | rows | Extra                               |
+----+-------------+---------+-------+---------------+-------------+---------+------+------+-------------------------------------+
|  1 | SIMPLE      | example | range | example_idx   | example_idx | 8       | NULL |    6 | Using index for group-by (scanning) |
+----+-------------+---------+-------+---------------+-------------+---------+------+------+-------------------------------------+
1 row in set (0.00 sec)
[3 May 2017 5:34] MySQL Verification Team
Hello Andrew nicols,

Thank you for the report and test case.
Verified as described with 5.6.36/5.7.18 builds.

Thanks,
Umesh
[3 May 2017 5:35] MySQL Verification Team
-- 5.6.36 - affected

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.6.36: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.36 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@localhost [(none)]> use test
Database changed
root@localhost [test]> -- Create the test dataset.
root@localhost [test]> CREATE DATABASE testing;
  `idnumber` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
  `relatedid` bigint(10) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';

INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [test]> CONNECT testing;
Connection id:    2
Current database: testing

root@localhost [testing]> DROP TABLE IF EXISTS `example`;
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@localhost [testing]> CREATE TABLE `example` (
    ->   `id` bigint(10) NOT NULL AUTO_INCREMENT,
    ->   `idnumber` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
    ->   `relatedid` bigint(10) NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';
INSERT INTO example(relatedid) values (185000);Query OK, 0 rows affected (0.00 sec)

root@localhost [testing]>
root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
| id | select_type | table   | type  | possible_keys       | key         | key_len | ref  | rows | Extra                                            |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
|  1 | SIMPLE      | example | range | PRIMARY,example_idx | example_idx | 8       | NULL |    5 | Using where; Using index for group-by (scanning) |
+----+-------------+---------+-------+---------------------+-------------+---------+------+------+--------------------------------------------------+
1 row in set (0.00 sec)

root@localhost [testing]>
[3 May 2017 5:36] MySQL Verification Team
-- 5.7.18 - affected

-- 5.7.18

[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.7.18: bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/86165 --core-file --socket=/tmp/mysql_ushastry.sock --port=3306 --log-error=$PWD/86165/log.err 2>&1 &
[1] 2057
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.7.18:
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.7.18: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.18 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@localhost [(none)]> create database test;
Query OK, 1 row affected (0.00 sec)

root@localhost [(none)]> use test
Database changed
root@localhost [test]> -- Create the test dataset.
root@localhost [test]> CREATE DATABASE testing;
  `relatedid` bigint(10) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';

INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [test]> CONNECT testing;
Connection id:    4
Current database: testing

root@localhost [testing]> DROP TABLE IF EXISTS `example`;
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@localhost [testing]> CREATE TABLE `example` (
    ->   `id` bigint(10) NOT NULL AUTO_INCREMENT,
    ->   `idnumber` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
    ->   `relatedid` bigint(10) NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';
Query OK, 0 rows affected (0.00 sec)

root@localhost [testing]>
root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.01 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys       | key         | key_len | ref  | rows | filtered | Extra                                            |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
|  1 | SIMPLE      | example | NULL       | range | PRIMARY,example_idx | example_idx | 8       | NULL |    5 |    50.00 | Using where; Using index for group-by (scanning) |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181000);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181000);
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys       | key         | key_len | ref  | rows | filtered | Extra                                            |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
|  1 | SIMPLE      | example | NULL       | range | PRIMARY,example_idx | example_idx | 8       | NULL |    5 |    50.00 | Using where; Using index for group-by (scanning) |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

root@localhost [testing]> SET optimizer_switch='use_index_extensions=off';
Query OK, 0 rows affected (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys       | key         | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | example | NULL       | index | PRIMARY,example_idx | example_idx | 311     | NULL |    4 |    50.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------------+-------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

root@localhost [testing]> ALTER TABLE example DROP KEY example_idx;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

root@localhost [testing]> ALTER TABLE example ADD CONSTRAINT example_idx UNIQUE (idnumber,relatedid);
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]>
[3 May 2017 5:36] MySQL Verification Team
-- 5.5.56 seems to be fine

-- 5.5.56
[umshastr@hod03]/export/umesh/server/binaries/GABuilds/mysql-5.5.56: bin/mysql -uroot -S /tmp/mysql_ushastry.sock
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.56-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

root@localhost [(none)]> use test
Database changed
root@localhost [test]> -- Create the test dataset.
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);
INSERT INTO example(relatedid) values (185000);root@localhost [test]> CREATE DATABASE testing;
Query OK, 1 row affected (0.00 sec)

root@localhost [test]> CONNECT testing;
Connection id:    2
Current database: testing

root@localhost [testing]> DROP TABLE IF EXISTS `example`;
Query OK, 0 rows affected, 1 warning (0.00 sec)

root@localhost [testing]> CREATE TABLE `example` (
    ->   `id` bigint(10) NOT NULL AUTO_INCREMENT,
    ->   `idnumber` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
    ->   `relatedid` bigint(10) NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE KEY `example_idx` (`relatedid`,`idnumber`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=181000 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci COMMENT='Example';
Query OK, 0 rows affected (0.00 sec)

root@localhost [testing]>
root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> INSERT INTO example(relatedid) values (185000);
Query OK, 1 row affected (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----------------------------+
| COUNT(DISTINCT(relatedid)) |
+----------------------------+
|                          1 |
+----------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> EXPLAIN SELECT COUNT(DISTINCT(relatedid)) FROM example WHERE id IN (181002, 181001);
+----+-------------+---------+-------+---------------+-------------+---------+------+------+--------------------------+
| id | select_type | table   | type  | possible_keys | key         | key_len | ref  | rows | Extra                    |
+----+-------------+---------+-------+---------------+-------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | example | index | PRIMARY       | example_idx | 311     | NULL |    4 | Using where; Using index |
+----+-------------+---------+-------+---------------+-------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

root@localhost [testing]> select * from example;
+--------+----------+-----------+
| id     | idnumber | relatedid |
+--------+----------+-----------+
| 181000 | NULL     |    185000 |
| 181001 | NULL     |    185000 |
| 181002 | NULL     |    185000 |
| 181003 | NULL     |    185000 |
+--------+----------+-----------+
4 rows in set (0.00 sec)
[3 May 2017 6:00] Andrew nicols
Linking to https://tracker.moodle.org/browse/MDL-58754 where we discovered this issue.
[3 May 2017 6:24] Andrew nicols
Also verified on docker image for 8.0.1.
[27 Sep 2017 2:28] Paul DuBois
Posted by developer:
 
Fixed in 5.6.39, 5.7.21, 8.0.4.

Incorrect results could occur on a table with a unique index when the
optimizer chose a loose index scan even though the unique index had
no index extensions.