Bug #99150 The IN operator malfunctions for floating-poing numbers
Submitted: 1 Apr 2020 11:35 Modified: 1 Apr 2020 12:47
Reporter: Manuel Rigger Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.19 OS:Any
Assigned to: CPU Architecture:Any

[1 Apr 2020 11:35] Manuel Rigger
Description:
Consider the following statements:

How to repeat:
CREATE TABLE t0(c0 FLOAT UNIQUE) ;
INSERT INTO t0(c0) VALUES (296088496), (0), (2080822651);
SELECT * FROM t0 WHERE t0.c0 NOT IN (622120376, -1.421821436E9, 296088496); -- expected: {0, 296089000, 2080820000}, actual: {0, 296089000}

Unexpectedly, the SELECT fetches only two rows. The predicate should evaluate to TRUE for all values, and thus fetch 3 rows, which can be verified with the following negated query, which does not fetch any rows:

SELECT * FROM t0 WHERE t0.c0 IN (622120376, -1.421821436E9, 296088496); -- {}

Note that although 296088496 is inserted into the table, the number is converted 296089000, which seems somewhat expected, since c0 is declared as a FLOAT.

The bug can only be triggered in special circumstances. Not only is the UNIQUE constraint necessary, but also the magnitude of the numbers play a role.

I found this bug based on MySQL 8.0.19.
[1 Apr 2020 11:47] MySQL Verification Team
Thank you for the bug report. This is known issue and explained why in the manual:

https://dev.mysql.com/doc/refman/8.0/en/problems-with-float.html
[1 Apr 2020 11:58] Manuel Rigger
Thank you for the link! I'm aware of the general limitations of floating-point numbers. However, in my understanding, an INDEX should never influence a query's result. Is it expected or described somewhere that this is not the case for floating-point numbers? In the other DBMS that I tested, such cases were regarded as bugs (e.g., see https://www.sqlite.org/src/tktview?name=2841e99d10 for a bug in SQLite).
[1 Apr 2020 12:15] MySQL Verification Team
Thank you for the feedback, below Oracle result same as MySQL, checked with Postgres same result as well:

Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL> CREATE TABLE t0(c0 FLOAT UNIQUE) ;

Table created.

SQL> INSERT INTO t0(c0) VALUES (296088496);

1 row created.

SQL> INSERT INTO t0(c0) VALUES (0);

1 row created.

SQL> INSERT INTO t0(c0) VALUES (2080822651);

1 row created.

SQL> SELECT * FROM t0 WHERE t0.c0 NOT IN (622120376, -1.421821436E9, 296088496);

        C0
----------
         0
2080822651

SQL>
[1 Apr 2020 12:47] Manuel Rigger
I checked with the latest version of PostgreSQL (based on commit 4d276ba94fd9b19457aeb5b6d9af00589fe184a0), and you are correct, the result of the first query is the same. However, from my perspective, the important difference is that PostgreSQL also produces a consistent result even when dropping the UNIQUE constraint. Furthermore, PostgreSQL fetches a row for the second query, so that the query with the non-negated predicate, and another one with the negated predicate in sum fetch all rows. Consequently, I still tend to believe that this might be a bug in MySQL. The bug might be though, that the case without the UNIQUE constraint does not work correctly.

postgres=# CREATE TABLE t0(c0 FLOAT UNIQUE) ;
CREATE TABLE
postgres=# INSERT INTO t0(c0) VALUES (296088496), (0), (2080822651);
INSERT 0 3
postgres=# SELECT * FROM t0 WHERE t0.c0 NOT IN (622120376, -1.421821436E9, 296088496); -- expected: {0, 296089000, 2080820000}, actual: {0, 296089000}
     c0     
------------
          0
 2080822651
(2 rows)

postgres=# SELECT * FROM t0 WHERE t0.c0 IN (622120376, -1.421821436E9, 296088496); -- {}
    c0     
-----------
 296088496
(1 row)

postgres=# DROP TABLE t0;
DROP TABLE
postgres=# CREATE TABLE t0(c0 FLOAT) ;
CREATE TABLE
postgres=# INSERT INTO t0(c0) VALUES (296088496), (0), (2080822651);
INSERT 0 3
postgres=# SELECT * FROM t0 WHERE t0.c0 NOT IN (622120376, -1.421821436E9, 296088496); -- expected: {0, 296089000, 2080820000}, actual: {0, 296089000}
     c0     
------------
          0
 2080822651
(2 rows)

postgres=# SELECT * FROM t0 WHERE t0.c0 IN (622120376, -1.421821436E9, 296088496); -- {}
    c0     
-----------
 296088496
(1 row)
[7 Jul 2020 9:46] Lukas Eder
Please consider reopening this bug, which is not about whether the result is correct or not, but about the fact that an index or unique constraint should never influence the result, similar to this one: https://bugs.mysql.com/bug.php?id=99182

Perhaps the two issues have the same underlying cause.