Bug #114405 Discrepancy in SQL Results due to Missing 'DISTINCT' Keyword
Submitted: 19 Mar 2024 12:53 Modified: 19 Mar 2024 14:19
Reporter: shan he Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.36 OS:Ubuntu (20.044)
Assigned to: CPU Architecture:Any

[19 Mar 2024 12:53] shan he
Description:

I previously performed reduction for bug #113423, but it was considered not a bug. The results of the two reduced SQL statements should be implication. This bug still occurs in 8.0.36.

--sql1--
SELECT (~`f4`) AS `f3` FROM (SELECT (`f9`) FROM (
    SELECT `col_char(20)_key_signed` AS `f9` FROM `table_3_utf8_undef`) AS `t1` ) AS `t2` NATURAL JOIN (
      SELECT 
      (SUBDATE(_UTF8MB4'2018-05-21', INTERVAL 1 DAY_MICROSECOND)) AS `f4`FROM 
    `table_7_utf8_undef` )  AS `t4`;
result:

+----------------------+
| f3                   |
+----------------------+
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
| 18446723893189315655 |
+----------------------+

--sql2--
SELECT (~`f4`) AS `f3` FROM (SELECT (`f9`)  FROM (
    SELECT `col_char(20)_key_signed` AS `f9`FROM `table_3_utf8_undef`) AS `t1` ) AS `t2` NATURAL JOIN (
      SELECT DISTINCT  
      (SUBDATE(_UTF8MB4'2018-05-21', INTERVAL 1 DAY_MICROSECOND)) AS `f4` FROM 
   `table_7_utf8_undef`  ) AS `t4`;

result:
+----------------------+
| f3                   |
+----------------------+
| 18446744073709549597 |
| 18446744073709549597 |
| 18446744073709549597 |
+----------------------+
In SQL1, there is one less keyword "DISTINCT" compared to SQL2.With the rest of the query statement being the same,  the expected result is that SQL1 contains SQL2. However, the query result shows that SQL2 contains 18446744073709549597, which is not part of SQL1.

Could you please confirm whether the simplification process we applied has been helpful for developers in diagnosing and addressing this issue?

How to repeat:
create table table_3_utf8_undef (
`pk` int primary key,
`col_bigint_undef_signed` bigint  ,
`col_bigint_undef_unsigned` bigint unsigned ,
`col_bigint_key_signed` bigint  ,
`col_bigint_key_unsigned` bigint unsigned ,
`col_float_undef_signed` float  ,
`col_float_undef_unsigned` float unsigned ,
`col_float_key_signed` float  ,
`col_float_key_unsigned` float unsigned ,
`col_double_undef_signed` double  ,
`col_double_undef_unsigned` double unsigned ,
`col_double_key_signed` double  ,
`col_double_key_unsigned` double unsigned ,
`col_decimal(40, 20)_undef_signed` decimal(40, 20)  ,
`col_decimal(40, 20)_undef_unsigned` decimal(40, 20) unsigned ,
`col_decimal(40, 20)_key_signed` decimal(40, 20)  ,
`col_decimal(40, 20)_key_unsigned` decimal(40, 20) unsigned ,
`col_char(20)_undef_signed` char(20)  ,
`col_char(20)_key_signed` char(20)  ,
`col_varchar(20)_undef_signed` varchar(20)  ,
`col_varchar(20)_key_signed` varchar(20)  ,
key (`col_bigint_key_signed`),
key (`col_bigint_key_unsigned`),
key (`col_float_key_signed`),
key (`col_float_key_unsigned`),
key (`col_double_key_signed`),
key (`col_double_key_unsigned`),
key (`col_decimal(40, 20)_key_signed`),
key (`col_decimal(40, 20)_key_unsigned`),
key (`col_char(20)_key_signed`),
key (`col_varchar(20)_key_signed`)
) character set utf8 ;
insert into table_3_utf8_undef values (0,82.1847,1,39.0425,38.1089,-1,1,94.1106,1.009,12.991,19755,-13064,0,1,79.1429,-2,1,"well",'3
','-0','e'),(1,1,20.0078,-9.183,68.1957,1,2,1,0.0001,12.991,2,71.0510,1,-1,2,12.991,12.991,'3	','1','3	','-0'),(2,-2,1,-21247,1.009,2,1.009,0.0001,36.0002,-2,2,-0,0.0001,-2,0.1598,47.1515,1.009,'3	','w','-1','e');
create table table_7_utf8_undef (
`pk` int primary key,
`col_bigint_undef_signed` bigint  ,
`col_bigint_undef_unsigned` bigint unsigned ,
`col_bigint_key_signed` bigint  ,
`col_bigint_key_unsigned` bigint unsigned ,
`col_float_undef_signed` float  ,
`col_float_undef_unsigned` float unsigned ,
`col_float_key_signed` float  ,
`col_float_key_unsigned` float unsigned ,
`col_double_undef_signed` double  ,
`col_double_undef_unsigned` double unsigned ,
`col_double_key_signed` double  ,
`col_double_key_unsigned` double unsigned ,
`col_decimal(40, 20)_undef_signed` decimal(40, 20)  ,
`col_decimal(40, 20)_undef_unsigned` decimal(40, 20) unsigned ,
`col_decimal(40, 20)_key_signed` decimal(40, 20)  ,
`col_decimal(40, 20)_key_unsigned` decimal(40, 20) unsigned ,
`col_char(20)_undef_signed` char(20)  ,
`col_char(20)_key_signed` char(20)  ,
`col_varchar(20)_undef_signed` varchar(20)  ,
`col_varchar(20)_key_signed` varchar(20)  ,
key (`col_bigint_key_signed`),
key (`col_bigint_key_unsigned`),
key (`col_float_key_signed`),
key (`col_float_key_unsigned`),
key (`col_double_key_signed`),
key (`col_double_key_unsigned`),
key (`col_decimal(40, 20)_key_signed`),
key (`col_decimal(40, 20)_key_unsigned`),
key (`col_char(20)_key_signed`),
key (`col_varchar(20)_key_signed`)
) character set utf8 ;
insert into table_7_utf8_undef values (0,-9.183,1,1.1384,2,15.1271,12.991,-2,0.0001,36.1270,79.1819,0.0001,0.0001,3.1387,52.0818,-0,0.0001,'1','3	','0','0'),(1,79,12.991,107,2,-0.0001,0,1.009,1.009,34,1,-1,69.0208,1,2,120,12.991,'3	','-1',"if",'b'),(2,-2,1,-9.183,1,12.991,0.0001,53,12.991,1.009,12.991,12.991,0.0001,-0.0001,12.991,0.0001,2,'3
','p','0','3	'),(3,-0.0001,12.991,1.009,1.009,-9.183,2,0,1,-2,1,2,1,2,1.009,2,12.991,'3
','0','k','0'),(4,1.009,0.0001,-1,12.991,2,47,2,0,12.991,12.991,1.009,0,1.009,1.009,-0.0001,6949,'-1','	3','1','m'),(5,-0,1,0,0,0.0001,28.1237,12.991,0,12.991,12.991,-0,12.991,2,2,2,1.009,'0','	3','0','	3'),(6,45.0855,1,38.1166,1,1.009,80.0284,2,122,0.0001,0,-1,11130,0,1,1,0,"know",'-0','
3','3
');

SELECT (~`f4`) AS `f3` FROM (SELECT (`f9`) FROM (
    SELECT `col_char(20)_key_signed` AS `f9` FROM `table_3_utf8_undef`) AS `t1` ) AS `t2` 
    NATURAL JOIN (SELECT (SUBDATE(_UTF8MB4'2018-05-21', INTERVAL 1 DAY_MICROSECOND)) AS `f4`FROM 
    `table_7_utf8_undef` )  AS `t4`;

SELECT (~`f4`) AS `f3` FROM (SELECT (`f9`)  FROM (
    SELECT `col_char(20)_key_signed` AS `f9`FROM `table_3_utf8_undef`) AS `t1` ) AS `t2` 
    NATURAL JOIN (SELECT DISTINCT  (SUBDATE(_UTF8MB4'2018-05-21', INTERVAL 1 DAY_MICROSECOND)) AS `f4` FROM 
   `table_7_utf8_undef`  ) AS `t4`;
[19 Mar 2024 14:19] MySQL Verification Team
Hi Mr. he,

Thank you very much for your bug report.

However, this is not a bug.

Difference is that with SELECT DISTINCT you get only three rows, so there is a smaller number of rows to join with, hence the outer SELECT has lesser rows to inspect.

Here are the results.

 First the SELECTs from  the two derived tables in the  natural join , one of which is not DISTINCT and the other one is DISTINCT and then your full queries:

---------------------------------
f9
w
3
1
w
3
1
w
3
1
w
3
1
w
3
1
w
3
1
w
3
1
---------------------------------

f9
w
3
1

---------------------------------

f3
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655
18446723893189315655

---------------------------------
f3
18446744073709549597
18446744073709549597
18446744073709549597

---------------------------------

As you can see a difference is only because innermost DISTINCT query makes a derived table with much less rows.

Not a bug.