Bug #115907 Incorrect result when condition use concat function
Submitted: 23 Aug 2024 5:19 Modified: 26 Aug 2024 2:36
Reporter: jianhua sun Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.7,8.0,8.4 OS:Any
Assigned to: CPU Architecture:Any
Tags: concat, incorrect, join

[23 Aug 2024 5:19] jianhua sun
Description:
When performing a join between two tables and using a CONCAT function to concatenate two integer fields in the equality condition, the query results are not only unstable but also incorrect. Here are the specific observations:

Query q1: Before executing q9, the result of q1 is incorrect. However, after inserting a new row with q9, the result of q1 becomes correct.
Query q2: Compared to q1, the difference is that the columns involved in the CONCAT function do not have indexes. Regardless of whether q9 is executed, the result of q2 remains incorrect.
Query q3: Compared to q1, the difference lies in removing quotes from one of the conditions in the WHERE clause, allowing implicit type comparison between strings and integers. Regardless of whether q9 is executed, the result of q3 remains correct.
Query q4: Compared to q2, the difference is the removal of quotes from one of the conditions in the WHERE clause, allowing implicit type comparison between strings and integers. Regardless of whether q9 is executed, the result of q4 remains correct.
Query q5: Compared to q1, the difference is pushing down a constant condition into the CONCAT function. Before executing q9, the result of q5 is incorrect. After inserting a new row with q9, the result of q5 becomes correct.
Query q6: Compared to q2, the difference is pushing down a constant condition into the CONCAT function. Regardless of whether q9 is executed, the result of q6 remains incorrect.
Query q7: Compared to q3, the difference is pushing down a constant condition into the CONCAT function. Regardless of whether q9 is executed, the result of q7 remains correct.
Query q8: Compared to q4, the difference is pushing down a constant condition into the CONCAT function. Regardless of whether q9 is executed, the result of q8 remains correct.

All versions of MySQL can reproduce this issue.

How to repeat:

drop table t1;
CREATE TABLE `t1` (
  `c1` bigint unsigned NOT NULL ,
  `c2` bigint NOT NULL ,
  `c3` bigint NOT NULL ,
  PRIMARY KEY (`c1`),
  KEY `key_c2` (`c2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;

insert into t1(c1, c2, c3) values (10009700001000000, 100097, 100097);
insert into t1(c1, c2, c3) values (10009700001000001, 100097, 100097);
insert into t1(c1, c2, c3) values (10009700001000002, 100097, 100097);
insert into t1(c1, c2, c3) values (10009700001000003, 100097, 100097);
insert into t1(c1, c2, c3) values (10009700001000004, 100097, 100097);

drop table t2;
CREATE TABLE `t2` (
  `c1` varchar(40) NOT NULL ,
  `c2` bigint NOT NULL,
  `c3` bigint unsigned NOT NULL,
  PRIMARY KEY (`c1`,`c2`,`c3`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;

insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000000);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000001);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000002);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000003);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000004);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000005);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000006);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000007);
insert into t2(c1, c2, c3) values (20240820003914360, 437817, 1000008);

/*Q1 t1.c2 with index, t2.c1 equal to string */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c2, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q2 t1.c3 without index, t2.c1 equal to string */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c3, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817;

/*Q3 t1.c2 with index, t2.c1 equal to integer */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c2, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817;

/*Q4 t1.c3 without index, t2.c1 equal to integer */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c3, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817; 

/*Q5 t1.c2 with index, t2.c1 equal to string,  100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q6 t1.c3 without index, t2.c1 equal to string,  100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q7 t1.c2 with index, t2.c1 equal to integer, 100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817; 

/*Q8 t1.c3 without index, t2.c1 equal to integer, 100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817; 

/*Q9 insert one rows into t1*/
insert into t1(c1, c2, c3) values (10009700001000005, 100097, 100097);

/*Q1 t1.c2 with index, t2.c1 equal to string */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c2, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q2 t1.c3 without index, t2.c1 equal to string */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c3, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817;

/*Q3 t1.c2 with index, t2.c1 equal to integer */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c2, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817;

/*Q4 t1.c3 without index, t2.c1 equal to integer */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(t1.c3, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817; 

/*Q5 t1.c2 with index, t2.c1 equal to string,  100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q6 t1.c3 without index, t2.c1 equal to string,  100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = '20240820003914360'
        AND t2.c2 = 437817; 

/*Q7 t1.c2 with index, t2.c1 equal to integer, 100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c2 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817; 

/*Q8 t1.c3 without index, t2.c1 equal to integer, 100097 pushdown to concat */
SELECT t1.*, t2.*
FROM t1
JOIN t2
    ON t1.c1 = concat(100097, '0000', t2.c3)
WHERE t1.c3 = 100097
        AND t2.c1 = 20240820003914360
        AND t2.c2 = 437817;
[23 Aug 2024 9:02] MySQL Verification Team
Hi Mr. sun,

Thank you for your bug report.

However, we do not understand why do you consider this a bug.

SQL Standard is quite clear on the join condition. ON condition is there to defined a common column. Hence, if you do not join two columns , from the same domain, that is not a join. Then, that ON statement is converted to WHERE statement.

In that case, you do not have a JOIN, but a Cartesian product with filtered results.

Not a bug.
[26 Aug 2024 2:36] jianhua sun
But, even if MySQL does not treat the join as one with a join condition, the result of executing this cartesian product join is still incorrect. 

mysql> /*Q1' t1.c2 with index, t2.c1 equal to string */
mysql> SELECT t1.*, t2.*
    -> FROM t1, t2
    -> WHERE t1.c2 = 100097
    -> AND t1.c1 = concat(t1.c2, '0000', t2.c3)
    ->         AND t2.c1 = '20240820003914360'
    ->         AND t2.c2 = 437817;
+-------------------+--------+--------+-------------------+--------+---------+
| c1                | c2     | c3     | c1                | c2     | c3      |
+-------------------+--------+--------+-------------------+--------+---------+
| 10009700001000000 | 100097 | 100097 | 20240820003914360 | 437817 | 1000000 |
| 10009700001000000 | 100097 | 100097 | 20240820003914360 | 437817 | 1000001 |
| 10009700001000001 | 100097 | 100097 | 20240820003914360 | 437817 | 1000000 |
| 10009700001000001 | 100097 | 100097 | 20240820003914360 | 437817 | 1000001 |
| 10009700001000002 | 100097 | 100097 | 20240820003914360 | 437817 | 1000002 |
| 10009700001000003 | 100097 | 100097 | 20240820003914360 | 437817 | 1000003 |
| 10009700001000003 | 100097 | 100097 | 20240820003914360 | 437817 | 1000004 |
| 10009700001000003 | 100097 | 100097 | 20240820003914360 | 437817 | 1000005 |
| 10009700001000004 | 100097 | 100097 | 20240820003914360 | 437817 | 1000003 |
| 10009700001000004 | 100097 | 100097 | 20240820003914360 | 437817 | 1000004 |
| 10009700001000004 | 100097 | 100097 | 20240820003914360 | 437817 | 1000005 |
+-------------------+--------+--------+-------------------+--------+---------+
11 rows in set (0.00 sec)

mysql>
mysql>
mysql> /*Q5' t1.c2 with index, t2.c1 equal to string,  100097 pushdown to concat */
mysql> SELECT t1.*, t2.*
    -> FROM t1, t2
    -> WHERE t1.c2 = 100097
    ->     AND t1.c1 = concat(100097, '0000', t2.c3)
    ->         AND t2.c1 = '20240820003914360'
    ->         AND t2.c2 = 437817;
+-------------------+--------+--------+-------------------+--------+---------+
| c1                | c2     | c3     | c1                | c2     | c3      |
+-------------------+--------+--------+-------------------+--------+---------+
| 10009700001000000 | 100097 | 100097 | 20240820003914360 | 437817 | 1000000 |
| 10009700001000001 | 100097 | 100097 | 20240820003914360 | 437817 | 1000001 |
| 10009700001000002 | 100097 | 100097 | 20240820003914360 | 437817 | 1000002 |
| 10009700001000003 | 100097 | 100097 | 20240820003914360 | 437817 | 1000003 |
| 10009700001000004 | 100097 | 100097 | 20240820003914360 | 437817 | 1000004 |
+-------------------+--------+--------+-------------------+--------+---------+
5 rows in set (0.01 sec)
[26 Aug 2024 7:00] Roy Lyseng
I am a bit confused over what the incorrect results you see are.
However, I wonder if the problems is due to the join condition's operands
being cast to a DOUBLE. Can you try to cast the CONCAT result into an UNSIGNED value, or cast t1.c1 into a CHAR(17)?
[16 Sep 2024 9:12] MySQL Verification Team
Thank you , Roy.