Bug #115283 | result not correct when using ‘<=>’ operator compare null value | ||
---|---|---|---|
Submitted: | 11 Jun 12:00 | Modified: | 11 Jun 12:10 |
Reporter: | ximin liang (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 8.0.37 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[11 Jun 12:00]
ximin liang
[11 Jun 12:01]
ximin liang
Description: Hello mysql team: I found a bug about result not correct when using ‘<=>’ operator compare null value executed in nested loop join. How to repeat: Can be repeated in MySQL 8.4.0 CREATE TABLE `t1` ( `c1` int DEFAULT NULL, `c2` int DEFAULT NULL, KEY `idx` (`c1`)); CREATE TABLE `t2` ( `c1` int DEFAULT NULL, `c2` int DEFAULT NULL, KEY `idx` (`c1`) ); insert into t2(c1,c2) values (1,1),(null,null); insert into t1(c1,c2) values (1,1),(null,null); select /*+no_bnl()*/ `t1`.`c1` AS `c1`,`t1`.`c2` AS `c2`,`t2`.`c1` AS `c1`,`t2`.`c2` AS `c2` from `test`.`t1` inner join `test`.`t2`on(`t2`.`c1`= `t1`.`c1`) and (`t1`.`c1` <=> `t2`.`c1`); +------+------+------+------+ | c1 | c2 | c1 | c2 | +------+------+------+------+ | 1 | 1 | 1 | 1 | | NULL | NULL | NULL | NULL | +------+------+------+------+ 2 rows in set (0.00 sec) if we force hash join: mysql> select /*+bnl()*/ `t1`.`c1` AS `c1`,`t1`.`c2` AS `c2`,`t2`.`c1` AS `c1`,`t2`.`c2` AS `c2` from `test`.`t1` inner join `test`.`t2` on(`t2`.`c1`= `t1`.`c1`) and (`t1`.`c1` <=> `t2`.`c1`); +------+------+------+------+ | c1 | c2 | c1 | c2 | +------+------+------+------+ | 1 | 1 | 1 | 1 | +------+------+------+------+ 1 row in set (0.00 sec) Suggested fix: seems plan is not right, notice filter: t1.c1 <=> t1.c1 mysql> explain format=tree select /*+no_bnl()*/ `t1`.`c1` AS `c1`,`t1`.`c2` AS `c2`,`t2`.`c1` AS `c1`,`t2`.`c2` AS `c2` from `test`.`t1` inner join `test`.`t2` on(`t2`.`c1`= `t1`.`c1`) and (`t1`.`c1` <=> `t2`.`c1`); +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Nested loop inner join (cost=1.15 rows=2) -> Filter: (t1.c1 <=> t1.c1) (cost=0.45 rows=2) -> Table scan on t1 (cost=0.45 rows=2) -> Index lookup on t2 using idx (c1=t1.c1) (cost=0.3 rows=1) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.01 sec) it is about func `substitute_for_best_equal_field`. for cond t1.c1 <=> t2.c1, this func replace t2.c1 with t1.c1 so result miss
[11 Jun 12:10]
MySQL Verification Team
Hello ximin liang, Thank you for the report and test case. Verified as described. regards, Umesh
[19 Jun 8:08]
huahua xu
Hi ximin liang, I think that it is a bug, but I don't agree with your reason for the incorrect result from the query statement execution. After tracked the optimization process of the statement, I found that the optimizer removed the equality(`t2`.`c1`= `t1`.`c1`) which is considered a redundant predicate condition. It is reasonable, because the predicate is guaranteed to be true by use of 'ref' access method.(the form 'field = value' could be redundant if the [ref-] access chosen for the table use an index containing 'field', where 'value' is specified as (part of) its ref-key) throgh see the associated source code, it is suspicious: ``` /* Remove equalities that are guaranteed to be true by use of 'ref' access method. Note that ref access implements "table1.field1 <=> table2.indexed_field2", i.e. if it passed a NULL field1, it will return NULL indexed_field2 if there are. Thus the equality "table1.field1 = table2.indexed_field2", is equivalent to "ref access AND table1.field1 IS NOT NULL" i.e. "ref access and proper setting/testing of ref->null_rejecting". Thus, we must be careful, that when we remove equalities below we also set ref->null_rejecting, and test it at execution; otherwise wrong NULL matches appear. So: - for the optimization phase, the code which is below, and the code in test_if_ref(), and in add_key_field(), must be kept in sync: if the applicability conditions in one place are relaxed, they should also be relaxed elsewhere. - for the execution phase, all possible execution methods must test ref->null_rejecting. */ Item *left_item = func->arguments()[0]->real_item(); Item *right_item = func->arguments()[1]->real_item(); bool redundant = false; if (left_item->type() == Item::FIELD_ITEM && !(left_item->used_tables() & null_extended) && test_if_ref(thd, down_cast<Item_field *>(left_item), right_item, &redundant)) { return true; } if (redundant) { return false; } if (right_item->type() == Item::FIELD_ITEM && !(right_item->used_tables() & null_extended) && test_if_ref(thd, down_cast<Item_field *>(right_item), left_item, &redundant)) { return true; } if (redundant) { return false; } ``` The code comment notes that the optimizer should set ref->null_rejecting to avoid wrong NULL matches appear when removing the redundant equality. But in fact, it didn't do that.
[19 Jun 8:40]
huahua xu
The patch maybe fix the bug
Attachment: set_null_rejecting_when_remove_equality_with_ref_access.patch (application/octet-stream, text), 2.26 KiB.
[31 Oct 10:16]
MySQL Verification Team
This is the original bug report for the following duplicate: https://bugs.mysql.com/bug.php?id=116517