MySQL Bugs: #119861: Inconsistent query results when using <=> to compare row data with empty set subquery result

Bug #119861	Inconsistent query results when using <=> to compare row data with empty set subquery result
Submitted:	9 Feb 9:01	Modified:	16 Feb 8:35
Reporter:	Xingyu Yang (OCA)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	8.0.44	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
When using the <=> (null-safe equal) operator to compare a row of data with the result of a single-row subquery, if the subquery returns an Empty set, the final query result becomes unstable and inconsistent (returns unexpected rows in some cases), even though the subquery result remains empty throughout.

How to repeat:
-- 1. Create test tables and insert initial data:
-- sql
create table t1(c1 int default null, c2 int default null);
create table t2(c1 int default null, c2 int default null);
insert into t1 values (1,null);
insert into t2 values (1,1);

-- 2. Verify the subquery returns an empty set:
-- sql
select c1,c2 from t2 where c1=4 order by c1,c2 limit 1;
-- Result: Empty set (0.01 sec)

-- 3. Execute the main query with <=> comparison against the empty subquery:
-- sql
select * from t1 where (c1,c2) <=> (select c1,c2 from t2 where c1=4 order by c1,c2 limit 1);
-- Result: Empty set (0.01 sec) (expected)

-- 4. Insert additional data into t2 (does not affect the subquery result):
-- sql
insert into t2 values (2,null);

-- 5. Re-verify the subquery still returns empty set:
-- sql
select c1,c2 from t2 where c1=4 order by c1,c2 limit 1;
-- Result: Empty set (0.01 sec) (same as before)

-- 6. Re-execute the main query:
-- sql
select * from t1 where (c1,c2) <=> (select c1,c2 from t2 where c1=4 order by c1,c2 limit 1);
-- Result: Returns row (1, null) (unexpected)

Suggested fix:
Root Cause

The root cause lies in the compare_pair_for_nulls() function used for row comparison with <=>:
When comparing row data (item->result_type()==ROW_RESULT), the function checks each column individually, and considers NULL = NULL as a match (per <=> semantics).
Even though the subquery returns an Empty set, it still executes a table scan. The last row's values remain "residued" in the field buffer. After inserting (2, null) into t2, this residual row (with c2 = NULL) is read during the comparison. Since t1.c2 and the residual t2.c2 are both NULL, the <=> comparison succeeds, leading to the unexpected row being returned.

Fix Solution

Reset the values of all columns in the single-row subquery result when the subquery returns an empty set, to eliminate residual data in the field buffer. The fix modifies sql/item_subselect.cc to explicitly store NULL in all cached columns of the subquery result when the subquery is a single-row type and returns NULL (empty set):

diff --git a/sql/item_subselect.cc b/sql/item_subselect.cc
index e4466735d8c..c39c0bbdbaf 100644
--- a/sql/item_subselect.cc
+++ b/sql/item_subselect.cc
@@ -3072,6 +3072,13 @@ bool SubqueryWithResult::exec(THD *thd) {
   char const *save_where = thd->where;
   const bool res = unit->execute(thd);
   thd->where = save_where;
+  if (item->substype() == Item_subselect::SINGLEROW_SUBS && item->null_value) {
+    Item_singlerow_subselect *item_singlerow =
+        down_cast<Item_singlerow_subselect *>(this->item);
+    for (uint i = 0; i < item_singlerow->cols(); ++i) {
+      down_cast<Item_cache *>(item_singlerow->element_index(i))->store_null();
+    }
+  }
   return res;
 }

Reset the values of all columns in the single-row subquery result when the subquery returns an empty set, (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-bugfix-Inconsistent-query-results-when-using-to-comp.patch (application/octet-stream, text), 1.02 KiB.

Hi  Xingyu Yang,

Thank you for the test case and the contribution. Verified the bug as described.