Description:
Here's the query to demonstrate the bug:
SELECT 1
FROM ( SELECT `user`.`User` FROM `mysql`.`user` AS `user` LEFT JOIN `mysql`.`time_zone` ON false ) AS `t1`
WHERE
(
true OR
`t1`.`User` IN (SELECT 1 where false) )
AND
( true OR true );
in 8.0.21 it returns several rows, in 8.0.23+ (including the latest 8.0.29) this query returns Impossible Where.
The correct result is several rows, because the WHERE condition in the outer query is always true and the outer join in the derived table with always false ON condition should return all rows from the left table.
here's how it's optimized in 8.0.21:
*) the derived table t1 is merged into the outer select
*) because the WHERE condition of the outer query is effectively true, it's optimized away
*) at the end of the optimization the query becomes SELECT 1 FROM `mysql`.`user` AS `user` LEFT JOIN `mysql`.`time_zone` ON false
then it's executed and we see the result
here's how it's optimized in 8.0.23:
*) derived table t1 is merged into the outer select
*) then this bug fix kicks in
commit eadef7cca586a34d27496e69b8785918739b3e49
Author: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
Date: Mon Sep 21 10:09:55 2020 +0530
Bug#31216115: WHEN WHERE CONDITION REFERS TO SUBQUERY AND IS ALWAYS FALSE, SERVER WILL CRASH
and prevents optimizing away `t1`.`User` IN (SELECT 1 where false) because it contains a reference to the merged derived table
*) "( true OR true )" gets rewritten as "(true)", i.e it's an OR cond with a single arg. The condition now looks like following "( true OR `t1`.`User` IN (SELECT 1 where false) ) AND ( true )"
*) then optimizer starts to flatten join, in particular it changes LEFT JOIN from the merged derived table to INNER JOIN because "(true)" represents itself as null rejecting for all tables (see suggested fix) . Once the join is converted to INNER, optimizer adds ON condition to the WHERE clause, i.e adds "false" to the WHERE condition of the outer query. The condition becomes "( true OR `t1`.`User` IN (SELECT 1 where false) ) AND ( true ) AND false"
*) the "AND false" part triggers the Impossible WHERE
How to repeat:
SELECT 1
FROM ( SELECT `user`.`User` FROM `mysql`.`user` AS `user` LEFT JOIN `mysql`.`time_zone` ON false ) AS `t1`
WHERE
(
true OR
`t1`.`User` IN (SELECT 1 where false) )
AND
( true OR true );
Suggested fix:
The actual regression is introduced by this commit
commit 88b391372e74616bc35edef4bcd238dabea13899
Author: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
Date: Thu Sep 27 09:43:25 2018 +0530
WL#9571: Remove trivial conditions before outer to inner join
transformation
but it remained hidden till this commit revealed it
commit eadef7cca586a34d27496e69b8785918739b3e49
Author: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
Date: Mon Sep 21 10:09:55 2020 +0530
Bug#31216115: WHEN WHERE CONDITION REFERS TO SUBQUERY AND IS ALWAYS FALSE, SERVER WILL CRASH
The cause of the bug is that the optimized "(true)" got not_null_table_cache property incorrectly set up. In Item_cond::fix_fields() :
if (remove_condition) {
new_item->fix_fields(thd, ref);
used_tables_cache = 0;
if (func_type == COND_AND_FUNC && ignore_unknown())
not_null_tables_cache = 0;
else
not_null_tables_cache = ~(table_map)0;
li.rewind();
while ((item = li++)) {
Cleanup_after_removal_context ctx(select);
item->walk(&Item::clean_up_after_removal, enum_walk::SUBQUERY_POSTFIX,
pointer_cast<uchar *>(&ctx));
li.remove();
}
Prepared_stmt_arena_holder ps_arena_holder(thd);
list.push_front(new_item);
}
After this code not_null_tables_cache for OR will be left as 0xFFFFFFFF.
Over the course of Item_cond::fix_fields() code amends not_null_tables_cache with values from arg list thus limiting it to actual not null tables, but here at the end OR will still contain one element which isn't taken into account to calculate correct value of not_null_tables_cache.
The fix should be pretty obvious now.