Bug #119029 Two missed optimization opportunities of "TRUE OR any_expr"
Submitted: 18 Sep 0:40
Reporter: jinhui lai Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server Severity:S5 (Performance)
Version:9.4.0 OS:Any
Assigned to: CPU Architecture:Any

[18 Sep 0:40] jinhui lai
Description:
Since "TRUE OR any_expr" is TRUE. MySQL can apply short-circuit evaluation to it. When the true expression is "1", "TRUE", or "2>1", etc., the optimizer can rewrite it to "TRUE" directly. In another scenario, the executor can short-circuit when determining that one expression is true after scanning one table, and skip scanning another table.

How to repeat:
-- Create table t1 with 1 row and t2 with 10,000,000 rows
CREATE TABLE t1(c0 INT8);
CREATE TABLE t2(c0 INT8);

INSERT INTO t1 VALUES(1);

DELIMITER //
CREATE PROCEDURE batch_insert_numbers()
BEGIN
    DECLARE i INT DEFAULT 1;
    DECLARE batch_size INT DEFAULT 1000;
    
    WHILE i <= 10000000 DO
        START TRANSACTION;
        WHILE i <= 10000000 AND batch_size > 0 DO
            INSERT INTO t2(c0) VALUES (i);
            SET i = i + 1;
            SET batch_size = batch_size - 1;
        END WHILE;
        COMMIT;
        SET batch_size = 1000;
    END WHILE;
END //
DELIMITER ;

CALL batch_insert_numbers();

-- The optimizer can rewrite such a SQL statement to "SELECT TRUE" directly
SELECT (SELECT MAX(t2.c0) FROM t2)<1 OR TRUE;
+---------------------------------------+
| (SELECT MAX(t2.c0) FROM t2)<1 OR TRUE |
+---------------------------------------+
|                                     1 |
+---------------------------------------+
1 row in set (3.298 sec)

explain SELECT (SELECT MAX(t2.c0) FROM t2)<1 OR TRUE;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | PRIMARY     | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL |    NULL |     NULL | No tables used |
|  2 | SUBQUERY    | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9738355 |   100.00 | NULL           |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
2 rows in set, 1 warning (0.001 sec)

-- The executor can short-circuit when determining that one expression is true after scanning one table, and skip scanning another table.
SELECT (SELECT MAX(t2.c0) FROM t2)<1 OR (SELECT MAX(t1.c0) FROM t1)>=1;
+-----------------------------------------------------------------+
| (SELECT MAX(t2.c0) FROM t2)<1 OR (SELECT MAX(t1.c0) FROM t1)>=1 |
+-----------------------------------------------------------------+
|                                                               1 |
+-----------------------------------------------------------------+
1 row in set (3.023 sec)

explain SELECT (SELECT MAX(t2.c0) FROM t2)<1 OR (SELECT MAX(t1.c0) FROM t1)>=1;
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
|  1 | PRIMARY     | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL |    NULL |     NULL | No tables used |
|  3 | SUBQUERY    | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |       1 |   100.00 | NULL           |
|  2 | SUBQUERY    | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9738355 |   100.00 | NULL           |
+----+-------------+-------+------------+------+---------------+------+---------+------+---------+----------+----------------+
3 rows in set, 1 warning (0.001 sec)