Bug #53485 Subquery with MIN/MAX aggregate incorrectly optimized away -> Incorrect result
Submitted: 7 May 2010 12:59 Modified: 5 May 2018 9:59
Reporter: Ole John Aske Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1.44, 5.5.5-m3 OS:Any
Assigned to: CPU Architecture:Any

[7 May 2010 12:59] Ole John Aske
Description:
The optimizer will incorrectly 'optimized away' a subquery selecting MIN/MAX even if the subquery is correlated by (or: 'depends on') an outer referrence. In the example:

SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subqry FROM T1;

The optimizer pre-evaluates the term 'SELECT MIN(42) FROM T2' to the answer 42 and
replace the entire subquery with const_table(42) which is returned for all referrences to this subquery. The correlation predicate 'WHERE T2.pk = T1.pk' is never checked during query execution. If this predicate was executed, it would not have qualify any rows from the subquery and it should have evaluated to NULL.

How to repeat:
CREATE TABLE T1(PK INT PRIMARY KEY) engine = myISAM;
CREATE TABLE T2(PK INT PRIMARY KEY) engine = myISAM;

INSERT INTO T1 VALUES (1);
INSERT INTO T2 VALUES (10);

explain
SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subquery FROM T1;

+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+
| id | select_type        | table | type   | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+
|  1 | PRIMARY            | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                              |
|  2 | DEPENDENT SUBQUERY | NULL  | NULL   | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+

mysql> SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subqry FROM T1;
+---------------+
| scalar_subqry |
+---------------+
|            42 |
+---------------+
1 row in set (0.00 sec)

Suggested fix:
Either:

1) Dissable const table optimization for subqueries which are correlated through an outer referrence.

2) Extend join_read_const() (and also join_read_system()?) to also evaluate an optional condition before either returning the const record, or STATUS_NOT_FOUND.
[7 May 2010 13:04] Ole John Aske
This is similar to bug#52051, where min/max is also incorrectly optimized away and replaced with a const-table.

However, bug#52051 is related to OUTER JOIN where predicate dependencies are not correctly handled, and is therefore not a duplicate. - It might be that the patch for 52051 may be enhanced to also fix this problem.
[7 May 2010 13:31] Valeriy Kravchuk
Verified just as described, also with current mysql-trunk from bzr:

77-52-4-109:trunk openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 22
Server version: 5.5.5-m3-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE T1(PK INT PRIMARY KEY) engine = myISAM;
Query OK, 0 rows affected (0.05 sec)

mysql> CREATE TABLE T2(PK INT PRIMARY KEY) engine = myISAM;
Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO T1 VALUES (1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO T2 VALUES (10);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subqry FROM T1;
+---------------+
| scalar_subqry |
+---------------+
|            42 |
+---------------+
1 row in set (0.00 sec)

mysql> explain SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subqry FROM T1;
+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+
| id | select_type        | table | type   | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+
|  1 | PRIMARY            | T1    | system | NULL          | NULL | NULL    | NULL |    1 |                              |
|  2 | DEPENDENT SUBQUERY | NULL  | NULL   | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+--------------------+-------+--------+---------------+------+---------+------+------+------------------------------+
2 rows in set (0.00 sec)
[2 Apr 2012 22:45] David Conger-Eldeen
We have a trigger that will hang on this bug. We are running MySQL 5.1.49_sp1. Please let me know if there is going to be a patch for this or if there is one in a later release of MySQL.

The query in the trigger that causes the statement to "hang" with a state of "Optimizing" is:

SELECT omits_iteration
      ,omits_sql_ready_date
INTO  @o_iter
     ,@o_srd
 FROM division_request_select_omit_sql_log
WHERE request_id = new.request_id
  AND subsystem_id = 4
  AND omits_iteration = (
       SELECT MAX(omits_iteration)
         FROM division_request_select_omit_sql_log
        WHERE request_id = new.request_id
          AND subsystem_id = 4
       )
;

The EXPLAIN for this is:

id|select_type|table                               |type |possible_keys|key	   |key_len|ref              |rows|Extra
1 |PRIMARY    |division_request_select_omit_sql_log|const|PRIMARY	   |PRIMARY|9      |const,const,const|1	  |
2 |SUBQUERY   |\N                                  |\N   |\N           |\N     |\N     |\N               |\N  |Select tables optimized away
[7 May 2013 9:02] Hartmut Holzgraefe
Still reproducible in 5.5.30, seems to be fixed in 5.6.10 though ...
[6 Feb 2018 19:17] Sveta Smirnova
Seems to be fixed in 5.7:

mysql> CREATE TABLE T1(PK INT PRIMARY KEY) engine = myISAM;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE T2(PK INT PRIMARY KEY) engine = myISAM;
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> INSERT INTO T1 VALUES (1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO T2 VALUES (10);
Query OK, 1 row affected (0.02 sec)

mysql> SELECT (SELECT MIN(42) FROM T2 WHERE T2.pk = T1.pk ) AS scalar_subquery FROM T1;
+-----------------+
| scalar_subquery |
+-----------------+
|            NULL |
+-----------------+
1 row in set (0.00 sec)
[5 May 2018 9:59] Jon Stephens
Documented fix as follows in the MySQL 5.6.40 changelog:

    The optimizer incorrectly optimized away a subquery selecting a
    MIN() or MAX() expression even when the subquery was correlated
    with or depended on an outer reference. An example of such a
    query is shown here:

  SELECT (
    SELECT MIN(42) FROM t2 WHERE t2.pk = t1.pk
    ) AS scalar_subqry 
    FROM t1;

    In this case, the optimizer reduced SELECT MIN(42) FROM T2 to a
    constant value (42), and the correlation predicate WHERE T2.pk =
    T1.pk was never checked during query execution. (If the
    predicate were executed, it would not have qualifed any rows
    from the subquery, and would have evaluated as NULL.)

Closed.