Bug #107821 Result error : SQL with group by with rollup and const field
Submitted: 8 Jul 2022 18:02 Modified: 11 Jul 2022 10:23
Reporter: Ze Yang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0, 8.0.29 OS:Any
Assigned to: CPU Architecture:Any

[8 Jul 2022 18:02] Ze Yang
Description:
The query below return empty set.
But if we first execute the inner subquery, we see that we should not return empty set as there are values satisfy the condition in outer query block.
If we separate execute the query we will see the result is not empty.

```
SELECT layer_1_column_0 >'2011-06-22' AS `layer_0_column_0`
FROM
  (SELECT 'aa' AS `layer_1_column_0`,
          NOT(layer_2_column_1) AS `layer_1_column_1`,
          layer_2_column_2 AS `layer_1_column_2`,
          max(DISTINCT layer_2_column_1) AS `layer_1_column_3`
   FROM
     (SELECT 1 AS `layer_2_column_0`,
             'aa' AS `layer_2_column_1`,
             NULL AS `layer_2_column_2`,
             max((layer_3_column_0 / layer_3_column_0) - layer_3_column_0) AS `layer_2_column_3`
      FROM
        (SELECT 'aa' AS `layer_3_column_0`,
                max(layer_4_column_0 - layer_4_column_2) AS `layer_3_column_1`
         FROM
           (SELECT 'aa' AS `layer_4_column_0`,
                   ((NULL) IS NOT TRUE) AS `layer_4_column_1`,
                   1 AS `layer_4_column_2`,
                   SUM(DISTINCT decimal_test) AS `layer_4_column_3`
            FROM `select_two_secondary_index`
            GROUP BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`
            ORDER BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`,
                     `layer_4_column_3`) AS `layer_3_table`
         GROUP BY `layer_3_column_0`
         ORDER BY `layer_3_column_0`,
                  `layer_3_column_1`) AS `layer_2_table`
      GROUP BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`
      ORDER BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`,
               `layer_2_column_3`) AS `layer_1_table`
   GROUP BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2` WITH ROLLUP
   ORDER BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2`,
            `layer_1_column_3`) AS `layer_0_table`
WHERE ((`layer_1_column_0` ^ YEAR('2008-02-20')) NOT BETWEEN 48 AND 64);
Empty set, 1 warning
```

Execute the inner query:
```
SELECT 'aa' AS `layer_1_column_0`,
          NOT(layer_2_column_1) AS `layer_1_column_1`,
          layer_2_column_2 AS `layer_1_column_2`,
          max(DISTINCT layer_2_column_1) AS `layer_1_column_3`
   FROM
     (SELECT 1 AS `layer_2_column_0`,
             'aa' AS `layer_2_column_1`,
             NULL AS `layer_2_column_2`,
             max((layer_3_column_0 / layer_3_column_0) - layer_3_column_0) AS `layer_2_column_3`
      FROM
        (SELECT 'aa' AS `layer_3_column_0`,
                max(layer_4_column_0 - layer_4_column_2) AS `layer_3_column_1`
         FROM
           (SELECT 'aa' AS `layer_4_column_0`,
                   ((NULL) IS NOT TRUE) AS `layer_4_column_1`,
                   1 AS `layer_4_column_2`,
                   SUM(DISTINCT decimal_test) AS `layer_4_column_3`
            FROM `select_two_secondary_index`
            GROUP BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`
            ORDER BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`,
                     `layer_4_column_3`) AS `layer_3_table`
         GROUP BY `layer_3_column_0`
         ORDER BY `layer_3_column_0`,
                  `layer_3_column_1`) AS `layer_2_table`
      GROUP BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`
      ORDER BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`,
               `layer_2_column_3`) AS `layer_1_table`
   GROUP BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2` WITH ROLLUP
   ORDER BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2`,
            `layer_1_column_3`;

+------------------+------------------+------------------------------------+------------------+
| layer_1_column_0 | layer_1_column_1 | layer_1_column_2                   | layer_1_column_3 |
+------------------+------------------+------------------------------------+------------------+
| NULL             |             NULL | NULL                               | aa               |
| aa               |             NULL | NULL                               | aa               |
| aa               |                1 | NULL                               | aa               |
| aa               |                1 | NULL                               | aa               |
+------------------+------------------+------------------------------------+------------------+
```

```
 select ('aa' ^ YEAR('2008-02-20')) NOT BETWEEN 48 AND 64;
+---------------------------------------------------+
| ('aa' ^ YEAR('2008-02-20')) NOT BETWEEN 48 AND 64 |
+---------------------------------------------------+
|                                                 1 |
+---------------------------------------------------+
```

We also can create tmp table for inner query use 'create table select' with sql_mode equal ''; and then select tmp table for outer query to get final result:

```
> SELECT layer_1_column_0 >'2011-06-22' AS `layer_0_column_0`
  FROM tmp WHERE ((`layer_1_column_0` ^ YEAR('2008-02-20')) NOT BETWEEN 48 AND 64);
+------------------+
| layer_0_column_0 |
+------------------+
|                1 |
|                1 |
|                1 |
+------------------+
3 rows in set, 3 warnings (0.00 sec)
```

How to repeat:
I paste the schema and data in private comment.
The query is:

```
SELECT layer_1_column_0 >'2011-06-22' AS `layer_0_column_0`
FROM
  (SELECT 'aa' AS `layer_1_column_0`,
          NOT(layer_2_column_1) AS `layer_1_column_1`,
          layer_2_column_2 AS `layer_1_column_2`,
          max(DISTINCT layer_2_column_1) AS `layer_1_column_3`
   FROM
     (SELECT 1 AS `layer_2_column_0`,
             'aa' AS `layer_2_column_1`,
             NULL AS `layer_2_column_2`,
             max((layer_3_column_0 / layer_3_column_0) - layer_3_column_0) AS `layer_2_column_3`
      FROM
        (SELECT 'aa' AS `layer_3_column_0`,
                max(layer_4_column_0 - layer_4_column_2) AS `layer_3_column_1`
         FROM
           (SELECT 'aa' AS `layer_4_column_0`,
                   ((NULL) IS NOT TRUE) AS `layer_4_column_1`,
                   1 AS `layer_4_column_2`,
                   SUM(DISTINCT decimal_test) AS `layer_4_column_3`
            FROM `select_two_secondary_index`
            GROUP BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`
            ORDER BY `layer_4_column_0`,
                     `layer_4_column_1`,
                     `layer_4_column_2`,
                     `layer_4_column_3`) AS `layer_3_table`
         GROUP BY `layer_3_column_0`
         ORDER BY `layer_3_column_0`,
                  `layer_3_column_1`) AS `layer_2_table`
      GROUP BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`
      ORDER BY `layer_2_column_0`,
               `layer_2_column_1`,
               `layer_2_column_2`,
               `layer_2_column_3`) AS `layer_1_table`
   GROUP BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2` WITH ROLLUP
   ORDER BY `layer_1_column_0`,
            `layer_1_column_1`,
            `layer_1_column_2`,
            `layer_1_column_3`) AS `layer_0_table`
WHERE ((`layer_1_column_0` ^ YEAR('2008-02-20')) NOT BETWEEN 48 AND 64);
```
[11 Jul 2022 10:23] MySQL Verification Team
Hello Ze Yang,

Thank you for the report and feedback.

regards,
Umesh
[20 Jul 2022 6:36] quan chen
Rollup on derived table that all columns are constants will cause problems too.

```
mysql> SELECT 'aa' AS `layer_1_column_0`,           NOT('aa') AS `layer_1_column_1`,           null AS `layer_1_column_2`,           max(DISTINCT 'aa') AS `layer_1_column_3`    FROM dual GROUP BY `layer_1_column_0`,              `layer_1_column_1`,             `layer_1_column_2` WITH ROLLUP having layer_1_column_0 is not null   ORDER BY `layer_1_column_0`,             `layer_1_column_1`,             `layer_1_column_2`,             `layer_1_column_3`;
+------------------+------------------+------------------+------------------+
| layer_1_column_0 | layer_1_column_1 | layer_1_column_2 | layer_1_column_3 |
+------------------+------------------+------------------+------------------+
| aa               |                1 |             NULL | aa               |
| aa               |                1 |             NULL | aa               |
| aa               |             NULL |             NULL | aa               |
+------------------+------------------+------------------+------------------+
3 rows in set, 1 warning (0.00 sec)
```

```
mysql> select * from (SELECT 'aa' AS `layer_1_column_0`,           NOT('aa') AS `layer_1_column_1`,           null AS `layer_1_column_2`,           max(DISTINCT 'aa') AS `layer_1_column_3`    FROM dual GROUP BY `layer_1_column_0`,              `layer_1_column_1`,             `layer_1_column_2` WITH ROLLUP having layer_1_column_0 is not null   ORDER BY `layer_1_column_0`,             `layer_1_column_1`,             `layer_1_column_2`,             `layer_1_column_3`) AS `layer_0_table`;
+------------------+------------------+------------------+------------------+
| layer_1_column_0 | layer_1_column_1 | layer_1_column_2 | layer_1_column_3 |
+------------------+------------------+------------------+------------------+
| aa               |                1 | NULL             | aa               |
+------------------+------------------+------------------+------------------+
1 row in set, 1 warning (0.01 sec)
```

The reason for this problem is that the estimated_rowcount of derived_unit is incorrectly estimated to be 1, so derived_table is considered as a const table.