Bug #120011 The normal query generated a non-optimal query plan with index.
Submitted: 9 Mar 10:29 Modified: 9 Mar 11:16
Reporter: chi zhang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:8.0.44 OS:Any
Assigned to: CPU Architecture:Any

[9 Mar 10:29] chi zhang
Description:
Hi,

In the following tese case, there are two equivalent query, one is a normal SELECT query, and another is a prepared SELECT query. Typically, because prepared statements contain unknown literals, the query plan generated during their compilation is often suboptimal. As a result, their execution time is generally expected to be longer than that of a normal statement. However, in this example,  its execution time is consistently faster than that of the normal SELECT statement. I run this example multiple times and get the same result. So the normal SELECT may choose a faster query plan.

```
CREATE TABLE t0(c0 LONGTEXT) ;
INSERT INTO t0 (c0)
WITH RECURSIVE seq AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000
)
SELECT MD5(RAND()) FROM seq;
CREATE INDEX i1 ON t0(c0(2) DESC);
SET PROFILING = 1;
SELECT SUM((t0.c0) LIKE (((CAST(-4.77828644E8 AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0;
SET @a = -4.77828644E8;
PREPARE prepare_query FROM 'SELECT ALL SUM((t0.c0) LIKE (((CAST(? AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0';
EXECUTE prepare_query USING @a;
SET PROFILING = 0;
SHOW PROFILES;
Query_ID        Duration        Query
1       0.00232075      SELECT SUM((t0.c0) LIKE (((CAST(-4.77828644E8 AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0
2       0.00007150      SET @a = -4.77828644E8
3       0.00015075      PREPARE prepare_query FROM 'SELECT ALL SUM((t0.c0) LIKE (((CAST(? AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0'
4       0.00077125      EXECUTE prepare_query USING @a

EXPLAIN ANALYZE SELECT SUM((t0.c0) LIKE (((CAST(-4.77828644E8 AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0;
-> Aggregate: sum((t0.c0 like ((cast(-(4.77828644E8) as double) <> 'OL') is true)))  (cost=330 rows=1) (actual time=1.51..1.51 rows=1 loops=1)\n    -> Covering index scan on t0 using i1  (cost=100 rows=999) (actual time=0.0112..1.23 rows=1000 loops=1)\n

SET @a = -4.77828644E8;
PREPARE prepare_query FROM 'EXPLAIN ANALYZE SELECT ALL SUM((t0.c0) LIKE (((CAST(? AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0';
EXECUTE prepare_query USING @a;
-> Aggregate: sum((t0.c0 like ((cast(-477828644 as double) <> 'OL') is true)))  (cost=330 rows=1) (actual time=0.84..0.84 rows=1 loops=1)\n    -> Table scan on t0  (cost=100 rows=999) (actual time=0.0155..0.522 rows=1000 loops=1)\n
```

How to repeat:
CREATE TABLE t0(c0 LONGTEXT) ;
INSERT INTO t0 (c0)
WITH RECURSIVE seq AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000
)
SELECT MD5(RAND()) FROM seq;
CREATE INDEX i1 ON t0(c0(2) DESC);
SET PROFILING = 1;
SELECT SUM((t0.c0) LIKE (((CAST(-4.77828644E8 AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0;
SET @a = -4.77828644E8;
PREPARE prepare_query FROM 'SELECT ALL SUM((t0.c0) LIKE (((CAST(? AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0';
EXECUTE prepare_query USING @a;
SET PROFILING = 0;
SHOW PROFILES;
EXPLAIN ANALYZE SELECT SUM((t0.c0) LIKE (((CAST(-4.77828644E8 AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0;
SET @a = -4.77828644E8;
PREPARE prepare_query FROM 'EXPLAIN ANALYZE SELECT ALL SUM((t0.c0) LIKE (((CAST(? AS DOUBLE)) NOT IN ("OL")) IS TRUE)) AS ref0 FROM t0';
EXECUTE prepare_query USING @a;
[9 Mar 11:16] Roy Lyseng
Thank you for the bug report.
Verified as described.