Bug #119955 Semantically equivalent scalar expressions yield different result sets in WHERE clause involving RPAD()
Submitted: 27 Feb 5:03 Modified: 2 Mar 12:46
Reporter: Jasper Andrew Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:8.0.43 (MySQL Community Server - GPL) OS:Any (Docker container)
Assigned to: CPU Architecture:Any (x86_64)
Tags: functions, rpad, SQL

[27 Feb 5:03] Jasper Andrew
Description:
When using a deterministic scalar function that always returns a constant empty string, queries involving RPAD() in a WHERE clause may produce different result sets compared to using the equivalent empty string literal.

Although the function is declared DETERMINISTIC and NO SQL, and its return value is constant for all rows, substituting the function call with the empty string literal does not preserve query semantics.

Specifically, the following two predicates are expected to be semantically equivalent, as both evaluate to an empty string for all rows:

c_0 < RPAD("", pkey, c_0)

and

c_0 < RPAD(CAST(__sqlsmith_stub_text() AS CHAR), pkey, c_0)

However, they produce different result sets when executed against the same data.

This indicates that expression evaluation semantics depend on whether the empty string originates from a literal or from a deterministic scalar function, which is observable at the SQL level and affects query results.

-- using literal
SELECT COUNT(*) FROM t_0
WHERE c_0 < RPAD("", pkey, c_0);
-- returns 21

-- using deterministic function
SELECT COUNT(*) FROM t_0
WHERE c_0 < RPAD(CAST(__sqlsmith_stub_text() AS CHAR), pkey, c_0);
-- returns 19

How to repeat:
Build the database environment:
```SQL
CREATE FUNCTION __sqlsmith_stub_int()
RETURNS INT
DETERMINISTIC
NO SQL
RETURN 0;

CREATE FUNCTION __sqlsmith_stub_text()
RETURNS TEXT
DETERMINISTIC
NO SQL
RETURN '';

create table t_0 (
pkey int,
c_0 varchar(255),
c_1 int
,
primary key (pkey)
);

insert into t_0 (pkey, c_0, c_1) values (
3,
__sqlsmith_stub_text(),
28);
insert into t_0 (pkey, c_0, c_1) values (
11,
'V7Kz',
75);

insert into t_0 (pkey, c_0, c_1) values (
12,
'Eg',
23);

insert into t_0 (pkey, c_0, c_1) values (
18,
'A1',
45);

insert into t_0 (pkey, c_0, c_1) values (
20,
'hT',
87);

insert into t_0 (pkey, c_0, c_1) values (
21,
'mC',
30);

insert into t_0 (pkey, c_0, c_1) values (
28,
'SwkL',
27);

insert into t_0 (pkey, c_0, c_1) values (
29,
case when 37.97 >= 27.1 then case when (cast(nullif(false,
          true) as SIGNED) OR cast(nullif(false,
          false) as SIGNED))
      and ('C0' <> REPEAT(
          cast('M4' as CHAR),
          cast(76 as CHAR))) then 'BQ3U' else 'N' end
     else 'dJ' end
  ,
89);

insert into t_0 (pkey, c_0, c_1) values (
33,
'Du',
84);

insert into t_0 (pkey, c_0, c_1) values (
39,
cast(nullif('lFK',
  't1') as CHAR),
2);

insert into t_0 (pkey, c_0, c_1) values (
41,
__sqlsmith_stub_text(),
cast(coalesce(66,
  44) as SIGNED));

insert into t_0 (pkey, c_0, c_1) values (
43,
case when EXISTS (
    select  
        case when ref_0.c_0 < ref_0.c_0 then ref_0.c_1 else ref_0.c_1 end
           as c0,
        ref_0.c_3 as c1,
        66 as c2,
        ref_0.c_0 as c3,
        __sqlsmith_stub_int() as c4,
        (select c_0 from t_1 limit 1 offset 53)
           as c5,
        ref_0.pkey as c6,
        ref_0.c_1 as c7,
        ref_0.c_2 as c8
      from
        t_1 as ref_0
      where ref_0.c_3 is not NULL
      limit 139) then 'Boj8' else 'WY' end
  ,
86);

insert into t_0 (pkey, c_0, c_1) values (
48,
'jvgd',
56);

insert into t_0 (pkey, c_0, c_1) values (
50,
__sqlsmith_stub_text(),
__sqlsmith_stub_int());
insert into t_0 (pkey, c_0, c_1) values (
53,
(select c_0 from t_1 limit 1 offset 5)
  ,
49);

insert into t_0 (pkey, c_0, c_1) values (
56,
'tsz4',
43);

insert into t_0 (pkey, c_0, c_1) values (
62,
(select c_0 from t_1 limit 1 offset 4)
  ,
__sqlsmith_stub_int());

insert into t_0 (pkey, c_0, c_1) values (
66,
'gO',
79);

insert into t_0 (pkey, c_0, c_1) values (
68,
'e',
21);

insert into t_0 (pkey, c_0, c_1) values (
81,
'j',
__sqlsmith_stub_int());

insert into t_0 (pkey, c_0, c_1) values (
84,
'8KgER',
case when 58 >= 75 then 89 else 78 end
  );

insert into t_0 (pkey, c_0, c_1) values (
85,
__sqlsmith_stub_text(),
12);

insert into t_0 (pkey, c_0, c_1) values (
90,
'MxMBv',
93);

insert into t_0 (pkey, c_0, c_1) values (
92,
__sqlsmith_stub_text(),
60);

insert into t_0 (pkey, c_0, c_1) values (
95,
'pbz4',
__sqlsmith_stub_int());

insert into t_0 (pkey, c_0, c_1) values (
99,
__sqlsmith_stub_text(),
77);

insert into t_0 (pkey, c_0, c_1) values (
106,
__sqlsmith_stub_text(),
__sqlsmith_stub_int());

insert into t_0 (pkey, c_0, c_1) values (
110,
'yL',
13);

insert into t_0 (pkey, c_0, c_1) values (
117,
'l',
46);

insert into t_0 (pkey, c_0, c_1) values (
120,
'bS',
84);
```

Execute the SQL:
```SQL
-- Query using empty string literal
SELECT COUNT(*) FROM t_0
WHERE c_0 < RPAD("", pkey, c_0);

-- Query using deterministic scalar function
SELECT COUNT(*) FROM t_0
WHERE c_0 < RPAD(
  CAST(__sqlsmith_stub_text() AS CHAR),
  pkey,
  c_0
);

```

Suggested fix:
Ensure that semantically equivalent scalar expressions produce the same results when used in predicates, regardless of whether the constant value originates from a literal or from a deterministic scalar function.
[2 Mar 12:46] Roy Lyseng
Thank you for the bug report.
Verified as described.