Bug #106444 ESCAPE keyword with non-string values
Submitted: 11 Feb 2022 21:42 Modified: 26 Feb 2022 1:23
Reporter: Yu Liang Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.27, 8.0.28, 5.7.37 OS:Ubuntu (Ubuntu 20.04.3 LTS)
Assigned to: CPU Architecture:x86 (Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz)
Tags: escape

[11 Feb 2022 21:42] Yu Liang
Description:
For the following query: 

mysql> CREATE TABLE v0 ( c1 TEXT NOT NULL );
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO v0 VALUES ('100');
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE '%';
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE '%') FROM v0;
+-------------------------+
| (c1 LIKE c1 ESCAPE '%') |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE TRUE;
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE TRUE) FROM v0;
+--------------------------+
| (c1 LIKE c1 ESCAPE TRUE) |
+--------------------------+
|                        0 |
+--------------------------+
1 row in set (0.00 sec)

-- Unexpected, the LIKE comparison result mismatched from the previous one. 

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE 1;
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE 1) FROM v0;
+-----------------------+
| (c1 LIKE c1 ESCAPE 1) |
+-----------------------+
|                     0 |
+-----------------------+
1 row in set (0.00 sec)

-- Unexpected, the LIKE comparison result mismatched from the previous one. 

-----------------------------------

From the query above, we create a table v0 with NOT NULL TEXT column c1. We insert value '100' into the table and then use SELECTs to access the contents. We know that LIKE ... ESCAPE ... keywords are used to match query patterns and the ESCAPE keyword takes string character as input. As demonstrated in the first and second SELECT statements, even if c1 doesn't have the special character '%', the comparison of c1 LIKE c1 is successfully and the value of c1 is returned. However, If we use non-string values in the ESCAPE, for example, using Boolean (the third/fourth SELECTs) or Integer (the fifth/sixth SELECTs), there are no errors reported and the results returned from the SELECT or WHERE clauses mismatched. This is unexpected, as we actually assume there will be error messages reported if we pass in non-string values in ESCAPE. Even if it accepts non-string values, the ESCAPE shouldn't affect the results from the LIKE comparison. 

We are not sure whether it is an expected behavior for ESCAPE to accept a non-string value. And we cannot find documents that explain this behavior. Thus, we report the interesting query here and look forward to some explanation of this query.  :-)

How to repeat:
In MySQL Server 8.0.27, use the following query commands:

mysql> CREATE TABLE v0 ( c1 TEXT NOT NULL );
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO v0 VALUES ('100');
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE '%';
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE '%') FROM v0;
+-------------------------+
| (c1 LIKE c1 ESCAPE '%') |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE TRUE;
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE TRUE) FROM v0;
+--------------------------+
| (c1 LIKE c1 ESCAPE TRUE) |
+--------------------------+
|                        0 |
+--------------------------+
1 row in set (0.00 sec)

-- Unexpected, the LIKE comparison result mismatched from the previous one. 

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE 1;
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE 1) FROM v0;
+-----------------------+
| (c1 LIKE c1 ESCAPE 1) |
+-----------------------+
|                     0 |
+-----------------------+
1 row in set (0.00 sec)

-- Unexpected, the LIKE comparison result mismatched from the previous one. 

Suggested fix:
As we expect Errors from the ESCAPE if we pass in non-string values:

mysql> CREATE TABLE v0 ( c1 TEXT NOT NULL );
mysql> INSERT INTO v0 VALUES ('100');
mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE '%';
+-----+
| c1  |
+-----+
| 100 |
+-----+
1 row in set (0.00 sec)

mysql> SELECT (c1 LIKE c1 ESCAPE '%') FROM v0;
+-------------------------+
| (c1 LIKE c1 ESCAPE '%') |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE TRUE;
ERROR 1210 (HY000): Incorrect arguments to ESCAPE

mysql> SELECT (c1 LIKE c1 ESCAPE TRUE) FROM v0;
ERROR 1210 (HY000): Incorrect arguments to ESCAPE

mysql> SELECT * FROM v0 WHERE c1 LIKE c1 ESCAPE 1;
ERROR 1210 (HY000): Incorrect arguments to ESCAPE

mysql> SELECT (c1 LIKE c1 ESCAPE 1) FROM v0;
ERROR 1210 (HY000): Incorrect arguments to ESCAPE
[12 Feb 2022 5:49] MySQL Verification Team
Hello Yu Liang,

Thank you for the report and test case.

regards,
Umesh
[14 Feb 2022 13:52] Roy Lyseng
With a few exceptions, MySQL allows numeric expressions even when a character string is expected. But the expression is first converted into a string, thus ESCAPE 1 is equivalent to ESCAPE '1', and since TRUE is the equivalent of 1,
even ESCAPE TRUE is equivalent to ESCAPE '1'.

The real problem here is a shortcut taken during optimization of LIKE predicates in the WHERE clause.
[3 Mar 2022 21:39] Jon Stephens
Documented fix as follows in the MySQL 8.0.30 changelog:

    Previously, it was assumed that, when the same non-nullable
    expression was used as both the first and second arguments to
    LIKE, the result was always true, and so could be optimized
    away. This assumption turns out not to be valid, due to the fact
    that LIKE treats the backslash (\) as an escape character, even
    when ESCAPE is not specified. This led to different results when
    the condition was used in the SELECT list as opposed to the
    WHERE clause. To fix the problem, we no longer perform this
    optimization with LIKE, with or without an ESCAPE clause.

Closed.