Bug #114056 Performance degredation using REGEXP in statement with parameters vs query
Submitted: 19 Feb 2024 23:03 Modified: 22 Feb 2024 10:46
Reporter: Aaron Murray Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Prepared statements Severity:S5 (Performance)
Version:8.0.32, 8.0.22+ OS:Any
Assigned to: CPU Architecture:Any
Tags: performance, prepared statement, REGEXP, regression

[19 Feb 2024 23:03] Aaron Murray
Description:
Queries utilizing REGEXP take much longer to execute if they are run as a prepared statement with parameters rather than direct queries.

I first saw this in AWS with a custom query of mine, but I was able to replicate it easily in a local Docker container running mysql:8.0 (which installs 8.0.32) using the Employees DB test data.

mysql> show profiles;
+----------+------------+---------------------------------------------------------------+
| Query_ID | Duration   | Query                                                         |
+----------+------------+---------------------------------------------------------------+
|        1 | 1.98830700 | EXECUTE stmt1 USING @a                                        |
|        2 | 0.20462300 | SELECT count(*) FROM employees WHERE last_name REGEXP 'ullai' |
+----------+------------+---------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

mysql> show profile for query 1;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.001493 |
| Executing hook on transaction  | 0.000024 |
| starting                       | 0.000958 |
| Executing hook on transaction  | 0.000021 |
| starting                       | 0.000007 |
| Opening tables                 | 0.000814 |
| checking permissions           | 0.000016 |
| init                           | 0.000988 |
| System lock                    | 0.000517 |
| optimizing                     | 0.000958 |
| statistics                     | 0.000105 |
| preparing                      | 0.001031 |
| executing                      | 1.979613 |
| end                            | 0.000049 |
| query end                      | 0.000654 |
| waiting for handler commit     | 0.000045 |
| closing tables                 | 0.000036 |
| query end                      | 0.000022 |
| closing tables                 | 0.000059 |
| freeing items                  | 0.000079 |
| cleaning up                    | 0.000820 |
+--------------------------------+----------+
21 rows in set, 1 warning (0.01 sec)

mysql> show profile for query 2;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000598 |
| Executing hook on transaction  | 0.000009 |
| starting                       | 0.000008 |
| checking permissions           | 0.000007 |
| Opening tables                 | 0.000620 |
| init                           | 0.000015 |
| System lock                    | 0.000009 |
| optimizing                     | 0.000009 |
| statistics                     | 0.000158 |
| preparing                      | 0.000116 |
| executing                      | 0.202010 |
| end                            | 0.000019 |
| query end                      | 0.000005 |
| waiting for handler commit     | 0.000010 |
| closing tables                 | 0.000019 |
| freeing items                  | 0.000060 |
| cleaning up                    | 0.000952 |
+--------------------------------+----------+
17 rows in set, 1 warning (0.00 sec)

How to repeat:
Spin up MySQL 8.0.32, install the Employees test database, then run the same query as a prepared statement and a literal query and view the results. The performance does improve slight on subsequent executions, but the difference between the execution methods remains the same.

mysql> PREPARE stmt1 FROM 'SELECT count(*) FROM employees WHERE last_name REGEXP ?';
Query OK, 0 rows affected (0.00 sec)
Statement prepared

mysql> SET @a = 'ullai';
Query OK, 0 rows affected (0.01 sec)

mysql> EXECUTE stmt1 USING @a;
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (1.99 sec)

mysql> SELECT count(*) FROM employees WHERE last_name REGEXP 'ullai';
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (0.21 sec)

mysql> EXECUTE stmt1 USING @a;
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (1.56 sec)

mysql> EXECUTE stmt1 USING @a;
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (1.22 sec)

mysql> EXECUTE stmt1 USING @a;
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (1.23 sec)

mysql> SELECT count(*) FROM employees WHERE last_name REGEXP 'ullai';
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (0.16 sec)

mysql> SELECT count(*) FROM employees WHERE last_name REGEXP 'ullai';
+----------+
| count(*) |
+----------+
|      177 |
+----------+
1 row in set (0.13 sec)
[22 Feb 2024 10:46] MySQL Verification Team
Hello Aaron Murray,

Thank you for the report and test case.
Verified as described.

regards,
Umesh