| Bug #107554 | regexp_replace in trigger VERY slow when using unicode category \\p{C} | ||
|---|---|---|---|
| Submitted: | 13 Jun 2022 20:28 | Modified: | 14 Jun 2022 9:58 |
| Reporter: | Patrick LeGros | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
| Version: | 8.0.29-0ubuntu0.20.04.3 | OS: | Ubuntu (20.04.4 LTS) |
| Assigned to: | CPU Architecture: | x86 | |
| Tags: | regexp regexp_replace sql \\p{C} Control Characetr | ||
[13 Jun 2022 21:59]
Patrick LeGros
Also tried the following test, inserted 1000 and it completed in ~0.04sec.
The REGEXP_REPLACE with \p{C} seems to only be problematic in the trigger.
DROP TABLE IF EXISTS test;
CREATE TABLE test (
id SERIAL,
name VARCHAR(64) NOT NULL DEFAULT '1\02\03',
namegen VARCHAR(64) AS (REGEXP_REPLACE(name, '\\p{C}', '')) STORED NOT NULL
);
[14 Jun 2022 9:33]
MySQL Verification Team
Hi Patrick, What version did you tested this with? Latest 8.0 or ? thanks
[14 Jun 2022 9:46]
Patrick LeGros
ran: mysql --version output: mysql Ver 8.0.29-0ubuntu0.20.04.3 for Linux on x86_64 ((Ubuntu)) (sorry, i think i cleaned up the version wrong and put 9)
[14 Jun 2022 9:58]
MySQL Verification Team
Hi,
Interesting bug! Thank you for report and way to reproduce. Verified.
mysql [localhost:8027] {msandbox} (b107554) > insert into experiment select null,null from control;
Query OK, 8192 rows affected (1 min 16.37 sec)
Records: 8192 Duplicates: 0 Warnings: 0
mysql [localhost:8027] {msandbox} (b107554) > SELECT BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{S}', ''));
+----------------------------------------------------+
| BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{S}', '')) |
+----------------------------------------------------+
| 0 |
+----------------------------------------------------+
1 row in set (0.00 sec)
mysql [localhost:8027] {msandbox} (b107554) > SELECT BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{C}', ''));
+----------------------------------------------------+
| BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{C}', '')) |
+----------------------------------------------------+
| 0 |
+----------------------------------------------------+
1 row in set (0.01 sec)
mysql [localhost:8027] {msandbox} (b107554) > insert into control select null,null from control;
Query OK, 8192 rows affected (1.92 sec)
Records: 8192 Duplicates: 0 Warnings: 0
kind regards
Bogdan

Description: When I put the following SET in an insert trigger and insert 1000 rows it becomes ~8seconds slower. SET NEW.name = REGEXP_REPLACE(' ', '\\p{C}', ''); However when I run a BENCHMARK of 1000 on the same expression, it only takes ~0.01seconds. SELECT BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{C}', '')); How to repeat: ---- control & experiment table set up ----- DELIMITER ;; DROP TABLE IF EXISTS control;; CREATE TABLE control ( id SERIAL, name VARCHAR(64) NOT NULL DEFAULT ' ');; DROP TRIGGER IF EXISTS control_insert_before;; CREATE TRIGGER control_insert_before BEFORE INSERT ON control FOR EACH ROW BEGIN SET NEW.name = REGEXP_REPLACE(' ', '\\p{S}', ''); -- SAME Results w/ NEW.name, but wanted to make similar to BENCHMARK -- SET NEW.name = REGEXP_REPLACE(NEW.name, '\\p{S}', ''); END;; DROP TABLE IF EXISTS experiment;; CREATE TABLE experiment ( id SERIAL, name VARCHAR(64) NOT NULL DEFAULT ' ');; DROP TRIGGER IF EXISTS experiment_insert_before;; CREATE TRIGGER experiment_insert_before BEFORE INSERT ON experiment FOR EACH ROW BEGIN SET NEW.name = REGEXP_REPLACE(' ', '\\p{C}', ''); -- SAME Results w/ NEW.name, but wanted to make similar to BENCHMARK -- SET NEW.name = REGEXP_REPLACE(NEW.name, '\\p{C}', ''); END;; ---- tests to run ---- INSERT INTO control VALUES (),(),,... 1000x INSERT INTO experiment VALUES (),(),... 1000x SELECT BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{S}', '')); SELECT BENCHMARK(1000, REGEXP_REPLACE(' ', '\\p{C}', '')); Suggested fix: I don't know.