| Bug #93479 | REGEXP_REPLACE incorrect results: spuriously concatenates data across rows | ||
|---|---|---|---|
| Submitted: | 4 Dec 2018 23:53 | Modified: | 5 Dec 2018 20:46 |
| Reporter: | Daniel Posch | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S1 (Critical) |
| Version: | 8.0.11 | OS: | Linux (Amazon RDS) |
| Assigned to: | CPU Architecture: | x86 (x86_64) | |
[5 Dec 2018 4:58]
MySQL Verification Team
Hello Daniel, Thank you for the report and test case. Observed this with 8.0.11 build but it is no longer reproducible in latest 8.0.13 build. This is most likely fixed after Bug #90803, which is fixed in 8.0.12. We don't fix bugs in old versions, don't backport bug fixes. So, please, upgrade and inform us if problem still exists. -- 8.0.11 - affected mysql> create table tmp(wat varchar(100)); Query OK, 0 rows affected (0.06 sec) mysql> insert into tmp(wat) values ('Jim Bob'), ('John Bobbs'), ('Tim Bub'); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> select wat, regexp_replace(wat, '[^ ]* ', '') as lastName from tmp; +------------+-------------+ | wat | lastName | +------------+-------------+ | Jim Bob | Bob | | John Bobbs | BobBobbs | | Tim Bub | BobBobbsBub | +------------+-------------+ 3 rows in set (0.03 sec) mysql> show variables like "%version%"; +-------------------------+------------------------------+ | Variable_name | Value | +-------------------------+------------------------------+ | innodb_version | 8.0.11 | | protocol_version | 10 | | slave_type_conversions | | | tls_version | TLSv1,TLSv1.1,TLSv1.2 | | version | 8.0.11 | | version_comment | MySQL Community Server - GPL | | version_compile_machine | x86_64 | | version_compile_os | linux-glibc2.12 | | version_compile_zlib | 1.2.11 | +-------------------------+------------------------------+ 9 rows in set (0.01 sec) -- 8.0.13 - not affected mysql> create table tmp(wat varchar(100)); Query OK, 0 rows affected (0.05 sec) mysql> insert into tmp(wat) values ('Jim Bob'), ('John Bobbs'), ('Tim Bub'); Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> select wat, regexp_replace(wat, '[^ ]* ', '') as lastName from tmp; +------------+----------+ | wat | lastName | +------------+----------+ | Jim Bob | Bob | | John Bobbs | Bobbs | | Tim Bub | Bub | +------------+----------+ 3 rows in set (0.00 sec) mysql> show variables like "%version%"; +-------------------------+------------------------------+ | Variable_name | Value | +-------------------------+------------------------------+ | innodb_version | 8.0.13 | | protocol_version | 10 | | slave_type_conversions | | | tls_version | TLSv1,TLSv1.1,TLSv1.2 | | version | 8.0.13 | | version_comment | MySQL Community Server - GPL | | version_compile_machine | x86_64 | | version_compile_os | linux-glibc2.12 | | version_compile_zlib | 1.2.11 | +-------------------------+------------------------------+ 9 rows in set (0.01 sec) regards, Umesh
[5 Dec 2018 20:42]
Daniel Posch
Thanks! Curious, which bug is this a duplicate of? I searched for REGEXP_REPLACE before filing, got no results.
[5 Dec 2018 20:46]
Daniel Posch
Never mind, I see https://bugs.mysql.com/bug.php?id=90803 now. Bug search is still a bit broken (searching for REGEXP_REPLACE should've returned that bug) but no problem.

Description: REGEXP_REPLACE appears to be prepending data from **other rows in the result set** into each computed value. In certain situations, this could be a security vulnerability. Correctly written applications could return sensitive information to the wrong parties, since the database is combining unrelated data across rows in an invalid way. How to repeat: Minimal steps to reproduce: create table tmp(wat varchar(100)); insert into tmp(wat) values ('Jim Bob'), ('John Bobbs'), ('Tim Bub'); select wat, regexp_replace(wat, '[^ ]* ', '') as lastName from tmp; Expected result: wat, lastName Jim Bob, Bob John Bobbs, Bobbs Tim Bub, Bub Observed result: wat, lastName Jim Bob, Bob John Bobbs, BobBobbs Tim Bub, BobBobbsBub Data from OTHER ROWS is concatenated into the result on the current row. (!) I suspect that there may be either: - Some state in the regex engine that is not being cleared properly between rows - A buggy interaction between REGEXP_REPLACE and window functionality (both new in MySQL 8)