Bug #95347 | REGEXP negative match on ], like [^][], works in 5.7, not in 8.0 | ||
---|---|---|---|
Submitted: | 13 May 2019 9:00 | Modified: | 23 Jul 2020 5:14 |
Reporter: | Daniel Black | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S2 (Serious) |
Version: | 8.0.16, 8.0.11 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | regression |
[13 May 2019 9:00]
Daniel Black
[13 May 2019 9:20]
MySQL Verification Team
Hello Daniel, Thank you for the report and test case. Verified as described. regards, Umesh
[13 May 2019 9:21]
MySQL Verification Team
- 5.7.26 bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.26 MySQL Community Server (GPL) Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.00 sec) mysql> use test Database changed mysql> CREATE TABLE vars (varname VARCHAR(20)); Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO vars VALUES -> ('hello world'), -> ('hello [world'), -> ('hello world]'), -> ('hello [world]'); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> mysql> SELECT varname, varname RLIKE '^[^][]+$' AS `match` -> FROM vars -> ; +---------------+-------+ | varname | match | +---------------+-------+ | hello world | 1 | | hello [world | 0 | | hello world] | 0 | | hello [world] | 0 | +---------------+-------+ 4 rows in set (0.00 sec) - 8.0.16 bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 8.0.16 MySQL Community Server - GPL Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.00 sec) mysql> use test Database changed mysql> CREATE TABLE vars (varname VARCHAR(20)); Query OK, 0 rows affected (0.01 sec) mysql> INSERT INTO vars VALUES -> ('hello world'), -> ('hello [world'), -> ('hello world]'), -> ('hello [world]'); Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> SELECT varname, varname RLIKE '^[^][]+$' AS `match` -> FROM vars; ERROR 3696 (HY000): The regular expression contains an unclosed bracket expression. mysql> \q Bye
[13 May 2019 9:23]
MySQL Verification Team
- lowest version checked in 8.0 bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 8.0.11 MySQL Community Server - GPL Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.20 sec) mysql> use test Database changed mysql> mysql> CREATE TABLE vars (varname VARCHAR(20)); Query OK, 0 rows affected (0.05 sec) mysql> INSERT INTO vars VALUES -> ('hello world'), -> ('hello [world'), -> ('hello world]'), -> ('hello [world]'); Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> mysql> SELECT varname, varname RLIKE '^[^][]+$' AS `match` -> FROM vars; ERROR 3696 (HY000): The regular expression contains an unclosed bracket expression. mysql>
[28 May 2019 7:46]
MySQL Verification Team
*** MHANSSON martin.hansson Wed May 15 2019 20:10:56 GMT+0530 (IST)*** Can you please elaborate on what you perceive as the problem? First, the syntax [^][] is mentioned, which is not valid and is documented as such. Later, the example of [^]] is brought up, which is valid and documented as valid. The "how to repeat" boils down to the former syntax: mysql> SELECT varname, varname RLIKE '[^][]' AS `match` FROM vars; ERROR 3696 (HY000): The regular expression contains an unclosed bracket expression Either way, the problem is easily circumvented with a backslash escape. Which is the actual issue?
[28 May 2019 12:20]
Daniel Black
Quite right, the only issue is if '[' is valid in a character set. The docs for https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp under the ' [a-dX], [^a-dX] ' paragraphs heading (same for 5.7) doesn't indicate that '[[]' is invalid. Isn't every character(like '[') apart from the terminating ']' valid? Not sure which criteria is being used to invalidate it. So previously in 5.7 the following worked. select '[' RLIKE '[[]' 1 8.0 requires escapes like select '[' RLIKE '[\\[]' 1 The 8.0 syntax parses in 5.7 however select '\\' RLIKE '[^]\\[]' will return 0 in 5.7 and 1 in 8.0 So 2 different behaviours has been introduced.
[28 May 2019 12:54]
MySQL Verification Team
Thank you, Daniel. I'll forward your responses to Dev. regards, Umesh
[29 Jul 2019 9:57]
Martin Hansson
Hi Daniel, between 5.8 and 8.0, we introduced a new regular expression library so that we could support new functions (https://mysqlserverteam.com/new-regular-expression-functions-in-mysql-8-0/). This library obviously has a different implementation, so it will behave differently in some corner cases. Formally, I suppose you are always required to escape meta-characters in a regular expression syntax, but some libraries are more liberal in the cases when it is clear what the user means. These little quirks are rarely documented in the libraries' documentation. As for the second example, select '\\' RLIKE '[^]\\[]', the result in 8.0 is more reasonable, IMHO: The library searches a string consisting of "\" (since the mysql parser "eats" one backslash). Likewise, the regexp library sees the regexp "[^]\[]", meaning "match either something that is not a closing bracket or an opening bracket", (since it "eats" the remaining backslash.) Backslash is not a closing bracket, so it matches.
[25 Apr 2021 15:56]
Rick James
This is one of many cases where 8.0 REGEXPs are incompatible with 5.x. "Word boundary" is another, possibly more common, incompatibility. For some syntaxes, it seems like it might be possible to 'fix' the syntax on the fly?