Bug #104387 | CHARACTER_SET_MISMATCH issue with regex comparisons | ||
---|---|---|---|
Submitted: | 22 Jul 2021 21:24 | Modified: | 18 Aug 2021 0:29 |
Reporter: | Leonardo Fernandes | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 8.0.22, 8.0.23, 8.0.24, 8.0.25, 8.0.26 | OS: | Any |
Assigned to: | Jon Stephens | CPU Architecture: | Any |
Tags: | regression |
[22 Jul 2021 21:24]
Leonardo Fernandes
[23 Jul 2021 6:12]
MySQL Verification Team
Hello Leonardo, Thank you for the report and feedback. Observed that 8.0.26 is affected(no issues observed in 5.7.35). -- - 8.0.26 bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 7 Server version: 8.0.26 MySQL Community Server - GPL Copyright (c) 2000, 2021, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.01 sec) mysql> use test Database changed mysql> create table colltest ( id int unsigned not null primary key auto_increment, cbin varbinary(140), ctxt text character set utf8mb3 collate utf8mb3_bin, cstr varchar(255) character set utf8mb4); Query OK, 0 rows affected, 2 warnings (0.02 sec) mysql> mysql> insert into colltest set cbin='asdf',ctxt='asdf',cstr='asdf'; Query OK, 1 row affected (0.00 sec) mysql> mysql> select * from colltest where cbin rlike cstr; ERROR 3995 (HY000): Character set 'binary' cannot be used in conjunction with 'utf8mb4_0900_ai_ci' in call to regexp_like. mysql> mysql> select * from colltest where cbin rlike ctxt; ERROR 3995 (HY000): Character set 'binary' cannot be used in conjunction with 'utf8_bin' in call to regexp_like. mysql> mysql> show errors; +-------+------+----------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+----------------------------------------------------------------------------------------------+ | Error | 3995 | Character set 'binary' cannot be used in conjunction with 'utf8_bin' in call to regexp_like. | +-------+------+----------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> show warnings; +-------+------+----------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+----------------------------------------------------------------------------------------------+ | Error | 3995 | Character set 'binary' cannot be used in conjunction with 'utf8_bin' in call to regexp_like. | +-------+------+----------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) mysql> -- 5.7.35 bin/mysql -uroot -S /tmp/mysql_ushastry.sock Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.7.35 MySQL Community Server (GPL) Copyright (c) 2000, 2021, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create database test; Query OK, 1 row affected (0.00 sec) mysql> use test Database changed mysql> create table colltest ( id int unsigned not null primary key auto_increment, cbin varbinary(140), ctxt text character set utf8mb3 collate utf8mb3_bin, cstr varchar(255) character set utf8mb4); Query OK, 0 rows affected (0.02 sec) mysql> insert into colltest set cbin='asdf',ctxt='asdf',cstr='asdf'; Query OK, 1 row affected (0.00 sec) mysql> select * from colltest where cbin rlike cstr; +----+------+------+------+ | id | cbin | ctxt | cstr | +----+------+------+------+ | 1 | asdf | asdf | asdf | +----+------+------+------+ 1 row in set (0.00 sec) mysql> select * from colltest where cbin rlike ctxt; +----+------+------+------+ | id | cbin | ctxt | cstr | +----+------+------+------+ | 1 | asdf | asdf | asdf | +----+------+------+------+ 1 row in set (0.00 sec) From 8.0.22 change log this seems to be intentional change but verifying since those who upgrade from 5.7.35(or earlier) to 8.0 GA(or <8.0.22) are surely affected from this change. - Quoting from 8.0.22 change log - https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-22.html "Regular expression functions such as REGEXP_LIKE() yielded inconsistent results with binary string arguments. These functions now reject binary strings with an error. (Bug #31031886, Bug #98951, Bug #31031888, Bug #98950)" regards, Umesh
[26 Jul 2021 15:12]
Leonardo Fernandes
Thank you, Umesh, I hadn't noticed it was a documented change: "Regular expression functions such as REGEXP_LIKE() yielded inconsistent results with binary string arguments. These functions now reject binary strings with an error. (Bug #31031886, Bug #98951, Bug #31031888, Bug #98950)"
[27 Jul 2021 19:51]
Trey Raymond
fun fact: one of the documented examples (casting to binary) doesn't work now due to this https://dev.mysql.com/doc/refman/8.0/en/regexp.html#operator_regexp mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A'; ERROR 3995 (HY000): Character set 'utf8mb4_0900_ai_ci' cannot be used in conjunction with 'binary' in call to regexp_like.
[10 Aug 2021 17:39]
Jon Stephens
See also BUG#103223.
[18 Aug 2021 0:29]
Jon Stephens
Fixed in mysqldoc rev 70644. Closed.