| Bug #100345 | The results of REGEXP_SUBSTR() are different for same string. | ||
|---|---|---|---|
| Submitted: | 28 Jul 2020 5:44 | Modified: | 4 Aug 2020 14:00 |
| Reporter: | James Lee | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
| Version: | 8.0.20, 8.0.21 | OS: | Linux |
| Assigned to: | | CPU Architecture: | Any |
[28 Jul 2020 5:44]
James Lee
[28 Jul 2020 8:00]
MySQL Verification Team
Hello James Lee, Thank you for the report and test case. regards, Umesh
[4 Aug 2020 14:00]
Martin Hansson
Posted by developer:
This is expected behavior, as Erlend pointed out. It can also be observed in the following way, by passing the match type explicitly:
mysql> SELECT REGEXP_SUBSTR('11a22A33a', '[^A]+', 1, 1, 'c'); # Case sensitive
+------------------------------------------------+
| REGEXP_SUBSTR('11a22A33a', '[^A]+', 1, 1, 'c') |
+------------------------------------------------+
| 11a22                                          |
+------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT REGEXP_SUBSTR('11a22A33a', '[^A]+', 1, 1, 'i'); # Case insensitive
+------------------------------------------------+
| REGEXP_SUBSTR('11a22A33a', '[^A]+', 1, 1, 'i') |
+------------------------------------------------+
| 11                                             |
+------------------------------------------------+
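The second result follows because a case-insensitive match makes the negated class `[^A]` exclude both `A` and `a`, so the match stops at the first `a`. A minimal sketch of the same effect, keeping the match case sensitive and spelling both letter cases out in the class (the pattern `[^Aa]+` is an illustration only, not part of the original test case):

```sql
-- Case-sensitive match ('c'), but the class excludes both 'A' and 'a' explicitly.
-- This should return '11', the same as '[^A]+' with the 'i' match type.
SELECT REGEXP_SUBSTR('11a22A33a', '[^Aa]+', 1, 1, 'c');
```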
[4 Aug 2020 17:02]
Erlend Dahl
Replacing the case-insensitive collation with utf8mb4_0900_as_cs also makes the problem go away.
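A minimal sketch of the collation effect, assuming no match type argument is given so that case sensitivity is taken from the collation of the arguments. The original test case is not shown in this report, so the string literal and the explicit `COLLATE` clauses below are assumptions chosen for illustration:

```sql
-- Case-insensitive collation: '[^A]' also excludes 'a', so the match
-- should stop at the first 'a' (expected: 11).
SELECT REGEXP_SUBSTR(_utf8mb4'11a22A33a' COLLATE utf8mb4_0900_ai_ci, '[^A]+');

-- Case-sensitive collation: only 'A' is excluded, so the match should
-- run up to the first uppercase 'A' (expected: 11a22).
SELECT REGEXP_SUBSTR(_utf8mb4'11a22A33a' COLLATE utf8mb4_0900_as_cs, '[^A]+');
```

Passing `'c'` as the match type, as in the developer's example above, should give the case-sensitive result regardless of collation.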
