Bug #101833 Mysql Regex bug with multiline
Submitted: 2 Dec 2020 13:19 Modified: 7 Dec 2020 14:01
Reporter: soheil rahsaz Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:8.0.22 OS:Any
Assigned to: CPU Architecture:Any

[2 Dec 2020 13:19] soheil rahsaz
Description:
When matching a multi line string and pattern after new line, it returns false.

How to repeat:
Mysql documentation test which is ok:
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*$', 'm');
+---------------------------------------+
| REGEXP_LIKE('fo\r\nfo', '^f.*$', 'm') |
+---------------------------------------+
|                                     1 |
+---------------------------------------+
1 row in set (0.00 sec)

adding 'o' in the end, still ok:
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*o$', 'm');
+----------------------------------------+
| REGEXP_LIKE('fo\r\nfo', '^f.*o$', 'm') |
+----------------------------------------+
|                                      1 |
+----------------------------------------+
1 row in set (0.00 sec)

this one should have return true:
mysql> SELECT REGEXP_LIKE('fo\r\nfo', '^f.*fo$', 'm');
+-----------------------------------------+
| REGEXP_LIKE('fo\r\nfo', '^f.*fo$', 'm') |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
1 row in set (0.00 sec)

also this one should have been true:
mysql> SELECT REGEXP_LIKE('fo\r\n123fo', '^f.*fo$', 'm');
+--------------------------------------------+
| REGEXP_LIKE('fo\r\n123fo', '^f.*fo$', 'm') |
+--------------------------------------------+
|                                          0 |
+--------------------------------------------+
1 row in set (0.00 sec)

with (?m) not working too:
mysql> SELECT REGEXP_LIKE('fo\r\n123fo', '(?m)^f.*fo$');
+-------------------------------------------+
| REGEXP_LIKE('fo\r\n123fo', '(?m)^f.*fo$') |
+-------------------------------------------+
|                                         0 |
+-------------------------------------------+
1 row in set (0.00 sec)

if i remove 'f' in pattern, it works:
mysql> SELECT REGEXP_LIKE('fo\r\n123fo', '^f.*o$', 'm');
+-------------------------------------------+
| REGEXP_LIKE('fo\r\n123fo', '^f.*o$', 'm') |
+-------------------------------------------+
|                                         1 |
+-------------------------------------------+
1 row in set (0.00 sec)
[2 Dec 2020 14:36] MySQL Verification Team
Hi Mr. rahsaz,

Thank you for your bug report.

However, this is not a bug.

Since '$' means end-of-the-line and end-of-the-line is denoted by '\r' or '\n' or both, hence it is logical that your other expressions return 0.

Not a bug.
[3 Dec 2020 5:10] soheil rahsaz
Thank you for your response.
Unfortunately, I am not convinced that this is not a bug.

Based on the documentation:
". Match any character (including carriage return and newline, although to match these in the middle of a string, the m (multiple line) match-control character or the (?m) within-pattern modifier must be given)."

So based on this description, I expect '.' to match the '\n' and move over.
Another example here which doesn't work and doesn't include $ sign:

select regexp_like('ma123sb', 'ma.*sb', 'm');
+---------------------------------------+
| regexp_like('ma123sb', 'ma.*sb', 'm') |
+---------------------------------------+
|                                     1 |
+---------------------------------------+
1 row in set (0.00 sec)

mysql> select regexp_like('ma12\n3sb', 'ma.*sb', 'm');
+-----------------------------------------+
| regexp_like('ma12\n3sb', 'ma.*sb', 'm') |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
1 row in set (0.00 sec)

Also here the '.' is not matching '\n' character while 'm' is provided.
[3 Dec 2020 13:00] MySQL Verification Team
Still not a bug.

That option is just for recognising special characters like new-line or carriage-return.

Not a bug.
[7 Dec 2020 14:01] soheil rahsaz
Thank you for your response.

Based on your answer I think that the documentation provided here is not clear enough to me and might be unclear to others too.
https://dev.mysql.com/doc/refman/8.0/en/regexp.html

So I suggest that you add more explanation and examples here(mostly because in 5.7 '.' would match a new line in a very different way)
[8 Dec 2020 13:15] MySQL Verification Team
Hi,

What you are requesting is already described in the Changelogs of 5.7 and 8.0. Those can be found on our site, dev.mysql.com.