Bug #399 regexp arguments seem to get unescaped when matching a string starting with '+'
Submitted: 8 May 2003 8:06 Modified: 30 May 2003 5:56
Reporter: Hans Rakers Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:3.23.54 and 4.0.12 OS:Linux (Linux)
Assigned to: CPU Architecture:Any

[8 May 2003 8:06] Hans Rakers
Description:
When matching a string starting with a '+' (like a international phonenr "+1234567890") mysql seems to de-escape the regex before matching.

Since my initial plan to match "^\+[0-9]+$" kept returning "ERROR 1139: Got error 'repetition-operator operand invalid' from regexp" i brought the regexp down to matching all phonenrs starting with a +:

mysql> select techcgsm from address where techcgsm regexp "^\+";
ERROR 1139: Got error 'repetition-operator operand invalid' from regexp

MySQL sees the escaped + character as a repetition operator?!

Tested single quotes around regex, no luck. Tested with double escaped + char:

mysql> select techcgsm from address where techcgsm regexp "^\\+";

<-- snip lots of results -->

124 rows in set (0.00 sec)

So it seems MySQL de-escapes the regex before matching.

Tested on 3.23.54 and 4.0.12.

How to repeat:
Match a column containing strings starting with '+' char using regex "^\+" like:

mysql> select techcgsm from address where techcgsm regexp "^\+";
ERROR 1139: Got error 'repetition-operator operand invalid' from regexp

Suggested fix:
don't de-escape regexp args before matching
[30 May 2003 5:56] Alexander Keremidarski
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

The problem is that backslashes are parsed twice. First time by query parser and second by function which handles REGEXP match.

When you type: "^\+[0-9]+$" parser escapes \+ and REGEXP recieves following "^+[0-9]+$"

or for your simpler example  "^\+" becomes "^+"

As you noticed solution is to escape backslash with one more backslash
"^\\+"

Here it is:
mysql> select "+123456" regexp "^\+[0-9]+$";
ERROR 1139: Got error 'repetition-operator operand invalid' from regexp

mysql> select "+123456" regexp "^\\+[0-9]+$";
+--------------------------------+
| "+123456" regexp "^\\+[0-9]+$" |
+--------------------------------+
|                              1 |
+--------------------------------+

Because at that point REGEXP recieves "^\+[0-9]+$" as you wanted.

I have to admit it is not well documented in manual. Should be fixed.