| Bug #29590 | Request to add support for back-references in regular expressions | ||
|---|---|---|---|
| Submitted: | 6 Jul 2007 5:30 | Modified: | 6 Jul 2007 18:45 |
| Reporter: | Robbie Haertel | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: General | Severity: | S4 (Feature request) |
| Version: | | OS: | Any |
| Assigned to: | | CPU Architecture: | Any |
[6 Jul 2007 5:30]
Robbie Haertel
[6 Jul 2007 9:48]
Sergei Golubchik
To my surprise, back references are NOT part of POSIX extended regular expressions, at least according to "man 7 regex":
.....
DESCRIPTION
Regular expressions (``RE''s), as defined in POSIX.2, come in two
forms: modern REs (roughly those of egrep; POSIX.2 calls these
``extended'' REs) and obsolete REs (roughly those of ed(1); POSIX.2
``basic'' REs). Obsolete REs mostly exist for backward compatibility
in some old programs; they will be discussed at the end.
.....
Obsolete (``basic'') regular expressions differ in several respects.
.....
parenthesized subexpression (after a possible leading `^'). Finally,
there is one new type of atom, a back reference: `\' followed by a non-
zero decimal digit d matches the same sequence of characters matched by
.....
As you can see, back references are only supported in *basic* REs, not in extended REs.
The Henry Spencer regex library, which we use in MySQL, supports back references, but only in basic RE mode. MySQL uses extended REs.
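For readers unfamiliar with the term, here is a minimal sketch of what a back reference matches. It uses Python's `re` module purely as an illustration; that engine is neither POSIX basic nor POSIX extended, and it is not the Henry Spencer library, so its syntax and behaviour should not be read as what MySQL does. The function name and sample strings are hypothetical.

```python
import re

# \1 must match exactly the text that group 1 captured, so this pattern
# finds any word character immediately followed by the same character.
# (Python syntax; in a POSIX basic RE the group would be written \(...\).)
DOUBLED = re.compile(r"(\w)\1")


def has_doubled_char(text: str) -> bool:
    """Return True if text contains two identical adjacent word characters."""
    return DOUBLED.search(text) is not None


if __name__ == "__main__":
    for sample in ("hello", "world"):  # hypothetical sample inputs
        print(sample, has_doubled_char(sample))
```

The point is only that `\1` refers to what the group actually matched, not to the group's pattern, which is why it cannot be expressed with ordinary repetition.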
[6 Jul 2007 18:45]
Robbie Haertel
Perhaps there could be an option (either at compile-time or run-time, but preferably the latter) to choose between the basic and extended syntax?
[6 Jul 2007 19:20]
Sergei Golubchik
Yes, this is a possibility. My preference, though, would be to move to a completely different regexp library that is more powerful and supports multi-byte character sets :)
