Bug #1693 collating element not accepted in REGEX
Submitted: 28 Oct 2003 20:01
Modified: 13 Dec 2003 3:34
Reporter: Christian Hammers (Silver Quality Contributor) (OCA)
Status: Not a Bug
Category: MySQL Server: MyISAM storage engine
Severity: S3 (Non-critical)
Version: 4.0.16
OS: Linux (Debian GNU/Linux)
Assigned to: Alexander Barkov
CPU Architecture: Any

[28 Oct 2003 20:01] Christian Hammers
Description:
This was reported as a Debian bug report and is available at
http://bugs.debian.org/214952

--------------------------------------------------------
A SELECT query with "WHERE field REGEXP 'characters[^[.sequence.]]'"
gives "invalid collating element" if sequence is longer than 1 char.
example: SELECT * FROM table1 WHERE url REGEXP 'http://w[^[.ww\..]]';
-----------------------------------------------------------

(hm, it does not even seem to accept '[[.ch.]]*c' which is the example used in
http://www.mysql.com/doc/en/Regexp.html)

How to repeat:
mysql> SELECT * FROM test1 WHERE url REGEXP '[[.ch.]]*c';
ERROR 1139: Got error 'invalid collating element' from regexp

Suggested fix:
not known
[28 Oct 2003 23:43] Alexander Barkov
The regex library used in MySQL doesn't seem to support
this style of collating sequences. 

I'm wondering if we should fix it or close as "Won't fix".

Christian, do you need collating sequences for work?

Or did you just want to let other people know about this when you
noticed it, without having a real demand for this feature?
[29 Oct 2003 13:32] Christian Hammers
The bug was reported to the Debian bug tracking system by Khalid Shukri <khalid@empi.dnsalias.com>. I will ask him if he desperately needs this feature :)
In the meantime, it should at least be noted in the documentation that this sequence does not work, and maybe a hint could be given that the exact regex definition is in
"man 7 regex" or wherever the regex library in use is documented.
[13 Dec 2003 3:34] Michael Widenius
I have now updated the manual to reflect how the regexp library works:

[[.characters.]]
The sequence of characters of that collating element. 'characters'
is either a single character or a character name like 'newline'.
You can find the full list of character names in 'regexp/cname.h'

Regards,
Monty
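
As a rough illustration of the manual text quoted above (a sketch, not part of the original report), the statements below contrast the two accepted forms of a collating element with the rejected one. They assume a server whose REGEXP uses the regexp library discussed in this report, and that 'tilde' is one of the character names listed in 'regexp/cname.h'. The test strings are made up for illustration.

```sql
-- Single character as the collating element: should be accepted.
SELECT '~' REGEXP '[[.~.]]';

-- Character name as the collating element: should be accepted
-- (assumes 'tilde' is listed in regexp/cname.h).
SELECT '~' REGEXP '[[.tilde.]]';

-- Multi-character sequence that is not a character name: rejected
-- with ERROR 1139 'invalid collating element', as in this report.
SELECT 'chc' REGEXP '[[.ch.]]*c';
```

If the goal is simply to match a literal multi-character sequence, ordinary grouping such as '(ch)*c' is one possible alternative, since it does not rely on collating elements at all.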