Bug #71779 | regexp :upper: should work in case insensitive fields normally. | ||
---|---|---|---|
Submitted: | 20 Feb 2014 13:48 | Modified: | 24 Jul 2014 19:40 |
Reporter: | Sergio Abreu | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | OS: | Any | |
Assigned to: | CPU Architecture: | Any | |
Tags: | case insensitive, REGEXP, upper |
[20 Feb 2014 13:48]
Sergio Abreu
[20 Feb 2014 14:16]
Peter Laursen
I don't think it is related to collation being case sensitive or case insensitive (in other words, I belive you would have exactly the same issue with a CS or binary collation). As far as I remember from a previous discussion here, the implementation of regular expression simply does not check for and do not consider case sensitivity of collations. In my understanding they are also incompatible. However I find (to my pleasant surprise) that you may use constructions like SELECT 'abc' REGEXP 'a' COLLATE utf8_bin; -- returns 1/true SELECT 'abc' REGEXP 'A' COLLATE utf8_bin; -- returns 0/false (where it is not the regular expression that handles the case sensivity, but the COLLATE clause does) .. and this could be a point to add to documentation somewhere, I think actually. Peter (not a MySQL/Oracle person)
[20 Feb 2014 14:34]
Sergio Abreu
Thank you for the quick response, and I found an easier way: Just have to add BINARY in the query. Select .... from ... where info REGEXP BINARY '[[:upper:]]+' but IMHO, I think the BINARY behavior of REGEXP function "should the DEFAULT BEHAVIOR" because it is BAD that REGEXP '[A-Z]' matches 'abcde' that are all lower case... and that "LOWER(field) REGEXP '[[:upper:]]+'" matches anything, because there should not found any CAPITAL LETTER inside a LOWER converted field! So, my comment is: MySql REGEXP should use the BINARY behavior BY DEFAULT, and to do a case-insensitive search, there could be add different command... Because Regular expressions should be very restrictive and precise when looking for patterns. It is a fault of MySql, in my oppinion, that we have to use REGEXP BINARY to accomplish the correct behavior of REGEXP...
[20 Feb 2014 14:37]
Sergio Abreu
I think it does get related with the field collation. Because when I created a general_cs field to test the same SELECT, the REGEXP behavior was exactly the REGEXP BINARY one.
[24 Jul 2014 19:40]
Sveta Smirnova
Thank you for the report. Closing as "Not a bug", because last comment.