Bug #52080 | Make REGEXP to work properly with a multibyte character sets | ||
---|---|---|---|
Submitted: | 16 Mar 2010 6:57 | Modified: | 27 Aug 2010 9:59 |
Reporter: | Pavel Sirovatsky | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: General | Severity: | S4 (Feature request) |
Version: | 5.0.27 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | REGEXP | | |
[16 Mar 2010 6:57]
Pavel Sirovatsky
[16 Mar 2010 8:32]
Valeriy Kravchuk
While this is easily repeatable, I think our manual (http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp) already explains this behavior: "Warning: The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal."
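To make the quoted warning concrete, here is a minimal sketch of what byte-wise matching does to multibyte text. It is an analogy in Python, not MySQL's implementation: the helper names `matches_bytewise` and `matches_charwise` are made up for illustration, and UTF-8 is assumed as the character set. Python's `re` module matches byte by byte when given `bytes` and character by character when given `str`, which mirrors the two behaviors being discussed.

```python
import re


def matches_bytewise(pattern: str, text: str, encoding: str = "utf-8") -> bool:
    """Encode both sides first, so '.' and character classes operate on single bytes."""
    return re.search(pattern.encode(encoding), text.encode(encoding)) is not None


def matches_charwise(pattern: str, text: str) -> bool:
    """Match on decoded text, so '.' and character classes operate on whole characters."""
    return re.search(pattern, text) is not None


# 'я' is one character but two bytes in UTF-8 (0xD1 0x8F).
print(matches_charwise("^.$", "я"))   # True: one character
print(matches_bytewise("^.$", "я"))   # False: '.' consumes only one of the two bytes
print(matches_bytewise("^..$", "я"))  # True: two bytes look like two "characters"

# A character class is split into bytes, so 'ы' (0xD1 0x8B) wrongly matches
# '[яю]' because it shares the lead byte 0xD1 with both class members.
print(matches_charwise("[яю]", "ы"))  # False
print(matches_bytewise("[яю]", "ы"))  # True: false positive
```

The last case is the kind of "unexpected result" the manual warns about: a byte-wise engine can report a match for a character that is not in the class at all.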
[17 Mar 2010 7:38]
Pavel Sirovatsky
Changing status to feature request.
[17 Mar 2010 7:45]
Valeriy Kravchuk
Making REGEXP work properly with multibyte character sets sounds like a reasonable and nice feature request.
[27 Aug 2010 9:59]
Valeriy Kravchuk
Duplicate of Bug#30241.