Bug #34017 REGEXP doesn't work with ucs2 charset columns.
Submitted: 23 Jan 2008 19:45
Modified: 29 Oct 2019 22:45
Reporter: Dae San Hwang
Status: Closed
Category: MySQL Server: Charsets
Severity: S4 (Feature request)
Version: 4.1
OS: Any (Mac OS X 10.4.11, Linux)
Assigned to: Assigned Account
CPU Architecture: Any
Tags: charset, REGEXP, ucs2

[23 Jan 2008 19:45] Dae San Hwang
Description:
When REGEXP is used in a WHERE clause against a ucs2 charset column, the mysql client outputs the following error:

ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp

How to repeat:
mysql> CREATE TABLE users (id INT NOT NULL PRIMARY KEY, name VARCHAR(50)) DEFAULT CHARSET ucs2;
mysql> SELECT * FROM users WHERE name REGEXP 'a';
ERROR 1139 (42000): Got error 'empty (sub)expression' from regexp
[25 Jan 2008 4:27] MySQL Verification Team
I think the patch for bug #31081 should have fixed this. Can you try a newer version and check?
[25 Jan 2008 11:41] Sveta Smirnova
Thank you for the report.

Verified as described with 4.1 development sources.

The bug does not exist in other trees.
[25 Jan 2008 11:42] Sveta Smirnova
Workaround: use UTF8
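
A minimal sketch of this workaround, assuming "use UTF8" means either converting the column value to utf8 for the comparison or converting the table itself. The table is the users table from the report. Neither statement was posted in this thread, and neither has been checked against a 4.1 build:

```sql
-- Option 1 (assumption): convert only for the comparison; the column stays ucs2.
SELECT * FROM users WHERE CONVERT(name USING utf8) REGEXP 'a';

-- Option 2 (assumption): convert the whole table to utf8 so REGEXP sees utf8 data directly.
ALTER TABLE users CONVERT TO CHARACTER SET utf8;
SELECT * FROM users WHERE name REGEXP 'a';
```

Option 1 leaves the stored data untouched. Option 2 rewrites the table and changes its default character set, so it is the more invasive of the two.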
[4 Feb 2008 9:15] Alexander Barkov
The fix for bug #34950 not only fixed the crash,
but also made REGEXP work with the ucs2 character set
(with the same restrictions that apply to utf8).

"Full" REGEXP support for multi-byte characters is
described under WL#353. See here for details:
http://forge1.mysql.com/worklog/task.php?id=353

Changing category to "Feature request".
[29 Oct 2019 22:45] Roy Lyseng
Posted by developer:
 
Implemented in 8.0 with the ICU regular expression library (WL#8987).