Bug #21560 | RLIKE resp. REGEXP throw ERROR 2013 when used with utf8 | ||
---|---|---|---|
Submitted: | 10 Aug 2006 9:04 | Modified: | 26 Sep 2006 11:46 |
Reporter: | Ralf Wohner | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 3.23, 4.1, 5.0 | OS: | Windows (Win9x, XP, Linux) |
Assigned to: | CPU Architecture: | Any | |
Tags: | ERROR 2013, REGEXP, RLIKE, utf8 |
[10 Aug 2006 9:04]
Ralf Wohner
[10 Aug 2006 11:42]
MySQL Verification Team
Thank you for the bug report. Could you please provide a small test case with create table and some insert commands as well as your setup regarding character set fro server and client. Thanks in advance.
[11 Aug 2006 7:42]
Ralf Wohner
Here a script to create the test case. I think it will reproduce the error no matter what default charset/collation the client and the server use, as I pin it to utf8_unicode_ci. Currently I use mySQL 4.1.5-gamma, but as I said, I confirmed or read about this on various versions. Server and client use latin1 by default, I think this is the swedish factory setting still. I didn't want to rely on especially configured servers with my project, so my DBs, tables and columns are "forced" to use utf8_unicode_ci. The script: CREATE TABLE `testtable` ( `id` int(11) NOT NULL auto_increment, `utf8column` varchar(64) collate utf8_unicode_ci default NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; INSERT INTO testtable VALUES (null,'this is utf8.'),(null,'this, too. No special characters used, by the way.'); SELECT * FROM testtable WHERE utf8column LIKE '%too%'; -- Works SELECT * FROM testtable WHERE utf8column REGEXP '.*too.*'; -- Doesn't. Throws the error 2013 SELECT * FROM testtable WHERE CONVERT(utf8column USING latin1) REGEXP _latin1 '.*too.*'; -- Works, but a test case for non-latin1-characters failed: -- I added a 3rd row containing text with an german a-umlaut (see HTML-Entity auml) -- Although the german a-umlaut in the column isn't changed when using SELECT CONVERT(utf8column USING latin1) FROM testtable; -- , the 3rd SELECT doesn't find anything when searching for it instead of 'too', -- neither when looking for a simple 'a' (of course it finds record #2 then, as -- there are some 'a's in there.) -- Perhaps someone can recommend a better alternative to latin1.
[26 Aug 2006 11:46]
Valeriy Kravchuk
Sorry, but I was not able to repeat the behaviour described: openxs@suse:~/dbs/5.1> bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 to server version: 5.0.25-debug Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> CREATE TABLE `testtable` ( -> `id` int(11) NOT NULL auto_increment, -> `utf8column` varchar(64) collate utf8_unicode_ci default NULL, -> PRIMARY KEY (`id`) -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Query OK, 0 rows affected (0.04 sec) mysql> INSERT INTO testtable VALUES (null,'this is utf8.'),(null,'this, too. No special '> characters used, by the way.'); Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM testtable WHERE utf8column LIKE '%too%'; +----+----------------------------------------------------+ | id | utf8column | +----+----------------------------------------------------+ | 2 | this, too. No special characters used, by the way. | +----+----------------------------------------------------+ 1 row in set (0.02 sec) mysql> SELECT * FROM testtable WHERE utf8column REGEXP '.*too.*'; +----+----------------------------------------------------+ | id | utf8column | +----+----------------------------------------------------+ | 2 | this, too. No special characters used, by the way. | +----+----------------------------------------------------+ 1 row in set (0.01 sec) mysql> select version(); +--------------+ | version() | +--------------+ | 5.0.25-debug | +--------------+ 1 row in set (0.01 sec) So, please, try to repeat with a newer version (5.0.24, 5.1.11 or 4.1.21), and inform about the results.
[26 Sep 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".