| Bug #21560 | RLIKE resp. REGEXP throw ERROR 2013 when used with utf8 | ||
|---|---|---|---|
| Submitted: | 10 Aug 2006 9:04 | Modified: | 26 Sep 2006 11:46 |
| Reporter: | Ralf Wohner | Email Updates: | |
| Status: | No Feedback | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 3.23, 4.1, 5.0 | OS: | Windows (Win9x, XP, Linux) |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | ERROR 2013, REGEXP, RLIKE, utf8 | ||
[10 Aug 2006 9:04]
Ralf Wohner
[10 Aug 2006 11:42]
MySQL Verification Team
Thank you for the bug report. Could you please provide a small test case with create table and some insert commands as well as your setup regarding character set fro server and client. Thanks in advance.
[11 Aug 2006 7:42]
Ralf Wohner
Here a script to create the test case. I think it will reproduce the error no matter what default charset/collation the client and the server use, as I pin it to utf8_unicode_ci. Currently I use mySQL 4.1.5-gamma, but as I said, I confirmed or read about this on various versions. Server and client use latin1 by default, I think this is the swedish factory setting still. I didn't want to rely on especially configured servers with my project, so my DBs, tables and columns are "forced" to use utf8_unicode_ci. The script: CREATE TABLE `testtable` ( `id` int(11) NOT NULL auto_increment, `utf8column` varchar(64) collate utf8_unicode_ci default NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; INSERT INTO testtable VALUES (null,'this is utf8.'),(null,'this, too. No special characters used, by the way.'); SELECT * FROM testtable WHERE utf8column LIKE '%too%'; -- Works SELECT * FROM testtable WHERE utf8column REGEXP '.*too.*'; -- Doesn't. Throws the error 2013 SELECT * FROM testtable WHERE CONVERT(utf8column USING latin1) REGEXP _latin1 '.*too.*'; -- Works, but a test case for non-latin1-characters failed: -- I added a 3rd row containing text with an german a-umlaut (see HTML-Entity auml) -- Although the german a-umlaut in the column isn't changed when using SELECT CONVERT(utf8column USING latin1) FROM testtable; -- , the 3rd SELECT doesn't find anything when searching for it instead of 'too', -- neither when looking for a simple 'a' (of course it finds record #2 then, as -- there are some 'a's in there.) -- Perhaps someone can recommend a better alternative to latin1.
[26 Aug 2006 11:46]
Valeriy Kravchuk
Sorry, but I was not able to repeat the behaviour described:
openxs@suse:~/dbs/5.1> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3 to server version: 5.0.25-debug
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> CREATE TABLE `testtable` (
-> `id` int(11) NOT NULL auto_increment,
-> `utf8column` varchar(64) collate utf8_unicode_ci default NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Query OK, 0 rows affected (0.04 sec)
mysql> INSERT INTO testtable VALUES (null,'this is utf8.'),(null,'this, too. No
special
'> characters used, by the way.');
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM testtable WHERE utf8column LIKE '%too%';
+----+----------------------------------------------------+
| id | utf8column |
+----+----------------------------------------------------+
| 2 | this, too. No special
characters used, by the way. |
+----+----------------------------------------------------+
1 row in set (0.02 sec)
mysql> SELECT * FROM testtable WHERE utf8column REGEXP '.*too.*';
+----+----------------------------------------------------+
| id | utf8column |
+----+----------------------------------------------------+
| 2 | this, too. No special
characters used, by the way. |
+----+----------------------------------------------------+
1 row in set (0.01 sec)
mysql> select version();
+--------------+
| version() |
+--------------+
| 5.0.25-debug |
+--------------+
1 row in set (0.01 sec)
So, please, try to repeat with a newer version (5.0.24, 5.1.11 or 4.1.21), and inform about the results.
[26 Sep 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
