Bug #5324 Bug in UCA collations with LIKE comparisons and INDEX
Submitted: 31 Aug 2004 20:20 Modified: 6 Sep 2004 17:07
Reporter: Alexander Barkov Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:4.1.4 OS:
Assigned to: Alexander Barkov CPU Architecture:Any

[31 Aug 2004 20:20] Alexander Barkov
Description:
Jeremy March writes:

I've found two bugs which appear to be related when performing a LIKE
comparison using the unicode uca collations.  For the bug to occur there
must be more than one row in the table being selected and there must be
an index on the column.  With utf8_unicode_ci columns this causes mysqld
to hang and with ucs2_unicode_ci columns mysql doesn't find the rows
when it should.  Here are 3 tests.  I performed them on MySQL 4.1.4 with
the most recent bitkeeper pulls.  I used Red Hat linux 9.0 on a pentium
III processor.

1) with utf8_general_ci which works as expected.

2) with ucs2_unicode_ci which doesn't find any rows when it should find
2.

3) with utf8_unicode_ci which causes mysqld to hang.

When these tests are done without an index they work fine.

How to repeat:

#test 1
 DROP TABLE IF EXISTS t;
 
CREATE TABLE t (c varchar(255) NOT NULL COLLATE utf8_general_ci, INDEX
 (c));
 
INSERT INTO t VALUES (CONVERT(_ucs2 0x039C03C903B403B11F770308 USING
utf8));
 
SELECT * FROM t WHERE c LIKE CONVERT(_ucs2 0x039C0025 USING utf8)
COLLATE utf8_general_ci;
#ok with just one row
 
INSERT INTO t VALUES (CONVERT(_ucs2 0x039C03C903B4 USING utf8));

SELECT * FROM t WHERE c LIKE CONVERT(_ucs2 0x039C0025 USING utf8)
COLLATE utf8_general_ci;
#works fine with utf8_general_ci
 
DROP TABLE t;
 
#test2
DROP TABLE IF EXISTS t;

CREATE TABLE t (c varchar(255) NOT NULL COLLATE ucs2_unicode_ci, INDEX
(c));

INSERT INTO t VALUES (_ucs2 0x039C03C903B403B11F770308);

SELECT * FROM t WHERE c LIKE _ucs2 0x039C0025 COLLATE ucs2_unicode_ci;
#ok with just one row

INSERT INTO t VALUES (_ucs2 0x039C03C903B4);

SELECT * FROM t WHERE c LIKE _ucs2 0x039C0025 COLLATE ucs2_unicode_ci;
#doesn't find anything?

DROP TABLE t;

#test 3
DROP TABLE IF EXISTS t;
CREATE TABLE t (c varchar(255) NOT NULL COLLATE utf8_unicode_ci, INDEX (c));
INSERT INTO t VALUES (CONVERT(_ucs2 0x039C03C903B403B11F770308 USING utf8));
 
SELECT * FROM t WHERE c LIKE CONVERT(_ucs2 0x039C0025 USING utf8)
COLLATE utf8_unicode_ci;
#ok with just one row
 
INSERT INTO t VALUES (CONVERT(_ucs2 0x039C03C903B4 USING utf8));

SELECT * FROM t WHERE c LIKE CONVERT(_ucs2 0x039C0025 USING utf8)
COLLATE utf8_unicode_ci;
#mysqld hangs here
 
DROP TABLE t;
[2 Sep 2004 16:07] MySQL Verification Team
Verified with latest BK 4.1 tree, but last SELECT statement doesn't cause mysqld to hang, it returns empty result set.
[3 Sep 2004 1:12] Jeremy March
On my machine the last SELECT query causes mysqld to hang.  It never returns a result to the client.  When I log into the server from a different client and execute SHOW PROCESSLIST; I get the following:

mysql> show processlist\G
*************************** 1. row ***************************
     Id: 1
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 67
  State: statistics
   Info: SELECT * FROM t WHERE c LIKE CONVERT(_ucs2 0x039C0025 USING utf8)
COLLATE utf8_unicode_ci
*************************** 2. row ***************************
     Id: 2
   User: root
   Host: localhost
     db: NULL
Command: Query
   Time: 0
  State: NULL
   Info: show processlist
2 rows in set (0.00 sec)

I am not able to kill this query and top shows that mysqld is using 100% of the cpu.  The only solution is to kill -9 mysqld.

best regards,
Jeremy
[6 Sep 2004 17:07] Alexander Barkov
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html