Bug #6019 SELECT tries to use too short prefix index on utf8 data
Submitted: 11 Oct 2004 12:03 Modified: 29 Oct 2004 8:00
Reporter: Marko Mäkelä Email Updates:
Status: Closed
Category:Server Severity:S1 (Critical)
Version:4.1.6-bk OS:Linux (GNU/Linux)
Assigned to: Sergei Golubchik Target Version:

[11 Oct 2004 12:03] Marko Mäkelä
Description:
MySQL is using a 3-character prefix index in a prefix match (LIKE 'foob%'), although the
length of the search string exceeds the length of the prefix index. As a result, it will
not find any records.

How to repeat:
create table foo(a int primary key,b varchar(6),index b3(b(3))) engine=innodb default
character set=utf8;
insert into foo values(1,'foo'),(2,'foobar');
select * from foo where b like 'foob%';

MySQL will report "Empty set", although it should find the record (2,'foobar').

Suggested fix:
Blank the excessive characters in the LIKE prefix with NUL characters and scan the
resultset against the original LIKE prefix query.
[11 Oct 2004 12:16] Marko Mäkelä
Whoever fixes this bug should also consider adding an extra 0xff byte to the upper limit
of the search area. Otherwise, if there are 3-byte UTF8 characters that are equivalent
with 1-byte or 2-byte UTF8 characters, the search results may be inconsistent.

See the function my_strnncoll_utf8(). There, the end of string is treated as the
'infimum' character and 0xff (an illegal UTF8 character) is treated as the 'supremum'.
Thus, "foo" < "foobar" but "foo\xff" > "foobar". Now consider the case where "foo"
consists of 3-byte UTF8 characters. It is best to pad the upper limit of the LIKE prefix
search with an extra 0xff byte, i.e., "foo\xff".
[11 Oct 2004 19:03] Victoria Reznichenko
Verified with 4.1.6-gamma-debug-log
[15 Oct 2004 10:05] Alexander Barkov
I fixed the bug in MySQL code, so InnoDB now works fine.

But there is still a problem with BDB, I'm continueing working on it.
[29 Oct 2004 8:00] Sergei Golubchik
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

fixed in 4.1.7