MySQL Bugs: #6019: SELECT tries to use too short prefix index on utf8 data

Bug #6019	SELECT tries to use too short prefix index on utf8 data
Submitted:	11 Oct 2004 12:03	Modified:	29 Oct 2004 8:00
Reporter:	Marko Mäkelä	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server	Severity:	S1 (Critical)
Version:	4.1.6-bk	OS:	Linux (GNU/Linux)
Assigned to:	Sergei Golubchik	CPU Architecture:	Any

Description:
MySQL is using a 3-character prefix index in a prefix match (LIKE 'foob%'), although the length of the search string exceeds the length of the prefix index. As a result, it will not find any records.

How to repeat:
create table foo(a int primary key,b varchar(6),index b3(b(3))) engine=innodb default character set=utf8;
insert into foo values(1,'foo'),(2,'foobar');
select * from foo where b like 'foob%';

MySQL will report "Empty set", although it should find the record (2,'foobar').

Suggested fix:
Blank the excessive characters in the LIKE prefix with NUL characters and scan the resultset against the original LIKE prefix query.

Whoever fixes this bug should also consider adding an extra 0xff byte to the upper limit of the search area. Otherwise, if there are 3-byte UTF8 characters that are equivalent with 1-byte or 2-byte UTF8 characters, the search results may be inconsistent.

See the function my_strnncoll_utf8(). There, the end of string is treated as the 'infimum' character and 0xff (an illegal UTF8 character) is treated as the 'supremum'. Thus, "foo" < "foobar" but "foo\xff" > "foobar". Now consider the case where "foo" consists of 3-byte UTF8 characters. It is best to pad the upper limit of the LIKE prefix search with an extra 0xff byte, i.e., "foo\xff".

Verified with 4.1.6-gamma-debug-log

I fixed the bug in MySQL code, so InnoDB now works fine.

But there is still a problem with BDB, I'm continueing working on it.

Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

fixed in 4.1.7