Bug #60848 Prefix index on UTF-8
Submitted: 13 Apr 2011 9:25
Modified: 28 Apr 2011 17:16
Reporter: Kanako Nakai
Status: Can't repeat
Category: MySQL Server: InnoDB storage engine
Severity: S1 (Critical)
Version: 5.0, 5.1
OS: Any
Assigned to:
CPU Architecture: Any
Tags: Prefix index

[13 Apr 2011 9:25] Kanako Nakai
Description:
If I use a prefix index on an InnoDB table, I do not get the correct result.
I checked on 5.0.83 (InnoDB) and 5.1.53 (InnoDB Plugin); both returned incorrect results.

Please check the following test case.

How to repeat:
create table test (
 val varchar(255),
 KEY idx_1(val(5))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into test values
("あ"),("あい"),("あいう"),("あいうえ"),("あいうえお");

mysql> select * from test;
+-----------------+
| val             |
+-----------------+
| あ              |
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
5 rows in set (0.00 sec)

mysql> select * from test where val like "あ%";
+-----------------+
| val             |
+-----------------+
| あ              |
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
5 rows in set (0.00 sec)

mysql> select * from test where val like "あい%";
+-----------------+
| val             |
+-----------------+
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
4 rows in set (0.00 sec)

mysql> select * from test where val like "あいう%";
+-----------------+
| val             |
+-----------------+
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
3 rows in set (0.00 sec)

mysql> select * from test where val like "あいうえ%";
+--------------+
| val          |
+--------------+
| あいうえ     |
+--------------+
1 row in set (0.00 sec)

# This query should return 2 rows, but I got only 1 row.

mysql> select * from test where val like "あいうえお%";
+-----------------+
| val             |
+-----------------+
| あいうえお      |
+-----------------+
1 row in set (0.00 sec)
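
One way to check whether the prefix index idx_1 is involved in the wrong result is to look at the optimizer plan and then rerun the failing query with that index ignored. This is a minimal sketch, assuming the `test` table from the test case above; it was not run as part of this report, and the outcome on affected versions is an assumption.

```sql
-- Show whether the optimizer chooses the prefix index idx_1 for this LIKE.
EXPLAIN SELECT * FROM test WHERE val LIKE 'あいうえ%';

-- Same query with the prefix index disabled, forcing a scan without idx_1.
-- If both rows come back here, the index lookup is the likely culprit.
SELECT * FROM test IGNORE INDEX (idx_1) WHERE val LIKE 'あいうえ%';
```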
[13 Apr 2011 16:31] Valeriy Kravchuk
With current mysql-5.5 everything works as expected:

macbook-pro:5.5 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.12-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table test (
    ->  val varchar(255),
    ->  KEY idx_1(val(5))
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.17 sec)

mysql> insert into test values
    -> ("あ"),("あい"),("あいう"),("あいうえ"),("あいうえお");
Query OK, 5 rows affected (0.14 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from test;
+-----------------+
| val             |
+-----------------+
| あ              |
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
5 rows in set (0.00 sec)

mysql> select * from test where val like "あい%";
+-----------------+
| val             |
+-----------------+
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
4 rows in set (0.05 sec)

mysql> select * from test where val like "あいう%";
+-----------------+
| val             |
+-----------------+
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
3 rows in set (0.00 sec)

mysql> select * from test where val like "あいうえ%";
+-----------------+
| val             |
+-----------------+
| あいうえ        |
| あいうえお      |
+-----------------+
2 rows in set (0.00 sec)

Same for current mysql-5.1:

macbook-pro:5.1 openxs$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.57-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table test (  val varchar(255),  KEY idx_1(val(5)) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.08 sec)

mysql> insert into test values ("あ"),("あい"),("あいう"),("あいうえ"),("あいうえお");
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from test where val like "あいう%";
+-----------------+
| val             |
+-----------------+
| あいう       |
| あいうえ    |
| あいうえお |
+-----------------+
3 rows in set (0.07 sec)

mysql> select * from test where val like "あいうえ%";
+-----------------+
| val             |
+-----------------+
| あいうえ    |
| あいうえお |
+-----------------+
2 rows in set (0.00 sec)

So, which exact versions (5.0.x and 5.1.x) are affected? Am I missing something in my tests above?
[13 Apr 2011 19:15] Peter Laursen
On 5.1.56 the query
SELECT * FROM test WHERE val LIKE "あいうえ%";
returns only 1 row in my environment (Win7/64 with a 64-bit server)

.. but on 5.1.10 I get 2 rows (also 64 bit server).  So definitely a bug with 5.1.

Peter
(Not a MySQL person)
[13 Apr 2011 19:15] Peter Laursen
sorry for typo.  I meant:

.. but on 5.5.10 I get 2 rows
[14 Apr 2011 3:14] Valeriy Kravchuk
I still would like to know what exact version(s), and on what OS, demonstrate the problem for the original bug reporter. 

For me, on Mac OS X, it looks like the problem is NOT repeatable with current mysql-5.1 and mysql-5.5.
[14 Apr 2011 3:17] Valeriy Kravchuk
Ah, sorry, I had missed the versions (5.0.83 and 5.1.53) given in the initial description. So, the questions that remain are:

- What OS do you use?
- Is it still repeatable for you with recent versions, 5.0.92 and 5.1.56, on that OS?
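
For anyone answering these questions, the version and platform details can be read from the server itself. A minimal sketch using standard MySQL system variables; it reports the platform the server binary was built for, which may differ from the OS release it actually runs on.

```sql
-- Server version plus the OS and CPU architecture the binary was built for.
SELECT VERSION(), @@version_compile_os, @@version_compile_machine;

-- Character sets in effect for the client, connection, and server.
SHOW VARIABLES LIKE 'character_set%';
```

The character set variables are included because a connection character set other than utf8 could change how the multi-byte literals in the test case are interpreted.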
[14 Apr 2011 4:46] Kanako Nakai
> - What OS do you use?
> - Is it still repeatable for you with recent versions, 5.0.92 and 5.1.56, on that OS?

I'm using the following OSes.

5.1.53/5.0.83 on CentOS release 5.5 (Final)
    using binary distribution
    mysql-5.0.83-linux-x86_64-glibc23
    mysql-5.1.53-linux-x86_64-glibc23

5.0.83 on Solaris 10 10/08 s10s_u6wos_07b SPARC
    using source distribution
    

>5.0.92 and 5.1.56

I have not checked yet, but another reporter says it is repeatable on 5.1.56 (Windows 7, 64-bit).
[19 Apr 2011 12:01] Kanako Nakai
Here is some additional information.

The following cases were OK:

- using the BINARY operator

mysql> select * from test where binary val like "あいうえ%";
+-----------------+
| val             |
+-----------------+
| あいうえ        |
| あいうえお      |
+-----------------+
2 rows in set (0.01 sec)

- equality search

mysql> select * from test where val = "あいうえ";
+--------------+
| val          |
+--------------+
| あいうえ     |
+--------------+
1 row in set (0.00 sec)

- single-byte characters

mysql> insert into test values ("a"),("ab"),("abc"),("abcd"),("abcde");
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from test where val like "a%";
+-------+
| val   |
+-------+
| a     |
| ab    |
| abc   |
| abcd  |
| abcde |
+-------+
5 rows in set (0.01 sec)

mysql> select * from test where val like "ab%";
+-------+
| val   |
+-------+
| ab    |
| abc   |
| abcd  |
| abcde |
+-------+
4 rows in set (0.00 sec)

mysql> select * from test where val like "abc%";
+-------+
| val   |
+-------+
| abc   |
| abcd  |
| abcde |
+-------+
3 rows in set (0.00 sec)

mysql> select * from test where val like "abcd%";
+-------+
| val   |
+-------+
| abcd  |
| abcde |
+-------+
2 rows in set (0.00 sec)

mysql> select * from test where val like "abcde%";
+-------+
| val   |
+-------+
| abcde |
+-------+
1 row in set (0.00 sec)
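
Since the single-byte rows match correctly and the multi-byte rows do not, comparing character counts with byte counts may help narrow this down. This is a minimal sketch, assuming the same `test` table; it is a diagnostic suggestion and was not run as part of this report. For non-binary string columns MySQL measures an index prefix length in characters, so val(5) should cover 5 characters, while each hiragana character here occupies 3 bytes in utf8.

```sql
-- CHAR_LENGTH counts characters; LENGTH counts bytes.
SELECT val,
       CHAR_LENGTH(val) AS chars,
       LENGTH(val)      AS bytes
FROM test
ORDER BY bytes, val;
```

"あいうえ" is 4 characters but 12 bytes, so it fits within the 5-character prefix yet is longer than 5 bytes. Whether the affected versions confuse the two units is a guess, not something this report confirms.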
[28 Apr 2011 17:16] Sveta Smirnova
Thank you for the feedback.

Closed as "Can't repeat" because the problem is no longer repeatable with versions 5.5 and up.