Bug #19960 | Inconsistent results when joining InnoDB tables using partial UTF8 indexes | ||
---|---|---|---|
Submitted: | 19 May 2006 23:54 | Modified: | 6 Oct 2006 2:51 |
Reporter: | Kyle Anderson | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 5.0.19, 5.0.20/5.0BK/5.1BK | OS: | MacOS (OSX 10.4, Linux (Debian)) |
Assigned to: | Alexander Barkov | CPU Architecture: | Any |
[19 May 2006 23:54]
Kyle Anderson
[23 May 2006 17:29]
MySQL Verification Team
Thank you for the bug report. miguel@hegel:~/dbs/5.0> bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 to server version: 5.0.22-debug Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> CREATE TABLE `test1` ( -> `colA` int(11) NOT NULL, -> `colB` varchar(255) character set utf8 NOT NULL, -> PRIMARY KEY (`colA`) -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Query OK, 0 rows affected (0.01 sec) mysql> mysql> INSERT INTO `test1` (`colA`, `colB`) VALUES (1, 'foo'), (2, 'foo bar'); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> CREATE TABLE `test2` ( -> `colA` int(11) NOT NULL, -> `colB` varchar(255) character set utf8 NOT NULL, -> KEY `bad` (`colA`,`colB`(3)) -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1; Query OK, 0 rows affected (0.01 sec) mysql> mysql> INSERT INTO `test2` (`colA`, `colB`) VALUES (1, 'foo'),(2, 'foo bar'); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM `test2` WHERE `colA` =2 AND `colB` = "foo bar"; +------+---------+ | colA | colB | +------+---------+ | 2 | foo bar | +------+---------+ 1 row in set (0.02 sec) mysql> SELECT * -> FROM `test1` -> JOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` = -> `test2`.`colB`; +------+---------+------+---------+ | colA | colB | colA | colB | +------+---------+------+---------+ | 1 | foo | 1 | foo | | 2 | foo bar | 2 | foo bar | +------+---------+------+---------+ 2 rows in set (0.01 sec) mysql> SELECT * -> FROM `test1` -> JOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` = -> `test2`.`colB` -> WHERE `test1`.`colA` < 3; +------+------+------+------+ | colA | colB | colA | colB | +------+------+------+------+ | 1 | foo | 1 | foo | +------+------+------+------+ 1 row in set (0.01 sec) mysql> ALTER TABLE test1 ENGINE=MyISAM; Query OK, 2 rows affected (0.02 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test2 ENGINE=MyISAM; Query OK, 2 rows affected (0.02 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> SELECT * -> FROM `test1` -> JOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` = -> `test2`.`colB` -> WHERE `test1`.`colA` < 3; +------+---------+------+---------+ | colA | colB | colA | colB | +------+---------+------+---------+ | 1 | foo | 1 | foo | | 2 | foo bar | 2 | foo bar | +------+---------+------+---------+ 2 rows in set (0.00 sec) ------------------------------------------------------------------------------------- mysql> SELECT * -> FROM `test1` -> JOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` = -> `test2`.`colB` -> WHERE `test1`.`colA` < 3; +------+------+------+------+ | colA | colB | colA | colB | +------+------+------+------+ | 1 | foo | 1 | foo | +------+------+------+------+ 1 row in set (0.01 sec) mysql> select version(); +-------------------+ | version() | +-------------------+ | 5.1.11-beta-debug | +-------------------+ 1 row in set (0.00 sec) mysql> ------------------------------------------------------------------------------------ mysql> SELECT * -> FROM `test1` -> JOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` = -> `test2`.`colB` -> WHERE `test1`.`colA` < 3; +------+---------+------+---------+ | colA | colB | colA | colB | +------+---------+------+---------+ | 1 | foo | 1 | foo | | 2 | foo bar | 2 | foo bar | +------+---------+------+---------+ 2 rows in set (0.02 sec) mysql> SELECT version(); +--------------+ | version() | +--------------+ | 4.1.19-debug | +--------------+ 1 row in set (0.00 sec) mysql>
[26 May 2006 13:55]
Heikki Tuuri
Hi! This is probably a bug in MySQL's interpreter. It uses the multi-column index 'bad' in test2, and passes the WHOLE string 'foo bar' as the second field in the key value. Since only the 3 first characters are stored in the index, it should pass just 'foo'. MySQL does reserve 3 * 3 = 9 bytes for the column prefix, but the bytes after the 3 first characters should not be used. Regards, Heikki Stack trace from 5.0.18: (gdb) bt #0 row_sel_convert_mysql_key_to_innobase (tuple=0x42356468, buf=0x899fb4c "", buf_len=836, index=0x42359a68, key_ptr=0x89a0b58 "\002", key_len=15, trx=0x4234cc68) at row0sel.c:2177 #1 0x082196ff in ha_innobase::index_read (this=0x89807b8, buf=0x899fb4c "", key_ptr=0x89a0b58 "\002", key_len=144309068, find_flag=HA_READ_KEY_EXACT) at ha_innodb.cc:3976 #2 0x081b87d5 in join_read_always_key (tab=0x89807b8) at sql_select.cc:10150 #3 0x081b7840 in sub_select (join=0x899a370, join_tab=0x89a09f4, end_of_records=4) at sql_select.cc:9654 #4 0x081b7993 in evaluate_join_record (join=0x899a370, join_tab=0x89a0890, error=4, report_error=0x899fb4c "") at sql_select.cc:9770 #5 0x081b77bd in sub_select (join=0x899a370, join_tab=0x89a0890, end_of_records=4) at sql_select.cc:9661 #6 0x081b7392 in do_select (join=0x899a370, fields=0x897ee88, table=0x0, procedure=0x0) at sql_select.cc:9418 #7 0x081abf08 in JOIN::exec (this=0x899a370) at sql_select.cc:1721 #8 0x081ac86a in mysql_select (thd=0x897eb98, rref_pointer_array=0x897ef24, tables=0x8999670, wild_num=1, fields=@0x899fb4c, conds=0x899a228, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2156153344, result=0x899a360, unit=0x897ebe4, select_lex=0x897ee04) at sql_select.cc:1885 #9 0x081a89a2 in handle_select (thd=0x897eb98, lex=0x897ebd4, result=0x899a360, setup_tables_done_option=0) at sql_select.cc:238 #10 0x08175d1c in mysql_execute_command (thd=0x897eb98) at sql_parse.cc:2494 #11 0x0817e610 in mysql_parse (thd=0x897eb98, inBuf=0x89994f0 "SELECT * \nFROM `test1`\nJOIN `test2` ON `test1`.`colA` = `test2`.`colA` AND `test1`.`colB` =\n`test2`.`colB`\nWHERE `test1`.`colA` < 3", length=144174036) at sql_parse.cc:5628 #12 0x08174483 in dispatch_command (command=144173976, thd=0x897eb98, packet=0x89914c1 "", packet_length=144282864) at sql_parse.cc:1713 #13 0x08173fcd in do_command (thd=0x897eb98) at sql_parse.cc:1514 #14 0x081733a2 in handle_one_connection (arg=0x897eb98) at sql_parse.cc:1158 #15 0x40041b63 in start_thread () from /lib/tls/libpthread.so.0 #16 0x4024b18a in clone () from /lib/tls/libc.so.6 (gdb) x/30c key_ptr 0x89a0b58: 2 '\002' 0 '\0' 0 '\0' 0 '\0' 7 '\a' 0 '\0' 102 'f'111 'o' 0x89a0b60: 111 'o' 32 ' ' 98 'b' 97 'a' 114 'r' 0 '\0' 0 '\0' 0 '\0' 0x89a0b68: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0x89a0b70: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' (gdb)
[12 Sep 2006 11:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/11759 ChangeSet@1.2236, 2006-09-12 16:01:23+05:00, bar@mysql.com +3 -0 Bug#19960 Inconsistent results when joining InnoDB tables using partial UTF8 indexes Adding a multibyte-aware VARCHAR copying function, to put correct column prefix, taking in account number of characters (instead just limiting on number of bytes). For example, for a KEY(col(3)) on a UTF8 column when copying the string 'foo bar foo', we should put only 3 leftmost characters: 'foo'. 9 characters were incorrectly put before this fix.
[29 Sep 2006 11:17]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/12830 ChangeSet@1.2282, 2006-09-29 16:15:57+05:00, bar@mysql.com +3 -0 Bug#19960 Inconsistent results when joining InnoDB tables using partial UTF8 indexes Adding a multibyte-aware VARCHAR copying function, to put correct column prefix, taking in account number of characters (instead just limiting on number of bytes). For example, for a KEY(col(3)) on a UTF8 column when copying the string 'foo bar foo', we should put only 3 leftmost characters: 'foo'. 9 characters were incorrectly put before this fix.
[3 Oct 2006 20:02]
Chad MILLER
Available in 5.0.26.
[3 Oct 2006 20:09]
Chad MILLER
Available in 5.1.12-beta.
[6 Oct 2006 2:51]
Paul DuBois
Noted in 5.0.26, 5.1.12 changelogs.