Bug #9314 | Bug in range searches on prefix primary key on varchar column | ||
---|---|---|---|
Submitted: | 21 Mar 2005 14:13 | Modified: | 23 Mar 2005 16:48 |
Reporter: | Marko Mäkelä | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.3 | OS: | Linux (GNU/Linux) |
Assigned to: | Marko Mäkelä | CPU Architecture: | Any |
[21 Mar 2005 14:13]
Marko Mäkelä
[21 Mar 2005 14:18]
Marko Mäkelä
The bug doesn't seem to be present in 4.1.11-bk. This is most likely related to true VARCHAR support.
[22 Mar 2005 8:51]
Marko Mäkelä
The VARCHAR record passed to ha_innobase::position() is prefixed with the column length. The data that ha_innobase::position() passes to ha_innobase::store_key_val_for_row() should not include any length bytes.
[22 Mar 2005 9:18]
Heikki Tuuri
Marko, the MySQL row format does contain the length of the true VARCHAR field: ha_innobase::store_key_val_for_row( ... 4. We always use a 2 byte length for a true >= 5.0.3 VARCHAR. Note that in the MySQL row format, the length is stored in 1 or 2 bytes, depending on the maximum allowed length. But in the MySQL key value format, the length always takes 2 bytes. Note that in the MySQL key value format, the length is always stored in 2 bytes, while in the MySQL row format it is stored in 1 or 2 bytes, to save space. Regards, Heikki
[22 Mar 2005 9:27]
Marko Mäkelä
Ignore the previous comment. The length field is read properly in row_mysql_read_true_varchar(). However, the first calls to that function return length 4, while the problematic call returns 8 (which corresponds to 4 two-byte characters). Somewhere, there is a false assumption that one character equals one byte.
[22 Mar 2005 14:17]
Marko Mäkelä
There is a typo in "Suggested fix". Of course, space in ucs2 is 0x0020, while 0x2020 is a dagger symbol (U+2020). This bug may be caused by wrong parameters being passed to handler::records_in_range(). For debugging, I used the following gdb commands: file mysqld break ha_myisam::records_in_range(unsigned int, st_key_range*, st_key_range*) run display/x *max_key->key@max_key->length display/x *min_key->key@min_key->length Then try the following SQL: create table utf8bug(a varchar(4) not null, primary key(a(3))) charset=utf8; update utf8bug set a='' where a=0x21e280a122; Note that records_in_range() should probably receive min_key=max_key=0x21e280a122 (3 characters, 5 bytes). However, in my test, it was: - min_key = 0x21e280 (3 bytes, one complete and one truncated character) - max_key = 0x21e280202020202020 (9 bytes: one complete character, two malformed bytes and six complete characters) MySQL 4.1.11 didn't use records_in_range() for the UPDATE statement, because it doesn't support true VARCHAR. The method will be passed nonsensical range endpoints also for UCS2 data: create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=ucs2; update ucs2bug set a='' where a=0x0120202120202020; 2: /x *max_key->key @ max_key->length = {0x6, 0x0, 0x1, 0x20, 0x20, 0x21, 0x20, 0x20} 1: /x *min_key->key @ min_key->length = {0x4, 0x0, 0x1, 0x20, 0x20, 0x21, 0x20, 0x20} Contrary to what I expect, the first two bytes (true varchar length) appear to be characters, not bytes. This probably causes the ERROR 1032 in InnoDB. Also, min_key is truncated from 0x012020212020 to 0x01202021.
[22 Mar 2005 21:53]
Heikki Tuuri
Hi! Looks like Sergei's bug fix: http://lists.mysql.com/internals/23297 fixes the UTF-8 case of this bug. Now that I look in gdb at an InnoDB search, start_key and end_key look sensible. mysql> create table utf8bug(a varchar(4) not null, primary key(a(3))) charset=ut f8 type=innodb; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> insert into utf8bug values (0x21e280a122); Query OK, 1 row affected (0.00 sec) mysql> update utf8bug set a='' where a=0x21e280a122; Query OK, 1 row affected (1 min 41.20 sec) Rows matched: 1 Changed: 1 Warnings: 0 (gdb) bt #0 row_search_for_mysql ( buf=0xa3aecc8 "\005!â\200¡\"", '¥' <repeats 26 times>, mode=2, prebuilt=0x43ede468, match_mode=0, direction=0) at row0sel.c:3033 #1 0x08276dad in ha_innobase::index_read(char*, char const*, unsigned, ha_rkey_ function) (this=0xa3aeb98, buf=0xa3aecc8 "\005!â\200¡\"", '¥' <repeats 26 times>, key_ptr=0xa39a2e8 "\005", key_len=11, find_flag=HA_READ_KEY_OR_NEXT) at ha_innodb.cc:3402 #2 0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co nst*, bool, bool) (this=0xa3aeb98, start_key=0xa399a10, end_key=0xa399a1c, eq_range_arg=false, sorted=false) at handler.cc:2249 #3 0x08269a8c in handler::read_multi_range_first(st_key_multi_range**, st_key_m ulti_range*, unsigned, bool, st_handler_buffer*) (this=0xa3aeb98, found_range_p=0x88f7d308, ranges=0xa399a10, range_count=1, sorted=false, buffer=0x0) at handler.cc:2121 #4 0x08258090 in QUICK_RANGE_SELECT::get_next() (this=0xa39a1b8) at opt_range.cc:6052 #5 0x08261621 in rr_quick (info=0x88f7d49c) at records.cc:157 #6 0x08216d5f in mysql_update(THD*, st_table_list*, List<Item>&, List<Item>&, I tem*, unsigned, st_order*, unsigned long, enum_duplicates, bool) ( thd=0xa37d1c0, table_list=0xa3749c8, fields=@0xa37d44c, values=@0xa37d608, conds=0xa374d10, order_num=0, order=0x0, limit=4294967295, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:332 #7 0x081b7089 in mysql_execute_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:3099 #8 0x081bd1cb in mysql_parse(THD*, char*, unsigned) (thd=0xa37d1c0, inBuf=0xa374948 "update utf8bug set a='' where a=0x21e280a122", length=44) at sql_parse.cc:5170 #9 0x081b30f7 in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY, thd=0xa37d1c0, packet=0xa3959c1 "update utf8bug set a='' where a=0x21e280a122", packet_length=45) at sql_parse.cc:1647 #10 0x081b293d in do_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:1453 #11 0x081b1acc in handle_one_connection (arg=0xa37d1c0) at sql_parse.cc:1110 #12 0x40062f60 in pthread_start_thread () from /lib/i686/libpthread.so.0 #13 0x400630fe in pthread_start_thread_event () from /lib/i686/libpthread.so.0 #14 0x401f5327 in clone () from /lib/i686/libc.so.6 (gdb) frame 2 #2 0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co nst*, bool, bool) (this=0xa3aeb98, start_key=0xa399a10, end_key=0xa399a1c, eq_range_arg=false, sorted=false) at handler.cc:2249 2249 result= index_read(table->record[0], Current language: auto; currently c++ (gdb) x/11b start_key->key 0xa39a2e8: 0x05 0x00 0x21 0xe2 0x80 0xa1 0x22 0x00 0xa39a2f0: 0x00 0x00 0x00 (gdb) x/11b end_key->key 0xa39a2f8: 0x09 0x00 0x21 0xe2 0x80 0xa1 0x22 0x20 0xa39a300: 0x20 0x20 0x20 (gdb) But the UCS-2 case of this bug remains. start_key and end_key are strange. Regards, Heikki mysql> create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=uc s2 type=innodb; ERROR 1050 (42S01): Table 'ucs2bug' already exists mysql> drop table ucs2bug; Query OK, 0 rows affected (0.01 sec) mysql> create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=uc s2 type=innodb; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> update ucs2bug set a='' where a=0x0120202120202020; (gdb) bt #0 row_search_for_mysql ( buf=0xa39c9d8 "\006\001 ! ", '¥' <repeats 25 times>, mode=2, prebuilt=0x43edf468, match_mode=0, direction=0) at row0sel.c:3033 #1 0x08276dad in ha_innobase::index_read(char*, char const*, unsigned, ha_rkey_ function) (this=0xa39c8a8, buf=0xa39c9d8 "\006\001 ! ", '¥' <repeats 25 times>, key_ptr=0xa39a2e8 "\004", key_len=8, find_flag=HA_READ_KEY_OR_NEXT) at ha_innodb.cc:3402 #2 0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co nst*, bool, bool) (this=0xa39c8a8, start_key=0xa37f938, end_key=0xa37f944, eq_range_arg=false, sorted=false) at handler.cc:2249 #3 0x08269a8c in handler::read_multi_range_first(st_key_multi_range**, st_key_m ulti_range*, unsigned, bool, st_handler_buffer*) (this=0xa39c8a8, found_range_p=0x88f7d308, ranges=0xa37f938, range_count=1, sorted=false, buffer=0x0) at handler.cc:2121 #4 0x08258090 in QUICK_RANGE_SELECT::get_next() (this=0xa39a1b8) at opt_range.cc:6052 #5 0x08261621 in rr_quick (info=0x88f7d49c) at records.cc:157 #6 0x08216d5f in mysql_update(THD*, st_table_list*, List<Item>&, List<Item>&, I tem*, unsigned, st_order*, unsigned long, enum_duplicates, bool) ( thd=0xa37d1c0, table_list=0xa3749d0, fields=@0xa37d44c, values=@0xa37d608, conds=0xa374d28, order_num=0, order=0x0, limit=4294967295, handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:332 #7 0x081b7089 in mysql_execute_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:3099 #8 0x081bd1cb in mysql_parse(THD*, char*, unsigned) (thd=0xa37d1c0, inBuf=0xa374948 "update ucs2bug set a='' where a=0x0120202120202020", length=50) at sql_parse.cc:5170 #9 0x081b30f7 in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY, thd=0xa37d1c0, packet=0xa3959c1 "update ucs2bug set a='' where a=0x0120202120202020", packet_length=51) at sql_parse.cc:1647 #10 0x081b293d in do_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:1453 #11 0x081b1acc in handle_one_connection (arg=0xa37d1c0) at sql_parse.cc:1110 #12 0x40062f60 in pthread_start_thread () from /lib/i686/libpthread.so.0 #13 0x400630fe in pthread_start_thread_event () from /lib/i686/libpthread.so.0 #14 0x401f5327 in clone () from /lib/i686/libc.so.6 (gdb) frame 2 #2 0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co nst*, bool, bool) (this=0xa39c8a8, start_key=0xa37f938, end_key=0xa37f944, eq_range_arg=false, sorted=false) at handler.cc:2249 2249 result= index_read(table->record[0], Current language: auto; currently c++ (gdb) x/11b start_key->key 0xa39a2e8: 0x04 0x00 0x01 0x20 0x20 0x21 0x20 0x20 0xa39a2f0: 0x01 0xa5 0xa5 (gdb) x/11b end_key->key 0xa39a2f8: 0x06 0x00 0x01 0x20 0x20 0x21 0x20 0x20 0xa39a300: 0x37 0xa5 0xa5 (gdb)
[23 Mar 2005 11:25]
Sergei Golubchik
There are two problems here. One is with space padding/trimming in opt_range.cc - but it's only an annoyance, because it does not affect the result (MySQL askes for larger range than it's really necessary). You can see it, because your test case works with MyISAM. Anyway, I fixed it The real bug in in ha_innobase::position(). You see, because you update the same column that is used in WHERE, MySQL cannot update it directly - it may be an endless loop for a query like UPDATE a=a+1 WHERE a > 0 So MySQL first collects all matching rows (row positions) in a temporary IO_CACHE, and then updates all them. If you'll step over mysql_update you will see that first read_record(&info) at line 332 succeeds, then MySQL calls table->file->position() and the recorded position is NOT correct. No wonder second read_record() at line 401 fails. In short, ha_innobase::position works wrong when you have primary key on a part of the field. Here's the test case without ucs2 involved: mysql> create table t1 (a varchar(20), primary key (a(5))) type=innodb; mysql> insert t1 values ('1234567890'); mysql> update t1 set a='12' where a='1234567890'; ERROR 1032 (HY000): Can't find record in 't1' btw, check 4.1 and 4.0 - they could be affected too
[23 Mar 2005 15:49]
Heikki Tuuri
Hi! I tested the t1 test of Sergei with InnoDB-4.1, with both latin1 and utf8, and it worked ok there. Next I will look at the 5.0 failure. Regards, Heikki
[23 Mar 2005 16:38]
Heikki Tuuri
Hi! I found the bug. The patch to follow in 3 minutes. --Heikki
[23 Mar 2005 16:46]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/23314
[23 Mar 2005 16:48]
Heikki Tuuri
The fix pushed to the 5.0.3 tree. Thank you to Marko and Sergei for finding this bug. Regards, Heikki