Bug #9314 Bug in range searches on prefix primary key on varchar column
Submitted: 21 Mar 2005 14:13 Modified: 23 Mar 2005 16:48
Reporter: Marko Mäkelä Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.0.3 OS:Linux (GNU/Linux)
Assigned to: Marko Mäkelä CPU Architecture:Any

[21 Mar 2005 14:13] Marko Mäkelä
Description:
Something is wrong with UPDATE, prefix indexes and UCS2 character set in InnoDB.

How to repeat:
create table ucs2bug(a varchar(4) not null, primary key(a(3))) engine=innodb charset=ucs2;
insert into ucs2bug values(0x0120202120202020);
update ucs2bug set a=0x0120203f where a=0x0120202120202020;
-- ERROR 1032: Can't find record in 'ucs2bug'
update ucs2bug set a=0x0120203f where a=0x0120202120200020;
-- Query OK, 0 rows affected (0.00 sec)

Suggested fix:
Make sure that trailing spaces (0x2020 in UCS2) are stripped and padded correctly.
[21 Mar 2005 14:18] Marko Mäkelä
The bug doesn't seem to be present in 4.1.11-bk. This is most likely related to true VARCHAR support.
[22 Mar 2005 8:51] Marko Mäkelä
The VARCHAR record passed to ha_innobase::position() is prefixed with the column length. The data that ha_innobase::position() passes to ha_innobase::store_key_val_for_row() should not include any length bytes.
[22 Mar 2005 9:18] Heikki Tuuri
Marko,

the MySQL row format does contain the length of the true VARCHAR field:

ha_innobase::store_key_val_for_row(
...
        4. We always use a 2 byte length for a true >= 5.0.3 VARCHAR. Note that
        in the MySQL row format, the length is stored in 1 or 2 bytes,
        depending on the maximum allowed length. But in the MySQL key value
        format, the length always takes 2 bytes.

Note that in the MySQL key value format, the length is always stored in 2 bytes, while in the MySQL row format it is stored in 1 or 2 bytes, to save space.

Regards,

Heikki
[22 Mar 2005 9:27] Marko Mäkelä
Ignore the previous comment. The length field is read properly in row_mysql_read_true_varchar(). However, the first calls to that function return length 4, while the problematic call returns 8 (which corresponds to 4 two-byte characters). Somewhere, there is a false assumption that one character equals one byte.
[22 Mar 2005 14:17] Marko Mäkelä
There is a typo in "Suggested fix". Of course, space in ucs2 is 0x0020, while 0x2020 is a dagger symbol (U+2020).

This bug may be caused by wrong parameters being passed to handler::records_in_range(). For debugging, I used the following gdb commands:

file mysqld
break ha_myisam::records_in_range(unsigned int, st_key_range*, st_key_range*)
run
display/x *max_key->key@max_key->length
display/x *min_key->key@min_key->length

Then try the following SQL:
create table utf8bug(a varchar(4) not null, primary key(a(3))) charset=utf8;
update utf8bug set a='' where a=0x21e280a122;

Note that records_in_range() should probably receive min_key=max_key=0x21e280a122 (3 characters, 5 bytes). However, in my test, it was:
- min_key = 0x21e280 (3 bytes, one complete and one truncated character)
- max_key = 0x21e280202020202020 (9 bytes: one complete character, two malformed bytes and six complete characters)

MySQL 4.1.11 didn't use records_in_range() for the UPDATE statement, because it doesn't support true VARCHAR.

The method will be passed nonsensical range endpoints also for UCS2 data:

create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=ucs2;
update ucs2bug set a='' where a=0x0120202120202020;

2: /x *max_key->key @ max_key->length = {0x6, 0x0, 0x1, 0x20, 0x20, 0x21, 0x20, 0x20}
1: /x *min_key->key @ min_key->length = {0x4, 0x0, 0x1, 0x20, 0x20, 0x21, 0x20, 0x20}

Contrary to what I expect, the first two bytes (true varchar length) appear to be characters, not bytes. This probably causes the ERROR 1032 in InnoDB. Also, min_key is truncated from 0x012020212020 to 0x01202021.
[22 Mar 2005 21:53] Heikki Tuuri
Hi!

Looks like Sergei's bug fix:
http://lists.mysql.com/internals/23297
fixes the UTF-8 case of this bug. Now that I look in gdb at an InnoDB search, start_key and end_key look sensible.

mysql> create table utf8bug(a varchar(4) not null, primary key(a(3))) charset=ut
f8 type=innodb;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> insert into utf8bug values (0x21e280a122);
Query OK, 1 row affected (0.00 sec)

mysql> update utf8bug set a='' where a=0x21e280a122;
Query OK, 1 row affected (1 min 41.20 sec)
Rows matched: 1  Changed: 1  Warnings: 0

(gdb) bt
#0  row_search_for_mysql (
    buf=0xa3aecc8 "\005!â\200¡\"", '¥' <repeats 26 times>, mode=2,
    prebuilt=0x43ede468, match_mode=0, direction=0) at row0sel.c:3033
#1  0x08276dad in ha_innobase::index_read(char*, char const*, unsigned, ha_rkey_
function) (this=0xa3aeb98,
    buf=0xa3aecc8 "\005!â\200¡\"", '¥' <repeats 26 times>,
    key_ptr=0xa39a2e8 "\005", key_len=11, find_flag=HA_READ_KEY_OR_NEXT)
    at ha_innodb.cc:3402
#2  0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co
nst*, bool, bool) (this=0xa3aeb98, start_key=0xa399a10, end_key=0xa399a1c,
    eq_range_arg=false, sorted=false) at handler.cc:2249
#3  0x08269a8c in handler::read_multi_range_first(st_key_multi_range**, st_key_m
ulti_range*, unsigned, bool, st_handler_buffer*) (this=0xa3aeb98,
    found_range_p=0x88f7d308, ranges=0xa399a10, range_count=1, sorted=false,
    buffer=0x0) at handler.cc:2121
#4  0x08258090 in QUICK_RANGE_SELECT::get_next() (this=0xa39a1b8)
    at opt_range.cc:6052
#5  0x08261621 in rr_quick (info=0x88f7d49c) at records.cc:157
#6  0x08216d5f in mysql_update(THD*, st_table_list*, List<Item>&, List<Item>&, I
tem*, unsigned, st_order*, unsigned long, enum_duplicates, bool) (
    thd=0xa37d1c0, table_list=0xa3749c8, fields=@0xa37d44c, values=@0xa37d608,
    conds=0xa374d10, order_num=0, order=0x0, limit=4294967295,
    handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:332
#7  0x081b7089 in mysql_execute_command(THD*) (thd=0xa37d1c0)
    at sql_parse.cc:3099
#8  0x081bd1cb in mysql_parse(THD*, char*, unsigned) (thd=0xa37d1c0,
    inBuf=0xa374948 "update utf8bug set a='' where a=0x21e280a122", length=44)
    at sql_parse.cc:5170
#9  0x081b30f7 in dispatch_command(enum_server_command, THD*, char*, unsigned)
    (command=COM_QUERY, thd=0xa37d1c0,
    packet=0xa3959c1 "update utf8bug set a='' where a=0x21e280a122",
    packet_length=45) at sql_parse.cc:1647
#10 0x081b293d in do_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:1453
#11 0x081b1acc in handle_one_connection (arg=0xa37d1c0) at sql_parse.cc:1110
#12 0x40062f60 in pthread_start_thread () from /lib/i686/libpthread.so.0
#13 0x400630fe in pthread_start_thread_event () from /lib/i686/libpthread.so.0
#14 0x401f5327 in clone () from /lib/i686/libc.so.6
(gdb) frame 2
#2  0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co
nst*, bool, bool) (this=0xa3aeb98, start_key=0xa399a10, end_key=0xa399a1c,
    eq_range_arg=false, sorted=false) at handler.cc:2249
2249        result= index_read(table->record[0],
Current language:  auto; currently c++
(gdb) x/11b start_key->key
0xa39a2e8:      0x05    0x00    0x21    0xe2    0x80    0xa1    0x22    0x00
0xa39a2f0:      0x00    0x00    0x00
(gdb) x/11b end_key->key
0xa39a2f8:      0x09    0x00    0x21    0xe2    0x80    0xa1    0x22    0x20
0xa39a300:      0x20    0x20    0x20
(gdb)

But the UCS-2 case of this bug remains. start_key and end_key are strange.

Regards,

Heikki

mysql> create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=uc
s2 type=innodb;
ERROR 1050 (42S01): Table 'ucs2bug' already exists
mysql> drop table ucs2bug;
Query OK, 0 rows affected (0.01 sec)

mysql> create table ucs2bug(a varchar(4) not null, primary key(a(3))) charset=uc
s2 type=innodb;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> update ucs2bug set a='' where a=0x0120202120202020;

(gdb) bt
#0  row_search_for_mysql (
    buf=0xa39c9d8 "\006\001  !  ", '¥' <repeats 25 times>, mode=2,
    prebuilt=0x43edf468, match_mode=0, direction=0) at row0sel.c:3033
#1  0x08276dad in ha_innobase::index_read(char*, char const*, unsigned, ha_rkey_
function) (this=0xa39c8a8,
    buf=0xa39c9d8 "\006\001  !  ", '¥' <repeats 25 times>,
    key_ptr=0xa39a2e8 "\004", key_len=8, find_flag=HA_READ_KEY_OR_NEXT)
    at ha_innodb.cc:3402
#2  0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co
nst*, bool, bool) (this=0xa39c8a8, start_key=0xa37f938, end_key=0xa37f944,
    eq_range_arg=false, sorted=false) at handler.cc:2249
#3  0x08269a8c in handler::read_multi_range_first(st_key_multi_range**, st_key_m
ulti_range*, unsigned, bool, st_handler_buffer*) (this=0xa39c8a8,
    found_range_p=0x88f7d308, ranges=0xa37f938, range_count=1, sorted=false,
    buffer=0x0) at handler.cc:2121
#4  0x08258090 in QUICK_RANGE_SELECT::get_next() (this=0xa39a1b8)
    at opt_range.cc:6052
#5  0x08261621 in rr_quick (info=0x88f7d49c) at records.cc:157
#6  0x08216d5f in mysql_update(THD*, st_table_list*, List<Item>&, List<Item>&, I
tem*, unsigned, st_order*, unsigned long, enum_duplicates, bool) (
    thd=0xa37d1c0, table_list=0xa3749d0, fields=@0xa37d44c, values=@0xa37d608,
    conds=0xa374d28, order_num=0, order=0x0, limit=4294967295,
    handle_duplicates=DUP_ERROR, ignore=false) at sql_update.cc:332
#7  0x081b7089 in mysql_execute_command(THD*) (thd=0xa37d1c0)
    at sql_parse.cc:3099
#8  0x081bd1cb in mysql_parse(THD*, char*, unsigned) (thd=0xa37d1c0,
    inBuf=0xa374948 "update ucs2bug set a='' where a=0x0120202120202020",
    length=50) at sql_parse.cc:5170
#9  0x081b30f7 in dispatch_command(enum_server_command, THD*, char*, unsigned)
    (command=COM_QUERY, thd=0xa37d1c0,
    packet=0xa3959c1 "update ucs2bug set a='' where a=0x0120202120202020",
    packet_length=51) at sql_parse.cc:1647
#10 0x081b293d in do_command(THD*) (thd=0xa37d1c0) at sql_parse.cc:1453
#11 0x081b1acc in handle_one_connection (arg=0xa37d1c0) at sql_parse.cc:1110
#12 0x40062f60 in pthread_start_thread () from /lib/i686/libpthread.so.0
#13 0x400630fe in pthread_start_thread_event () from /lib/i686/libpthread.so.0
#14 0x401f5327 in clone () from /lib/i686/libc.so.6
(gdb) frame 2
#2  0x08269e40 in handler::read_range_first(st_key_range const*, st_key_range co
nst*, bool, bool) (this=0xa39c8a8, start_key=0xa37f938, end_key=0xa37f944,
    eq_range_arg=false, sorted=false) at handler.cc:2249
2249        result= index_read(table->record[0],
Current language:  auto; currently c++
(gdb) x/11b start_key->key
0xa39a2e8:      0x04    0x00    0x01    0x20    0x20    0x21    0x20    0x20
0xa39a2f0:      0x01    0xa5    0xa5
(gdb) x/11b end_key->key
0xa39a2f8:      0x06    0x00    0x01    0x20    0x20    0x21    0x20    0x20
0xa39a300:      0x37    0xa5    0xa5
(gdb)
[23 Mar 2005 11:25] Sergei Golubchik
There are two problems here.

One is with space padding/trimming in opt_range.cc - but it's only
an annoyance, because it does not affect the result
(MySQL askes for larger range than it's really necessary).
You can see it, because your test case works with MyISAM.
Anyway, I fixed it

The real bug in in ha_innobase::position().

You see, because you update the same column that is used in WHERE, MySQL cannot update it directly - it may be an endless loop for a query like

  UPDATE a=a+1 WHERE a > 0

So MySQL first collects all matching rows (row positions) in a temporary IO_CACHE, and then updates all them.

If you'll step over mysql_update you will see that first read_record(&info) at line 332 succeeds, then MySQL calls table->file->position() and the recorded position is NOT correct. No wonder second read_record() at line 401 fails.

In short, ha_innobase::position works wrong when you have primary key on a part of the field. Here's the test case without ucs2 involved:

mysql> create table t1 (a varchar(20), primary key (a(5))) type=innodb;
mysql> insert t1 values ('1234567890');
mysql> update t1 set a='12' where a='1234567890';
ERROR 1032 (HY000): Can't find record in 't1'

btw, check 4.1 and 4.0 - they could be affected too
[23 Mar 2005 15:49] Heikki Tuuri
Hi!

I tested the t1 test of Sergei with InnoDB-4.1, with both latin1 and utf8, and it worked ok there. Next I will look at the 5.0 failure.

Regards,

Heikki
[23 Mar 2005 16:38] Heikki Tuuri
Hi!

I found the bug. The patch to follow in 3 minutes.

--Heikki
[23 Mar 2005 16:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/23314
[23 Mar 2005 16:48] Heikki Tuuri
The fix pushed to the 5.0.3 tree. Thank you to Marko and Sergei for finding this bug.

Regards,

Heikki