Bug #5640 innodb problems with utf8 and prefix-indexes on string column
Submitted: 18 Sep 2004 9:54 Modified: 18 Oct 2004 16:35
Reporter: Bernd Heller Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:4.1.4-gamma OS:MacOS (Mac OS X 10.3.5)
Assigned to: Heikki Tuuri CPU Architecture:Any

[18 Sep 2004 9:54] Bernd Heller
Description:
I'm using the following table:

CREATE TABLE `test` (
  `CityID` bigint(20) unsigned NOT NULL auto_increment,
  `Name` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`CityID`),
  KEY `Name` (`Name`(2))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

When the column contains a lot of names with odd characters (like 0xC481 or similar) and I 
truncate the table, I get errors like this:

InnoDB: error in sec index entry update in
InnoDB: index `Name` of table `partnerminedb/test`
InnoDB: tuple DATA TUPLE: 2 fields;
 0: len 6; hex 4768756cc481; asc Ghul  ;; 1: len 8; hex 000000000000000b; asc         ;;

InnoDB: record PHYSICAL RECORD: n_fields 2; 1-byte offs TRUE; info bits 32
 0: len 6; hex 4768756c6164; asc Ghulad;; 1: len 8; hex 000000000000000a; asc         ;;

TRANSACTION 0 7723, ACTIVE 0 sec, OS thread id 9105408 updating or deleting, thread 
declared inside InnoDB 479
mysql tables in use 1, locked 1
2 lock struct(s), heap size 320, undo log entries 11
MySQL thread id 940, query id 12816 localhost root updating
TRUNCATE  TABLE  `test`

InnoDB: Submit a detailed bug report to http://bugs.mysql.com

The problem does NOT show up if I do one of the following
1) extend the index on the name column to full length
2) Change collation to utf8_bin
3) only insert ascii characters

This bug is related to #5630 that I reported earlier. I opened this one because it's a much more 
precise one and can probably be fixed more easily. My strong guess is that this bug was also the 
reason for the data corruption in bug #5630.

How to repeat:
I was able to create a testcase of about 25 records which I will attach to this bug as file 
"testcase.sql".
To repeat it feed this sql into a database like
mysql < testcase.sql
which will create the table "test" with some records and the problematic index.
Then do "truncate table test" and the error pasted above should appear in the log.

This is not a simple bug like "deleting one particular record will trigger it". So far, I only managed
to trigger it with a bigger number of records and only when bulk deleting them.
[18 Sep 2004 9:55] Bernd Heller
test case to repeat this bug

Attachment: testcase.sql (application/octet-stream, text), 1.86 KiB.

[18 Sep 2004 19:50] Alexander Keremidarski
Can't repeat it under Linux with current 4.1 from bk tree
ChangeSet@1.2015.1.10, 2004-09-16 22:13:24+04:00, konstantin@mysql.com
[20 Sep 2004 11:44] Heikki Tuuri
Hi!

Hmm... this may be the same column prefix index + UTF-8 bug that Jan Lindström has a patch for. The patch will make InnoDB to store in Name(2) ONLY the 2 first UTF-8 characters (which may be up to 3 bytes long). Currently, MySQL instructs InnoDB to store 3 x 2 = 6 first BYTES to the column prefix index. That may be up to 6 UTF-8 characters.

The reason why TRUNCATE digs up the problem is that it actually does internally a DELETE FROM ..., and it searches for all the secondary index records when it delete-marks the rows.

The bug may also be something different. Jan Lindström has to investigate this more. We will try to repeat the bug here.

Thank you for the bug report.

Heikki
[8 Oct 2004 6:08] Jan Lindström
I could not repeat this problem using 4.1.6.
[20 Dec 2005 11:52] Dimitrij HIlt
Hi,

we have same problem with mysql-4.1.13.
|CREATE TABLE `USEROFFER` (
  `Id` bigint(20) unsigned NOT NULL auto_increment,
  `Lastchanged` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_T
IMESTAMP,
  `Name` varchar(255) collate latin1_german2_ci NOT NULL default '',
  `Countrycode` smallint(5) unsigned default '0',
  `IDNname` varchar(255) collate latin1_german2_ci default NULL,
  `Revname` varchar(255) collate latin1_german2_ci NOT NULL default 'Z',
  `Visitors` int(11) default NULL,
  `Mode` enum('0','1','2') collate latin1_german2_ci NOT NULL default '0',
  `Type` enum('D','P','F') collate latin1_german2_ci NOT NULL default 'D',
  `Tosell` enum('Y','N') collate latin1_german2_ci NOT NULL default 'Y',
  `Portfolio` enum('Y','N') collate latin1_german2_ci NOT NULL default 'Y',
  `Comment` text collate latin1_german2_ci,
  `Infourl` varchar(255) collate latin1_german2_ci default NULL,
  `Price` int(11) default NULL,
  `Minprice` int(11) default NULL,
  `Currency` tinyint(3) unsigned default '0',
  `Namelanguage` tinyint(3) unsigned default '0',
  `Inserted` datetime default NULL,
  `Lastchecked` date default NULL,
  `Escrow` smallint(6) unsigned NOT NULL default '0',
  `Origin` enum('S','T','A','E') collate latin1_german2_ci NOT NULL default 'S',
  `Member` mediumint(9) NOT NULL default '0',
  `Interesting` tinyint(3) default NULL,
  `Pageviews` int(11) default NULL,
  `Members` int(11) default NULL,
  `Targetgroup` varchar(255) collate latin1_german2_ci default NULL,
  `Linkpopularity` mediumint(8) NOT NULL default '-1',
  `Preferedcontent` mediumint(8) unsigned NOT NULL default '0',
  `Bidrange` smallint(6) NOT NULL default '100',
  `Parking_allowed` enum('Y','N') collate latin1_german2_ci NOT NULL default 'Y'
,
  `Etracked` enum('Y','N') collate latin1_german2_ci NOT NULL default 'N',
  `Hitlist_allowed` enum('Y','N') collate latin1_german2_ci NOT NULL default 'Y'
,
  `Lock_Tosell` enum('Y','N') collate latin1_german2_ci NOT NULL default 'N',
  PRIMARY KEY  (`Id`),
  KEY `Member_idx` (`Member`),
  KEY `Name_cc_idx` (`Name`,`Countrycode`),
  KEY `Escrow_idx` (`Escrow`),
  KEY `IDNname_idx` (`IDNname`(8)),
  KEY `Type_idx` (`Type`),
  KEY `Inserted_idx` (`Inserted`),
  KEY `Lastchanged_idx` (`Lastchanged`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_german2_ci PACK_KEYS=1

InnoDB: error in sec index entry update in
InnoDB: index `IDNname_idx` of table `dbsedo/USEROFFER`
InnoDB: tuple DATA TUPLE: 2 fields;
 0: len 3; hex 6f6b74; asc okt;; 1: len 8; hex 0000000000410b21; asc      A !;;

InnoDB: record PHYSICAL RECORD: n_fields 2; 1-byte offs TRUE; info bits 0
 0: len 8; hex f66c627574696b65; asc  lbutike;; 1: len 8; hex 000000000052973a; 
asc      R :;;

TRANSACTION 0 37463464, ACTIVE 0 sec, process no 2920, OS thread id 585872 updat
ing or deleting, thread declared inside InnoDB 499
mysql tables in use 1, locked 1
2 lock struct(s), heap size 320, undo log entries 1
MySQL thread id 397968, query id 1633314 
DELETE FROM USEROFFER WHERE Id=4262689

And on lot another tables on this database.

DImi
[12 Apr 2011 10:19] Kanako Nakai
this bug was alrady fixed?
I could not get correct resolt.
I use prefix index on 5.0.83(innodb) and 5.1.53(innodb-plugin).
both resolt was NG.
please check following test case.

create table test (
 val varchar(255),
 KEY idx_1(val(5))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into test values ("あ"),("あい"),("あいう"),("あいうえ"),("あいうえお");

mysql> select * from test;
+-----------------+
| val             |
+-----------------+
| あ              |
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
5 rows in set (0.00 sec)

mysql> select * from test where val like "あ%";
+-----------------+
| val             |
+-----------------+
| あ              |
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
5 rows in set (0.00 sec)

mysql> select * from test where val like "あい%";
+-----------------+
| val             |
+-----------------+
| あい            |
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
4 rows in set (0.00 sec)

mysql> select * from test where val like "あいう%";
+-----------------+
| val             |
+-----------------+
| あいう          |
| あいうえ        |
| あいうえお      |
+-----------------+
3 rows in set (0.00 sec)

mysql> select * from test where val like "あいうえ%";
+--------------+
| val          |
+--------------+
| あいうえ     |
+--------------+
1 row in set (0.00 sec)

#This resolt rows should return 2, but I got only 1 row. 

mysql> select * from test where val like "あいうえお%";
+-----------------+
| val             |
+-----------------+
| あいうえお      |
+-----------------+
1 row in set (0.00 sec)