Bug #26835 Repeatable corruption of utf8-enabled tables inside InnoDB
Submitted: 5 Mar 2007 7:52 Modified: 18 Jun 2010 12:51
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.0-bk, 5.1-bk OS:Linux (Linux, MacOSX, ..)
Assigned to: Marko Mäkelä CPU Architecture:Any
Tags: corruption, innodb

[5 Mar 2007 7:52] Domas Mituzas
Description:
Immediately rewritten UTF8 data in InnoDB corrupts tables, possibly due to adaptive hash index problems.

How to repeat:
Run attached testcase.

Suggested fix:
n/a
[5 Mar 2007 7:58] Domas Mituzas
To look for:

mismatching number of records in indexes, error-log entry:
Error: index `phrases$i_id` of table `test`.`phrases` contains 14 entries, should be 15

data in some fields ends up corrupted.
[5 Mar 2007 8:28] Domas Mituzas
Reduced repeatable testcase

Attachment: test2.sql (application/octet-stream, text), 428 bytes.

[5 Mar 2007 8:32] Domas Mituzas
The bug seems to be caused by different (missing?) normalization routines for adaptive hash index.
[5 Mar 2007 13:16] Marko Mäkelä
The function row_upd_changes_field_size_or_external() should note that UTF-8 CHAR columns in ROW_FORMAT=COMPACT are actually variable-length, and test the actual storage size of the new value.
[5 Mar 2007 13:57] Marko Mäkelä
The bug is in rec_offs_nth_size(), which returns invalid data for n==0. The function is only called by row_upd_changes_field_type_or_external(). Thus, the bug only manifests itself when attempting to update the first column of a record (in a secondary index or in the clustered index).

Fix: Add

	if (!n) {
		return(rec_offs_base(offsets)[1 + n] & REC_OFFS_MASK);
	}

before the return statement in rec_offs_nth_size().
[5 Mar 2007 14:19] Marko Mäkelä
The bug should only manifest itself under the following conditions:

(1) the clustered index (primary key or unique key) of the table begins with a char or varchar column.

(2) a record is deleted

(3) before the record is purged, an equivalent record (under the charset-collation of the column) is inserted so that the size changes

In test2.sql, the three-byte char U+FF24 (0xefbca4, D) is equivalent to the one-byte char D.
[5 Mar 2007 19:10] James Day
Marko, just to be certain, please confirm that this can only affect the compact row format, so it can only happen in versions 5.0 and later.
[5 Mar 2007 20:55] Marko Mäkelä
James,

this bug could affect ROW_FORMAT=REDUNDANT too. The error is that rec_offs_nth_size(offsets, 0) would subtract rec_offs_extra_size(offsets) from the result.

In test2.sql, extra_size==6 and the length of the first column (DDD) is 9 bytes. Because 9-6=3, InnoDB would allow an in-place update with the 3-byte value DDD.

I believe that the test case can be modified for ROW_FORMAT=REDUNDANT by adding variable-length columns so that extra_size will be a multiple of 3.

Note that in UPDATE, InnoDB uses a binary equality test to enable update-in-place.  This bug should only affect INSERT by updating a delete-marked record in the clustered index, and only if the first column of the clustered index uses a collation where byte sequences of differing length can be considered equal.
[5 Mar 2007 20:58] Marko Mäkelä
Oh yes, I can confirm that this bug only exists in MySQL 5.0.3 and later.  The function rec_get_offsets() and the functions beginning with rec_offs_ were introduced in MySQL 5.0.3.
[6 Mar 2007 7:25] Marko Mäkelä
Here is a test case for ROW_FORMAT=REDUNDANT:

DROP TABLE IF EXISTS t1;
CREATE TABLE `t1` (
  `word` varchar(50) collate utf8_unicode_ci NOT NULL PRIMARY KEY
) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;

INSERT INTO t1 VALUES (0xEFBCA4EFBCA4EFBCA4EFBCA4EFBCA441);
DELETE FROM t1;
INSERT INTO t1 VALUES  (0x4444444444C384);
SELECT * FROM t1;

That is, we insert and delete the string DDDDDA and then insert the string DDDDDÄ.
[6 Apr 2007 17:21] Bugs System
Pushed into 5.0.40
[6 Apr 2007 17:23] Bugs System
Pushed into 5.1.18-beta
[6 Apr 2007 17:48] Paul DuBois
Noted in 5.0.40, 5.1.18 changelog.

For InnoDB tables having a clustered index that began with a CHAR or
VARCHAR column, deleting a record and then inserting another before
the deleted record was purged could result in table corruption.
[5 May 2010 15:05] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 2:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 5:46] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:16] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:44] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:51] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 11:47] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:24] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:11] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)