MySQL Bugs: #26835: Repeatable corruption of utf8-enabled tables inside InnoDB

Bug #26835	Repeatable corruption of utf8-enabled tables inside InnoDB
Submitted:	5 Mar 2007 7:52	Modified:	18 Jun 2010 12:51
Reporter:	Domas Mituzas
Status:	Closed
Category:	MySQL Server: InnoDB storage engine	Severity:	S1 (Critical)
Version:	5.0-bk, 5.1-bk	OS:	Linux (Linux, MacOSX, ..)
Assigned to:	Marko Mäkelä	CPU Architecture:	Any
Tags:	corruption, innodb

Description:
Immediately rewritten UTF8 data in InnoDB corrupts tables, possibly due to adaptive hash index problems.

How to repeat:
Run attached testcase.

Suggested fix:
n/a

To look for:

A mismatching number of records between indexes, reported in the error log as:
Error: index `phrases$i_id` of table `test`.`phrases` contains 14 entries, should be 15

Data in some fields also ends up corrupted.

Reduced repeatable testcase

Attachment: test2.sql (application/octet-stream, text), 428 bytes.

The bug seems to be caused by different (missing?) normalization routines for adaptive hash index.

The function row_upd_changes_field_size_or_external() should note that UTF-8 CHAR columns in ROW_FORMAT=COMPACT are actually variable-length, and test the actual storage size of the new value.

The bug is in rec_offs_nth_size(), which returns invalid data for n==0. The function is only called by row_upd_changes_field_size_or_external(). Thus, the bug only manifests itself when attempting to update the first column of a record (in a secondary index or in the clustered index).

Fix: Add

	if (!n) {
		return(rec_offs_base(offsets)[1 + n] & REC_OFFS_MASK);
	}

before the return statement in rec_offs_nth_size().
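
To illustrate why the missing guard matters, here is a minimal sketch in C. It is a simplified model, not InnoDB source: the array layout (`base[0]` holding the record's extra size, `base[1 + i]` holding the end offset of field `i`), the `toy_` function names, and the `TOY_OFFS_MASK` value are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask that strips flag bits from an offset (assumed value). */
#define TOY_OFFS_MASK 0x3FFFFFFFUL

/* Toy layout (an assumption, not the real InnoDB array):
   base[0]     = extra (header) size of the record
   base[1 + i] = end offset of field i within the record data */

/* Behaviour before the fix, as described in this report: for n == 0
   the subtraction uses base[0], i.e. the extra size, instead of 0. */
unsigned long toy_nth_size_buggy(const unsigned long *base, size_t n)
{
    return (base[1 + n] - base[n]) & TOY_OFFS_MASK;
}

/* Behaviour with the guard: field 0 starts at offset 0, so its size
   is simply its end offset. */
unsigned long toy_nth_size_fixed(const unsigned long *base, size_t n)
{
    if (!n) {
        return base[1 + n] & TOY_OFFS_MASK;
    }
    /* End offsets are assumed to be non-decreasing in this model. */
    assert(base[1 + n] >= base[n]);
    return (base[1 + n] - base[n]) & TOY_OFFS_MASK;
}
```

With the figures quoted further down in this report for test2.sql (extra size 6, first column 9 bytes), calling `toy_nth_size_buggy(base, 0)` on `base = {6, 9}` gives 3, while `toy_nth_size_fixed(base, 0)` gives 9. For n > 0 the two functions agree.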

The bug should only manifest itself under the following conditions:

(1) The clustered index (primary key or unique key) of the table begins with a CHAR or VARCHAR column.

(2) A record is deleted.

(3) Before the record is purged, an equivalent record (equal under the character set and collation of the column) is inserted such that the stored size of the column changes.

In test2.sql, the three-byte char U+FF24 (0xefbca4, Ｄ) is equivalent to the one-byte char D.
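
The byte counts above can be checked with a small UTF-8 encoder. The following is a sketch only: the function name `utf8_encode_bmp` is made up for this example, and it handles only code points up to U+FFFF, which is enough for U+FF24 and D.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one code point in the range U+0000..U+FFFF as UTF-8.
   Returns the number of bytes written to out (1, 2 or 3).
   Surrogate code points are not rejected; the caller must avoid them. */
size_t utf8_encode_bmp(unsigned long cp, unsigned char out[3])
{
    assert(cp <= 0xFFFFUL);
    if (cp < 0x80UL) {
        /* 0xxxxxxx */
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800UL) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char) (0xE0 | (cp >> 12));
    out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cp & 0x3F));
    return 3;
}
```

Encoding 0xFF24 yields the three bytes EF BC A4, and encoding 'D' (0x44) yields a single byte, so two strings that compare equal under the collation can differ in stored length by a factor of three.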

Marko, just to be certain, please confirm that this can only affect the compact row format, so it can only happen in versions 5.0 and later.

James,

this bug could affect ROW_FORMAT=REDUNDANT too. The error is that rec_offs_nth_size(offsets, 0) would subtract rec_offs_extra_size(offsets) from the result.

In test2.sql, extra_size==6 and the length of the first column (ＤＤＤ) is 9 bytes. Because 9-6=3, InnoDB would allow an in-place update with the 3-byte value DDD.

I believe that the test case can be modified for ROW_FORMAT=REDUNDANT by adding variable-length columns so that extra_size will be a multiple of 3.

Note that in UPDATE, InnoDB uses a binary equality test to enable update-in-place. This bug should therefore only affect an INSERT that is carried out by updating a delete-marked record in the clustered index, and only if the first column of the clustered index uses a collation where byte sequences of differing length can be considered equal.

Oh yes, I can confirm that this bug only exists in MySQL 5.0.3 and later.  The function rec_get_offsets() and the functions beginning with rec_offs_ were introduced in MySQL 5.0.3.

Here is a test case for ROW_FORMAT=REDUNDANT:

DROP TABLE IF EXISTS t1;
CREATE TABLE `t1` (
  `word` varchar(50) collate utf8_unicode_ci NOT NULL PRIMARY KEY
) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;

INSERT INTO t1 VALUES (0xEFBCA4EFBCA4EFBCA4EFBCA4EFBCA441);
DELETE FROM t1;
INSERT INTO t1 VALUES (0x4444444444C384);
SELECT * FROM t1;

That is, we insert and delete the string ＤＤＤＤＤA and then insert the string DDDDDÄ.
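
As a quick check of the sizes involved, the sketch below counts the bytes in each hex literal from the test case. The helper name `hex_literal_bytes` is hypothetical. The first literal is 16 bytes and the second is 7 bytes. Given the earlier explanation that rec_offs_nth_size(offsets, 0) subtracts the extra size, the difference of 9 would be consistent with an extra_size of 9 for this table; that figure is inferred from the arithmetic and is not stated in the report.

```c
#include <assert.h>
#include <string.h>

/* Number of bytes encoded by a hex literal such as "0x4444444444C384":
   two hex digits per byte after the leading "0x". */
size_t hex_literal_bytes(const char *lit)
{
    size_t digits;

    assert(strncmp(lit, "0x", 2) == 0);
    digits = strlen(lit + 2);
    assert(digits % 2 == 0); /* a whole number of bytes */
    return digits / 2;
}
```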

Pushed into 5.0.40

Pushed into 5.1.18-beta

Noted in 5.0.40, 5.1.18 changelog.

For InnoDB tables having a clustered index that began with a CHAR or
VARCHAR column, deleting a record and then inserting another before
the deleted record was purged could result in table corruption.

Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.

Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)

Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.

Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)

Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)