| Bug #39648 | Replication failure on RBR + Innodb + 2 bit fields + LIMIT + no PK | ||
|---|---|---|---|
| Submitted: | 25 Sep 2008 12:34 | Modified: | 12 Mar 2009 23:17 |
| Reporter: | Philip Stoev | ||
| Status: | Closed | ||
| Category: | Server: InnoDB | Severity: | S2 (Serious) |
| Version: | 5.1, 6.0 bzr | OS: | Any |
| Assigned to: | Marko Mäkelä | Target Version: | 5.1+ |
| Triage: | Triaged: D2 (Serious) / R2 (Low) / E3 (Medium) | ||
[25 Sep 2008 12:34]
Philip Stoev
[25 Sep 2008 18:41]
Sveta Smirnova
Thank you for the report. I can not repeat error with 5.1 and with 6.0 I get: Last_Error Could not execute Delete_rows event on table test.t1; Unknown table engine 'innodb', Error_code: 1286; Using storage engine MyISAM for table 't1', Error_code: 1266; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 540 Is it same as in your case? Could you please confirm or reject if problem is repeatable with last 5.1. sources.
[25 Sep 2008 18:59]
Philip Stoev
From this message "Unknown table engine 'innodb', Error_code: 1286; Using storage engine MyISAM for table 't1'," it appears that your servers do not start with innodb enabled. So I can not tell if it is the same bug or not. Please make sure you have innodb enabled and try again.
[25 Sep 2008 20:26]
Sveta Smirnova
Thank you for the feedback. Forgot to start slave with option --innodb. Verified as described with error: Last_SQL_Errno 0 Last_SQL_Error Could not execute Delete_rows event on table test.t1; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 540
[23 Oct 2008 11:39]
Lars Thalmann
On Thu, Oct 23, 2008 at 11:32:24AM +0200, Mats Kindahl wrote: > Here is a detailed description of the problem and the solution. > > The cause of BUG#39648 is the way that InnoDB handles BIT fields that are NULL. > In this particular case, we have the following scenario: > > - We have a table t1(a BIT, b BIT) holding the tuple (0, NULL). > > - We get a delete row from the master representing the tuple (0, NULL) > > The default record for the table is then [FF 00 00], i.e., the NULL bits are set > for the record and the contents of the fields are 0. > > Delete proceeds by locating the row to delete using one of four different > techniques, three of these require primary key or index, and the last fall-back > is using range search. In this case, a range search will be used. > > To perform a range search, we call rnd_init() to start the range search over the > entire table (the same applies when doing an index search using index_next() and > friends), and then use rnd_next() to fetch the records one by one. > > Row-based replication does the following steps for each row: > > 1. Copy the default record into record[0] (which is used for searching). > > a) This is done so that engines do not have to update fields that are > not in the read set or which are NULL and the memcmp() described below > will still result in a match. > > 2. Fill in record[0] with the fields received from the master (which is not > necessarily a full row). In this particular case, it will be [FD 00 *00*], > where the marked byte is the value from the default record while the other 0 > is the "0" in the tuple. Note that FD == 11111101, i.e., the NULL bit for the > first column is clear while the NULL bit for the second column is set (the > least significant bit is the X bit). > > 3. Copy the resulting record to record[1] > > 4. Call rnd_next() to get the next record into record[0]. Here it is expected > that the storage engine do not touch any fields in the record unless the read > bit is set and the field is non-NULL. > > a) InnoDB pads all the NULL BIT fields with 0x20 (space) *and* sets the NULL > bit, with the resulting record [FD 00 20] > > 5. The replication code does a memory compare of the record to decide if there > is a match. > > a) Since InnoDB wrote 0x20 into the NULL field, they don't match. > > The alternative is to do a field-by-field comparison, which will catch the fact > that the field is NULL and ignore the fact that the value was updated by InnoDB. > However, that will be horribly slow compared to a simple memcmp(), so in that > case, I will just add that workaround for the InnoDB engine.
[24 Oct 2008 9:18]
Mats Kindahl
Here is a detailed description of the problem and the solution. The cause of BUG#39648 is the way that InnoDB handles BIT fields that are NULL. In this particular case, we have the following scenario: - Table t1 holds the tuple (0, NULL). - A delete row arrives from the master representing the tuple (0, NULL), i.e., the tuple in the table shall be deleted. The default record for the table is then [FF 00 00], i.e., the NULL bits are set for the record and the contents of the fields are 0. (The first byte holds the NULL bits, the filler bits, and the X bit.) Note that since the NULL bits are set, the values in the record is not looked at when computing the value of the field for presenting to, e.g., a client. Delete proceeds by locating the row to delete using one of four different techniques, three of these require primary key or index, and the last fall-back is using range search. In this case, a range search will be used. To perform a range search, we call rnd_init() to start the range search over the entire table and then use rnd_next() to fetch the records one by one. The same applies when doing an index search using index_next() and friends, but we do not consider this case here. Row-based replication does the following steps for each row: 1. Copy the default record into record[0] (which is used for searching). a) This is done so that engines do not have to update fields that are not in the read set or which are NULL and the memcmp() described below will still result in a match. b) It also allows engines that don't know the contents of the row to "fake" finding the row and get the correct contents out. This is used by the Blackhole engine. 2. Fill in record[0] with the fields received from the master (which is not necessarily a full row). In this particular case, it will be [FD 00 *00*], where the marked byte is the value from the default record while the other 00 is the "0" in the tuple to delete. Note that FD == 11111101, i.e., the NULL bit for the first column is clear while the NULL bit for the second column is set (the least significant bit is the X bit). 3. Copy the resulting record to record[1] 4. Call rnd_next() to get the next record *in the table* into record[0]. Here it is expected that the storage engine do not touch any fields in the record unless the read bit is set and the field is non-NULL. In the event of a NULL field with the read bit set, the engine shall *only* touch the NULL bit (to set it). a) Unfortunately, since InnoDB treats BIT as equivalent to CHAR(1), the NULL BIT field is padded with 0x20 (space) *and* the NULL bit is set, with the resulting record [FD 00 20]. 5. The replication code does a memory compare of the record to decide if there is a match. a) Since InnoDB wrote 0x20 into the NULL field, they don't match, even though it is the correct record.
[10 Dec 2008 13:19]
Marko Mäkelä
Does the row-based replication always call handler::extra(HA_EXTRA_KEYREAD_PRESERVE_FIELDS)? If it does, I may have a patch, which must pass our internal review first. If not, then I must point out that CHECKSUM TABLE is apparently still committing the "old sin" of not initializing records[0] between retrieving rows. I found this out by making row_sel_store_mysql_rec() and row_sel_pop_cached_row_for_mysql() avoid writing any data for NULL columns. innodb.test would report mismatch for CHECKSUM TABLE.
[10 Dec 2008 14:54]
Mats Kindahl
I don't think that flags is currently set, but that can be arranged. It seems no other engine is using it anyway, so it should not cause any problems.
[12 Dec 2008 15:16]
Marko Mäkelä
Given that there are code paths that do not initialize all fields of records[0] before fetching a row from the storage engine (CHECKSUM TABLE being an example), I think that it would be safer to pad BIT columns with 0x00 instead of 0x20. I will try to come up with such a patch next week.
[17 Dec 2008 9:07]
Mats Kindahl
Marko,
Please consider using the value from the default record instead. In the event that the
default value for the field is something that is not 0, there will be a mismatch again
for tables like:
create table t1(a bit default b'1');
[4 Mar 2009 8:01]
Zhenxing He
BUG#40638 was marked as Duplicate of this bug.
[12 Mar 2009 23:17]
Paul DuBois
Noted in 5.1.32, 6.0.11 changelogs. With row-based binary logging, replication of InnoDB tables containing NULL-valued BIT columns could fail.
[3 Jun 2009 10:58]
Marko Mäkelä
Bug #40565 (different symptoms) shares the same cause.
[7 Jul 2009 9:52]
Bugs System
Pushed into 5.0.84 (revid:joro@sun.com-20090707074938-ksah1ibn0vs92cem) (version source revid:satya.bn@sun.com-20090625095026-w8pdxn8ls04g7tcp) (merge vers: 5.0.84) (pib:11)
[8 Jul 2009 15:30]
Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090708131116-kyz8iotbum8w9yic) (version source revid:staale.smedseng@sun.com-20090626084621-g6zteyvzehto95j0) (merge vers: 5.1.37) (pib:11)
[9 Jul 2009 9:35]
Bugs System
Pushed into 5.0.84 (revid:joro@sun.com-20090707074938-ksah1ibn0vs92cem) (version source revid:satya.bn@sun.com-20090625095026-w8pdxn8ls04g7tcp) (merge vers: 5.0.84) (pib:11)
[9 Jul 2009 9:36]
Bugs System
Pushed into 5.1.37 (revid:joro@sun.com-20090708131116-kyz8iotbum8w9yic) (version source revid:staale.smedseng@sun.com-20090626084621-g6zteyvzehto95j0) (merge vers: 5.1.37) (pib:11)
[10 Jul 2009 13:21]
Bugs System
Pushed into 5.4.4-alpha (revid:anozdrin@bk-internal.mysql.com-20090710111017-bnh2cau84ug1hvei) (version source revid:sergey.glukhov@sun.com-20090625092510-lq3mm6ytxb2ldfim) (merge vers: 5.4.4-alpha) (pib:11)
[26 Aug 2009 15:46]
Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 15:46]
Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 15:48]
Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 18:33]
Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
