Bug #21474 | There is a rotation before the last table_map | ||
---|---|---|---|
Submitted: | 7 Aug 2006 8:23 | Modified: | 26 Oct 2006 3:11 |
Reporter: | Sveta Smirnova | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S3 (Non-critical) |
Version: | 5.1 | OS: | |
Assigned to: | Mats Kindahl | CPU Architecture: | Any |
[7 Aug 2006 8:23]
Sveta Smirnova
[7 Aug 2006 14:07]
MySQL Verification Team
Bug: http://bugs.mysql.com/bug.php?id=21479 was marked as duplicate of this one.
[6 Oct 2006 20:03]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13282 ChangeSet@1.2345, 2006-10-06 22:03:12+02:00, mats@romeo.(none) +1 -0 BUG#21474 (There is a rotation before the last table map): Removing code to step the group log position and just stepping the event log position. If the group log position were stepped one time too many, it might be that the group starts at a position that is not possible, e.g., at a Rows_log_event, or between an Intvar_log_event and the following associated Query_log_event.
[10 Oct 2006 13:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/13394 ChangeSet@1.2345, 2006-10-10 15:18:21+02:00, mats@romeo.(none) +2 -0 BUG#21474 (There is a rotation before the last table map): Removing code to step the group log position and just stepping the event log position. If the group log position were stepped one time too many, it might be that the group starts at a position that is not possible, e.g., at a Rows_log_event, or between an Intvar_log_event and the following associated Query_log_event.
[11 Oct 2006 9:54]
Lars Thalmann
Pushed into 5.1.12. ----------------- Guilhem described the problem like this (slightly edited by Lars): - Master has a Table_map and a Rows event to send to slave (the Rows event is about the table id defined in the Table_map). This can also apply to a transaction (some "BEGIN" query event, then some "INSERT", then some "COMMIT"). - Master sends the Table_map, slave receives it, writes it to relay log, notices that relay log is too big (option --max-relay-log-size): creates a new relay log; - Master sends the Rows event, slave receives it, writes to 2nd relay log. So we have on slave: - first relay log: Table_map event Rotate to 2nd relay log (this event has server id of the slave) - second relay log: Format_description event (this event has server id of the slave) Rows event Slave SQL thread comes to executing the Table_map, then the Rotate would cause inc_group_relay_log_pos() to be called, thus recording in relay-log.info that the last group starts at the position of the Rotate. If now slave is stopped, it will restart from the Rotate, come to the Rows, and wonder for what table this table id is. In other words, the Rows is logically part of the same group as the Table_map, and the Rotate got in the way. ----------------- The fix is making sure that skipped events due to same server id do not increase the group position.
[11 Oct 2006 9:57]
Lars Thalmann
Note that it is not required that table maps and rows events that belong together are in the same relay log file. What is required is that the group position (the position stored in relay_log.info) is always correct.
[11 Oct 2006 10:10]
Lars Thalmann
See also BUG#23171
[11 Oct 2006 17:40]
Wei Li
I think the fix should apply to all cases that might call flush_relay_log_info(). It is possible that some other events might call flush_relay_log_info() For example, Format_description_log_event::exec_event() also makes the call.
[12 Oct 2006 8:47]
Mats Kindahl
This is entirely correct and the second patch do not step the group log position nor flush the relay log info to disk when executing Format_description_log_event. Although this solution eliminates the immediate potential for failure, the solution is not entirely satisfactory. For that reason, we created BUG#23171, which will implement a complete solution so that skipping and execution will update relay log information in an identical manner.
[26 Oct 2006 3:11]
Paul DuBois
Noted in 5.1.12 changelog.