Bug #45940 | issues around rli->last_event_start_time | ||
---|---|---|---|
Submitted: | 3 Jul 2009 14:58 | Modified: | 27 Jul 2009 12:53 |
Reporter: | Andrei Elkin | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.1+ | OS: | Any |
Assigned to: | Andrei Elkin | CPU Architecture: | Any |
[3 Jul 2009 14:58]
Andrei Elkin
[3 Jul 2009 15:03]
Andrei Elkin
For prioritizing please consider PB2 witnesses p.4 as http://pb2.norway.sun.com/web.py?template=mysql_show_test_failuretest_failure_id=2235535 Notice, it's a hanging not the crash but backtracing indeed.
[6 Jul 2009 13:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/78022 2812 Andrei Elkin 2009-07-06 Bug #45940 issues around rli->last_event_start_time there are several issues in implementation of the feature: 1. the scope of the timer is not the group but the statement It's set in Rows_log_event::do_apply_event() rli->last_event_start_time = time(0) and reset in Relay_log_info::cleanup_context() Besides not any statement but one marked with STMT_END_F. That makes possible the instant stop. However, consequences are well-known: bug#319, bug#38205. 2. the timer does not apply to SBR leaving out of consideration the mixed transaction (i.e involving a non-trans table) 3. the pk-case instant break of the group is not complimented with a warning suggesting to restart in the Idempotent mode 4. the timer is incoherent with fixes for bug#38205 that made the mixed engines group unbreakable but with KILL. Particularly, the timer's giving-out can't get control due to if (rli->abort_slave && rli->is_in_group() && thd->transaction.all.modified_non_trans_table) DBUG_RETURN(0); which can lead to STOP SLAVE hanging. Fixed 1-4 with incorporating the timer's logics with transaction.all.modified_non_trans_table flag and simplifying the timer's routine to be set and reset inside sql_slave_killed(). Couple of failure simulation are added and a former row-based format test extened to apply to any format.
[8 Jul 2009 12:49]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/78203 2813 Andrei Elkin 2009-07-08 Bug #45940 issues around rli->last_event_start_time post-reviewing patch with small improvements.
[23 Jul 2009 10:24]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090723102221-ps4uaphwbxzj8p0q) (version source revid:aelkin@mysql.com-20090708180952-73h6cpmg3r5wtxwa) (merge vers: 5.4.4-alpha) (pib:11)
[27 Jul 2009 12:53]
Jon Stephens
Documented bugfix in the 5.4.4 changelog as follows: When using row-based replication, the incomplete logging of a group of events involving both transaction and non-transactional tables could cause STOP SLAVE to hang.
[11 Mar 2010 18:02]
Paul DuBois
Noted in 5.5.3 changelog.
[26 May 2010 11:35]
Sveta Smirnova
Bug #53985 was marked as duplicate of this one.