Bug #8325 | Deadlock in replication thread stops replication | ||
---|---|---|---|
Submitted: | 4 Feb 2005 19:35 | Modified: | 23 Mar 2005 19:20 |
Reporter: | Chris Tucker | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 4.0.22 | OS: | Linux (Linux) |
Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
[4 Feb 2005 19:35]
Chris Tucker
[5 Feb 2005 9:04]
Heikki Tuuri
Chris, I had already decided to change 4.1.10 or 4.1.11 so that the replication thread is never chosen as the deadlock victim, and also, that the lock wait timeout is raised for the replication thread. If the problem comes from: INSERT INTO t SELECT ... FROM s; where s is a replicated table, then an upcoming change to the option innodb_locks_unsafe_for_binlog in 4.1 will also help. We intend to make the read of s a consistent, non-locking read if that option is set. Regards, Heikki
[6 Feb 2005 19:16]
Chris Tucker
Hi Heikki, I'm glad to hear this is a problem that is being addressed. However, I'm not sure always choosing the non-replication thread as the victim is necessarily a good thing to do: in my situation, particularly, I may have a read transaction that has been open for 30 minutes, while the replication transaction is typically under a second. In that case, I would much prefer the replication transaction be chosen as the victim, and then retried automatically (via a SLAVE START) than have to redo the expensive read queries. As my read transactions get longer (perhaps to the order of a couple of hours or so), there's a growing risk of them never completing successfully if they're always chosen as the deadlock victim, rather than using a who-has-done-the-most-work metric of some kind. Regards, Chris
[22 Feb 2005 20:29]
Guilhem Bichot
Heikki, I'm stealing this bug from you as agreed. Chris, I'm preparing a fix which will treat slave thread as a normal thread and make it retry transaction if it gets rolled back by deadlock or lock timeout.
[2 Mar 2005 17:00]
Guilhem Bichot
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: Fixed in 4.1.11 and 5.0.3, by introducing the SLAVE_TRANSACTION_RETRIES option (will be documented in the next days); the slave shall be started with --slave-transaction-retries=x where x>0 and then it will retry a failed transaction for x times before giving up.
[23 Mar 2005 18:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/internals/23323
[23 Mar 2005 19:20]
Guilhem Bichot
Sorry, the previous message "a patch has been committed" was created by an automated robot; all I did today is making small modifications to the patch I had already committed; i.e. nothing new. The feature --slave_transaction_retries appears in 4.1.11 and 5.0.3.