Bug #41461 | Purging causes errors and lock wait timeouts when several agents connected | ||
---|---|---|---|
Submitted: | 15 Dec 2008 8:42 | Modified: | 17 Jul 2009 9:23 |
Reporter: | Simon Mudd (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Enterprise Monitor: Web | Severity: | S3 (Non-critical) |
Version: | 2.0.0.7101 | OS: | Any |
Assigned to: | Andy Bang | CPU Architecture: | Any |
Tags: | windmill |
[15 Dec 2008 8:42]
Simon Mudd
[15 Dec 2008 9:02]
Simon Mudd
Seems this may have been related to restarting the merlin server and enabling purging. I see in the merlin log many error messages relating to lock timeouts and retrying transactions. I'll try and make the merlin log file available for analysis.
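For readers unfamiliar with the symptom described above, here is a minimal sketch of what "retrying transactions" after a lock timeout generally looks like. It is not Merlin's actual retry logic, which the thread does not show. `DatabaseError` and `run_with_retry` are hypothetical names introduced for illustration; the only MySQL fact assumed is that error 1205 is the lock wait timeout error.

```python
import time

# MySQL error 1205: "Lock wait timeout exceeded; try restarting transaction"
ER_LOCK_WAIT_TIMEOUT = 1205


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error code."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(transaction, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call transaction() and retry it only when it fails with a lock wait timeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise unrelated errors, and give up after the last attempt.
            if exc.errno != ER_LOCK_WAIT_TIMEOUT or attempt == max_attempts:
                raise
            # Wait twice as long after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Each retry only helps if the competing lock has been released in the meantime, which is why a long-running purge can produce many such messages in a row.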
[15 Dec 2008 9:25]
Simon Mudd
Merlin is monitoring 134 instances and merlin db occupies 35GB. Purging set to 7 days after having previously been set to never.
[15 Dec 2008 11:04]
Eric Herman
sjmudd will try boosting innodb_buffer_pool_size from 1024M to 4096M.
[15 Dec 2008 11:28]
Eric Herman
sjmudd will try shutting down most agents to try to give purging a chance to act with less incoming agent data.
[15 Dec 2008 12:07]
Eric Herman
sjmudd sees that tmp_table_size and max_heap_table_size are 64M; I advised him to set them to 512M.
[5 Jan 2009 17:01]
Josh Sled
There is lock contention between the purge/delete operation and other inserts from DC collection, even though we know that these operations are not going to conflict (we are deleting "old" data while inserting "new" data, and the lock waits are gap locks, probably on instance_attribute_id in the primary index). Adding "innodb_locks_unsafe_for_binlog=1" to the options will prevent the lock wait timeouts. As well, MySQL 5.1 has other changes which should eliminate the lock wait timeouts independent of that option. We are looking to move to 5.1 in MEM 2.1.
[12 Jan 2009 20:48]
Gary Whizin
For 2.0: installer should set innodb_locks_unsafe_for_binlog (will not permit replicating Monitor repo)
For 2.1:
1) Need BR to enable RBR binlog by default
2) switch to MySQL 5.1
3) clear innodb_locks_unsafe_for_binlog from my.cnf
[12 Jan 2009 20:50]
Andy Bang
I'll handle the 2.0 fix myself. I'll coordinate the 2.1 stuff with you separately.
[12 Jan 2009 23:26]
Andy Bang
For 2.0 I added "innodb_locks_unsafe_for_binlog = 1" to my.cnf/my.ini. Committed revision 760. However, we'll have a different fix in 2.1, so after you test this, please put this bug back to Verified and change the Target Version to 2.1.
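The trade-off raised in this thread (the 2.0 workaround versus being able to replicate the repository) can be checked mechanically. The following is a rough sketch, assuming the repository's my.cnf/my.ini can be read as INI text with a `[mysqld]` section; `repo_binlog_risk` is a hypothetical helper, not part of the installer or of MEM.

```python
import configparser


def repo_binlog_risk(cnf_text):
    """Report whether a my.cnf-style text combines the 2.0 workaround with log-bin."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare flags such as "log-bin"
        strict=False,         # tolerate repeated sections or options
        interpolation=None,   # treat "%" in values literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return {"unsafe_locks": False, "log_bin": False, "replication_risk": False}

    # MySQL accepts "-" and "_" interchangeably in option names.
    opts = {key.replace("-", "_"): value for key, value in parser.items("mysqld")}

    flag = "innodb_locks_unsafe_for_binlog"
    # A bare flag (no value) counts as enabled; otherwise accept 1/on/true.
    unsafe = flag in opts and (
        opts[flag] is None or opts[flag].strip().lower() in ("1", "on", "true")
    )
    log_bin = "log_bin" in opts

    return {
        "unsafe_locks": unsafe,
        "log_bin": log_bin,
        # Per this thread, the workaround does not permit replicating the repo.
        "replication_risk": unsafe and log_bin,
    }
```

For example, `repo_binlog_risk("[mysqld]\ninnodb_locks_unsafe_for_binlog = 1\nlog-bin\n")` flags `replication_risk`, which is the combination to review before relying on the binary log.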
[13 Jan 2009 8:13]
Simon Mudd
-- quote --
For 2.0: installer should set innodb_locks_unsafe_for_binlog (will not permit replicating Monitor repo)
-- quote --
Is this really necessary? If the change is needed due to larger sites where the db is busy then make the change BY HAND. You now know it needs doing. Doing it everywhere means you CAN'T EVER safely replicate the merlin db even on sites where the db is not so busy. It doesn't strike me as being a good idea. Checking and removing the config for 5.1 boxes is probably a good idea.
[13 Jan 2009 10:02]
Mark Leith
Hi Simon, Across all of the installations I have helped people with I have never once heard of anybody replicating the merlin repository (or indeed asking for advice on enabling log-bin for recovery), whilst I have heard of quite a number of issues with lock wait timeouts. Therefore the benefits outweigh the downsides in this case. We would prefer to walk people through disabling innodb_locks_unsafe_for_binlog and enabling log-bin (by far the minority), and discussing the potential impacts, than we would the opposite. 2.1 should make this issue go away anyway, as we are upgrading the repository to 5.1. Cheers, Mark
[13 Jan 2009 12:01]
Simon Mudd
Hi Mark, Fair enough. We'd like to use merlin as an enterprise monitoring system, perhaps later replacing our use of other tools to do this job (for the MySQL servers at least). When you get to depend on merlin then you want to have a backup strategy, and replication would be one way of doing that, at least for the db. Certainly there's no direct support for multi-server merlin instances to provide some sort of cold/hot failover at the moment, so perhaps I'm worrying about this unnecessarily and thinking too far ahead. I still think that this type of tweak really means that you can no longer depend on a database to recover if it crashes while it is busy doing large transactions. Perhaps I'm just unduly overcautious.
[13 Jan 2009 13:13]
Mark Leith
Indeed, it is of course a valid concern for people that entirely depend on the MEM installation, and rest assured that we are not standing still where this is concerned (we are now actively getting the 2.1 installer made with 5.1). I'm also looking at some other solutions for the purging as part of a slightly different, but related, bug: http://bugs.mysql.com/bug.php?id=42061 Cheers, Mark
[16 Jan 2009 20:19]
Bill Weber
verified "innodb_locks_unsafe_for_binlog = 1" was added to my.cnf/my.ini in build 2.0.3.7134 - setting back to "Verified" for 2.1
[26 Feb 2009 20:25]
Keith Russell
Patch installed in versions >= 2.1.0.1011.
[17 Mar 2009 19:12]
Bill Weber
verified the my.cnf/ini is exactly as listed above in build 2.1.0.1015
[17 Jul 2009 9:23]
Tony Bedford
An entry was added to the 2.1.0 changelog: In the Enterprise Dashboard, when a new server group was clicked in the main tab an error message was generated. On checking the Monitor log there were many error messages related to lock timeouts and having to retry transactions. This problem occurred after enabling purging of the Repository.