MySQL Bugs: #42951: Slave connections hang on "freeing items"

Bug #42951	Slave connections hang on "freeing items"
Submitted:	18 Feb 2009 1:50	Modified:	1 Apr 2009 3:23
Reporter:	Ian White
Status:	Duplicate
Category:	MySQL Server: Replication	Severity:	S1 (Critical)
Version:	5.1.31	OS:	Linux (Red Hat 5.2 (x86_64))
Assigned to:		CPU Architecture:	Any
Tags:	active-passive master-master, freeing items, freeze lock, hanging

Description:
I have a setup with two databases in 'Active-Passive Master-Master' (both replicate to each other, but only one acts as the master at any given time). When the problem occurs, the passive DB is used only as a read slave. Both servers are dedicated MySQL boxes using statement-based replication (SBR), and all tables are MyISAM.

After upgrading from MySQL 5.0 to MySQL 5.1, the slave database started using up all the connections until even logging in as root was getting a "too many connections" error. Looking at the processlist, there's always a common theme:

1) The longest running thread is stuck on 'freeing items'.
2) The 'slave' thread (system user) is 'Locked' on an update.
3) All other threads are stuck on 'freeing items', 'Locked', and occasionally 'Sending data'.
4) The database is under a fair amount of load, but under max connections until things go haywire.
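
A minimal sketch, in Python, of how the pattern above could be checked quickly from a processlist snapshot. It assumes the rows have already been fetched (for example from SHOW FULL PROCESSLIST) and reduced to (id, user, state, time) tuples. The function names and the sample rows are hypothetical and made up for illustration; they are not output from the affected server.

```python
from collections import Counter


def summarize_states(rows):
    """Count threads per State.

    rows: iterable of (id, user, state, time_seconds) tuples, e.g. reduced
    from SHOW FULL PROCESSLIST output (assumed shape, not a MySQL API).
    """
    return Counter(state or "(none)" for _id, _user, state, _time in rows)


def longest_running(rows):
    """Return the row with the largest Time value."""
    return max(rows, key=lambda row: row[3])


# Hypothetical snapshot, invented for illustration only.
SAMPLE_ROWS = [
    (11, "app", "freeing items", 340),
    (12, "system user", "Locked", 335),
    (13, "app", "freeing items", 120),
    (14, "app", "Locked", 90),
    (15, "app", "Sending data", 4),
]

if __name__ == "__main__":
    for state, count in summarize_states(SAMPLE_ROWS).most_common():
        print(f"{count:3d}  {state}")
    print("longest running:", longest_running(SAMPLE_ROWS))
```

If the longest-running row is in 'freeing items' and the 'system user' row is 'Locked', the snapshot matches points 1) and 2) above.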

Similar (possibly related) bugs:
http://bugs.mysql.com/bug.php?id=40842
http://bugs.mysql.com/bug.php?id=42749

How to repeat:
Not sure...

The lockup never happens unless there is load on the read slave. I can't yet isolate the problem, but it seems like the ingredients are:

1) Set up SBR in active-passive master-master.
2) Create a 'user' table with auto-increment id.
3) Create a 'thing' table with a 'userId' that references the 'user' table.
4) Hammer the slave DB with queries for: select 'thing', left join user on user.id = thing.userId.
5) Update some 'user' records on the master.

Thank you for the report.

Does the slave hang executing the same queries every time? If so, please provide these queries and the output of SHOW CREATE TABLE and SHOW TABLE STATUS for all involved tables. Do you experience this behavior for any binary log format, i.e. did you try MIXED or ROW?

Another similar bug: http://bugs.mysql.com/bug.php?id=42951
I also cannot kill threads while the DB is locked up, and unless I'm already watching the DB I can't connect (even as root) because of 'too many connections' error.

Sorry, that last bug reference should have been: http://bugs.mysql.com/bug.php?id=41901

Also, after restarting the server I get client errors on the tables that locked up:

+----------------+-------+----------+----------------------------------------------------------+
| Table          | Op    | Msg_type | Msg_text                                                 |
+----------------+-------+----------+----------------------------------------------------------+
| tweem.favorite | check | warning  | 7 clients are using or haven't closed the table properly |
| tweem.favorite | check | status   | OK                                                       |
+----------------+-------+----------+----------------------------------------------------------+

Ian, it's expected that you will sometimes see MyISAM index corruption after a server crash, so that's no surprise. Just repair the affected tables.

Please turn off the query cache with SET GLOBAL query_cache_size = 0; and tell us whether it solves your problem over in bug #41901. We're currently working to produce a test case for a bug with occasional hangs when the query cache is turned on.

If you'd like to check whether you're possibly affected, use:

gdb -p <pid of mysqld>
set pagination off
thread apply all bt

and look for lots of threads waiting on Query_cache::wait_while_table_flush_is_in_progress
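
For a large dump, counting by eye gets tedious. Below is a minimal Python sketch that counts how many thread backtraces contain that function. It assumes the gdb output was saved as text and that each thread's backtrace starts with a header line beginning with "Thread <n> ", as `thread apply all bt` normally prints. The helper name and the sample dump are hypothetical, invented for illustration.

```python
import re

NEEDLE = "Query_cache::wait_while_table_flush_is_in_progress"


def count_waiting_threads(bt_text, needle=NEEDLE):
    """Return (threads_waiting, threads_total) for a saved gdb dump.

    bt_text: text output of 'thread apply all bt'.
    """
    # Split on the per-thread header lines; drop whatever precedes the first one.
    blocks = re.split(r"(?m)^Thread \d+ ", bt_text)[1:]
    waiting = sum(1 for block in blocks if needle in block)
    return waiting, len(blocks)


# Hypothetical, heavily shortened dump used only to exercise the function.
SAMPLE = """\
Thread 3 (Thread 0x1 (LWP 101)):
#0  pthread_cond_wait () from /lib64/libpthread.so.0
#1  Query_cache::wait_while_table_flush_is_in_progress ()
Thread 2 (Thread 0x2 (LWP 102)):
#0  pthread_cond_wait () from /lib64/libpthread.so.0
#1  Query_cache::wait_while_table_flush_is_in_progress ()
Thread 1 (Thread 0x3 (LWP 103)):
#0  select () from /lib64/libc.so.6
"""

if __name__ == "__main__":
    waiting, total = count_waiting_threads(SAMPLE)
    print(f"{waiting} of {total} threads waiting on the query cache flush")
```

One way to capture the dump non-interactively is to run gdb with -batch and -ex "thread apply all bt" and redirect the output to a file, then pass the file contents to the function.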

If you can't confirm that yourself and if you have tried turning off the query cache and that hasn't helped, you might want to upload the gdb output so one of our bugs team members can take a look.

Ian, how is it going so far?

So far the slave is OK. I set the query cache to 0, and gdb isn't showing lots of "Query_cache::wait_while_table_flush_is_in_progress". I'm going to turn the query cache back on tonight and see if the problem starts up again.

Something interesting is that our active master, which is 2x quad-core, has no problems, while the slave is 2x dual-core. Not sure why this would make a difference, but otherwise the hardware/software is the same.

Query cache is still set to 64M on the active master.

Still haven't seen the problem show up since setting query cache to 0.

Because setting the query cache to 0 seems to solve your problem, I am pretty sure this is a duplicate of bug #43758.

Will set this as duplicate.

Setting query cache to 0 stops the problem with "freeing items" but there is still a problem where threads will lock up with status "sending data".

This often seems to occur on generally slow queries (1 - 8 seconds) when another thread is inserting into, updating, or deleting from one of the tables being queried.

Ian, try turning off the adaptive hash index. InnoDB sometimes holds a lock on that when the server is off doing other things, and turning it off is the quickest way to rule that possibility in or out. We've been gradually eliminating the ways that can happen, and it's possible that you're hitting one we haven't taken care of yet. Selects update the adaptive hash index and are what causes the lock to be taken; no data changes are required.

Good to read that the original problem seems to be the previously identified one. There's no rule against you having two different problems, though.