Bug #42951 | Slave connections hang on "freeing items" | ||
---|---|---|---|
Submitted: | 18 Feb 2009 1:50 | Modified: | 1 Apr 2009 3:23 |
Reporter: | Ian White | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.1.31 | OS: | Linux (Red Hat 5.2 (x86_64)) |
Assigned to: | | CPU Architecture: | Any |
Tags: | active-passive master-master, freeing items, freeze lock, hanging |
[18 Feb 2009 1:50]
Ian White
[18 Feb 2009 8:36]
Ian White
The lockup never happens unless there is load on the read slave. I can't yet isolate the problem, but it seems like the ingredients are:
1) Set up SBR in active-passive master-master.
2) Create a 'user' table with auto-increment id.
3) Create a 'thing' table with a 'userId' that references the 'user' table.
4) Hammer the slave DB with queries for: select 'thing', left join user on user.id = thing.userId.
5) Update some 'user' records on the master.
[18 Feb 2009 9:56]
Sveta Smirnova
Thank you for the report. Does the slave hang while executing the same queries every time? If so, please provide these queries and the output of SHOW CREATE TABLE and SHOW TABLE STATUS for all involved tables. Do you experience this behavior with any binary log format, i.e. did you try MIXED or ROW?
[24 Feb 2009 1:31]
Ian White
Another similar bug: http://bugs.mysql.com/bug.php?id=42951 I also cannot kill threads while the DB is locked up, and unless I'm already watching the DB I can't connect (even as root) because of 'too many connections' error.
[26 Feb 2009 2:27]
Ian White
Sorry, that last bug reference should have been: http://bugs.mysql.com/bug.php?id=41901
[12 Mar 2009 1:37]
Ian White
Also, after restarting the server I get client errors on the tables that locked up:
+----------------+-------+----------+----------------------------------------------------------+
| Table          | Op    | Msg_type | Msg_text                                                 |
+----------------+-------+----------+----------------------------------------------------------+
| tweem.favorite | check | warning  | 7 clients are using or haven't closed the table properly |
| tweem.favorite | check | status   | OK                                                       |
+----------------+-------+----------+----------------------------------------------------------+
[19 Mar 2009 19:30]
James Day
Ian, it's expected that you will see MyISAM index corruption after a server crash sometimes, so that's no surprise. Just repair them.

Please turn off the query cache with SET GLOBAL query_cache_size = 0; and tell us whether it solves your problem over in bug #41901. We're currently working to produce a test case for a bug with occasional hangs when the query cache is turned on.

If you'd like to check whether you're possibly affected use:

gdb -p <pid of mysqld>
set pagination off
thread apply all bt

and look for lots of threads waiting on Query_cache::wait_while_table_flush_is_in_progress

If you can't confirm that yourself and if you have tried turning off the query cache and that hasn't helped, you might want to upload the gdb output so one of our bugs team members can take a look.
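For anyone who wants to scan a saved backtrace rather than read it by eye, the check can be roughly automated. The following is only a sketch, not an official tool: it assumes the output of "thread apply all bt" was saved to a text file and that each thread's section starts with a line of the form "Thread N (", as gdb usually prints. The function name count_waiting_threads and the file name in the usage note are made up for illustration.

```python
import re

# Symbol named in the comment above; threads stuck here point at the query cache.
WAIT_SYMBOL = "Query_cache::wait_while_table_flush_is_in_progress"

# Assumption: gdb starts each thread's backtrace with a header such as
# "Thread 12 (Thread 0x41e0f940 (LWP 2001)):".
THREAD_HEADER = re.compile(r"^Thread\s+(\d+)\s+\(")


def count_waiting_threads(gdb_text, symbol=WAIT_SYMBOL):
    """Return (total_threads, sorted ids of threads whose stack mentions symbol)."""
    total = 0
    waiting = set()
    current = None  # gdb thread number of the section being read
    for line in gdb_text.splitlines():
        match = THREAD_HEADER.match(line.strip())
        if match:
            current = int(match.group(1))
            total += 1
        elif current is not None and symbol in line:
            waiting.add(current)
    return total, sorted(waiting)
```

Usage, assuming the backtrace was saved as gdb_output.txt (a hypothetical file name): total, waiting = count_waiting_threads(open("gdb_output.txt").read()). If most of the threads show up in the waiting list, that matches the "lots of threads waiting" pattern described above.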
[20 Mar 2009 23:20]
James Day
Ian, how is it going so far?
[21 Mar 2009 0:22]
Ian White
So far the slave is OK. I set the query cache to 0, and gdb isn't showing lots of "Query_cache::wait_while_table_flush_is_in_progress". I'm going to turn the query cache on tonight and have a look to see if it starts up again. Something interesting is that our active master, which is 2x quad-core, has no problems, while the slave is 2x dual-core. Not sure why this would make a difference, but otherwise the hardware/software is the same. Query cache is still set to 64M on the active master.
[22 Mar 2009 18:59]
Ian White
Still haven't seen the problem show up since setting query cache to 0.
[23 Mar 2009 15:08]
Susanne Ebrecht
Because setting the query cache to 0 seems to solve your problem, I am pretty sure this is a duplicate of bug #43758. Will set this as duplicate.
[1 Apr 2009 3:23]
Ian White
Setting query cache to 0 stops the problem with "freeing items" but there is still a problem where threads will lock up with status "sending data". This seems to often occur on generally slow queries (1 - 8 seconds) and when another thread is inserting/updating/deleting to one of the tables being queried.
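To put numbers on how many threads are sitting in each state, the SHOW PROCESSLIST output can be tallied. This is a rough sketch under assumptions: the rows are taken to be dictionaries keyed by the usual SHOW PROCESSLIST column names (Id, Time, State), as a dictionary-style cursor might return them, and the function name summarize_processlist and the one-second threshold are made up for illustration.

```python
from collections import Counter

# The two states discussed in this report, compared case-insensitively.
SUSPECT_STATES = ("freeing items", "sending data")


def summarize_processlist(rows, min_seconds=1):
    """Count threads per state and list ids that look stuck.

    rows: iterable of dicts with "Id", "Time" and "State" keys (assumed shape).
    min_seconds: hypothetical threshold; a thread in a suspect state for at
    least this long is reported.
    """
    counts = Counter()
    stuck_ids = []
    for row in rows:
        state = (row.get("State") or "").lower()  # State can be NULL
        counts[state] += 1
        if state in SUSPECT_STATES and (row.get("Time") or 0) >= min_seconds:
            stuck_ids.append(row["Id"])
    return counts, stuck_ids
```

Running this periodically and comparing the stuck ids between samples would show whether the same threads stay in "sending data" while a write to the same table is in progress.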
[1 Apr 2009 13:26]
James Day
Ian, try turning off the adaptive hash index. InnoDB sometimes holds a lock on that when the server is off doing other things, and turning it off is the quickest way to rule that possibility in or out. We've been gradually eliminating the ways that can happen and it's possible that you're hitting one we haven't taken care of yet. Selects update the adaptive hash index and are what causes the lock to be taken; no data changes are required. Good to read that the original problem seems to be the previously identified one. There's no rule against you having two different problems, though.