Bug #43758 Query cache can lock up threads in 'freeing items' state
Submitted: 19 Mar 2009 19:21
Modified: 6 Oct 2009 2:15
Reporter: Harrison Fisk
Status: Closed
Category: MySQL Server: Query Cache
Severity: S2 (Serious)
Version: 5.1.32, 5.0
OS: Any
Assigned to: Kristofer Pettersson
CPU Architecture: Any
Tags: hang, query cache

[19 Mar 2009 19:21] Harrison Fisk
Description:
The query cache can cause a lockup on a thread in some cases.  The thread will be stuck in the 'freeing items' state.

The threads are always in Query_cache::wait_while_table_flush_is_in_progress on the line pthread_cond_wait(&COND_cache_status_changed, &structure_guard_mutex).  The m_

This seems to be directly related to the changes made for Bug #21074.

How to repeat:
I have managed to repeat it by having many threads do the following:

1500 SELECTS using a tinyint range of values (so 255 possible values)
1 insert
1 delete

I will attach the gypsy script I used to reproduce this.  It often needs to run for a very long time to trigger the hang.

Eventually, some threads start getting stuck in the 'freeing items' state.
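
The workload described above can be approximated as follows. This is only a sketch: the attached gypsy script is not reproduced in this report, so the function name one_iteration, the num_values parameter and the INSERT and DELETE statements are assumptions. Only the table name qc_test, the column num and the shape of the SELECT come from the processlist output posted in this report.

```python
import random


def one_iteration(rng, selects=1500, num_values=255):
    """Build the statement mix one client thread sends per loop.

    Hypothetical helper: it only generates SQL text, it does not talk
    to a server.
    """
    # Many cacheable reads over a small key space, so most of them hit
    # (or refill) the query cache.
    stmts = [
        f"SELECT * FROM qc_test WHERE num = {rng.randrange(num_values)} LIMIT 5"
        for _ in range(selects)
    ]
    # One insert and one delete, each of which invalidates the cached
    # results for qc_test. The exact statements are assumptions.
    victim = rng.randrange(num_values)
    stmts.append(f"INSERT INTO qc_test (num) VALUES ({victim})")
    stmts.append(f"DELETE FROM qc_test WHERE num = {victim} LIMIT 1")
    return stmts
```

Each client thread would run such an iteration in a loop against the server; the two writes are what force the cache invalidation that the reads then race with.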

Suggested fix:
I would guess there is something wrong with the broadcast of the signal or similar.

As a workaround, you can disable the query cache.
[19 Mar 2009 19:25] Harrison Fisk
The processlist:

mysql> show processlist;
+----+------+-----------+------+---------+------+---------------+----------------------------------------------+
| Id | User | Host      | db   | Command | Time | State         | Info                                         |
+----+------+-----------+------+---------+------+---------------+----------------------------------------------+
| 37 | root | localhost | test | Query   | 3139 | freeing items | SELECT * FROM qc_test WHERE num = 49 LIMIT 5 | 
| 48 | root | localhost | NULL | Query   |    0 | NULL          | show processlist                             | 
+----+------+-----------+------+---------+------+---------------+----------------------------------------------+
2 rows in set (0.00 sec)

The gdb backtrace and info:

(gdb) bt
#0  0xb7fde410 in __kernel_vsyscall ()
#1  0xb7fb8aa5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/i686/cmov/libpthread.so.0
#2  0x0832d4a4 in query_cache_end_of_result (thd=0x96fb038) at sql_cache.cc:1638
#3  0x08216324 in dispatch_command (command=COM_QUERY, thd=0x96fb038, packet=0x970cf19 "", packet_length=44) at sql_parse.cc:1587
#4  0x08217be0 in do_command (thd=0x96fb038) at sql_parse.cc:857
#5  0x08208833 in handle_one_connection (arg=0x96abb00) at sql_connect.cc:1115
#6  0xb7fb44fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#7  0xb7ec2e5e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) frame 2
#2  0x0832d4a4 in query_cache_end_of_result (thd=0x96fb038) at sql_cache.cc:1638
1638	      pthread_cond_wait(&COND_cache_status_changed, &structure_guard_mutex);
(gdb) print query_cache.m_cache_status
$1 = Query_cache::NO_FLUSH_IN_PROGRESS
[19 Mar 2009 19:31] Harrison Fisk
gypsy test case

Attachment: qc_test.sql (text/x-sql), 28 bytes.

[19 Mar 2009 19:34] Harrison Fisk
table definition and population

Attachment: qc_setup.query (application/octet-stream, text), 237 bytes.

[20 Mar 2009 1:21] Harrison Fisk
This seems to have something to do with connections being closed/terminated.  I can only seem to repeat it as my test case ends and only occasionally.
[20 Mar 2009 21:06] James Day
The bug that prompted the investigation leading to this bug is bug #41901. Possibly related open bugs are bug #42951 (connections accumulate, hang on freeing items, same symptoms, only MyISAM in use) and bug #40482.
[23 Mar 2009 15:08] Susanne Ebrecht
Set bug #41951 as duplicate of this bug here.
[23 Mar 2009 15:09] Susanne Ebrecht
Sorry, typo. Set bug #42951 as duplicate of this bug here.
[24 Mar 2009 15:59] Kristofer Pettersson
Preliminary progress report: More than one thread can wait on the condition in query_cache_insert(), but only one thread is signalled each time the cache status changes between the following states: {NO_FLUSH_IN_PROGRESS, FLUSH_IN_PROGRESS, TABLE_FLUSH_IN_PROGRESS}

Consider three threads THD1, THD2, THD3

THD2: select ... => Got a writer in ::store_query
THD3: select ... => Got a writer in ::store_query
THD1: flush tables => qc status= FLUSH_IN_PROGRESS; new writers are blocked.
THD2: select ... => Still got a writer and enters cond in query_cache_insert
THD3: select ... => Still got a writer and enters cond in query_cache_insert
THD1: flush tables => finished and signal status change.
THD2: select ... => Wakes up and completes the insert.
THD3: select ... => Happily waiting for better times. Why hurry?
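
The scenario above is a lost wakeup, and it can be reproduced outside the server. The following is a minimal sketch, not MySQL code: Python's threading.Condition stands in for the pthread condition variable, notify() plays the role of pthread_cond_signal() and notify_all() the role of pthread_cond_broadcast(). The function count_woken and its parameters are illustrative names.

```python
import threading
import time


def count_woken(num_waiters, broadcast, settle=0.3):
    """Return how many parked writers resume after one status change."""
    cond = threading.Condition()
    state = {"flushing": True, "waiting": 0, "woken": 0}

    def writer():
        with cond:
            state["waiting"] += 1
            while state["flushing"]:
                cond.wait()          # parks like THD2 and THD3
            state["woken"] += 1

    threads = [threading.Thread(target=writer, daemon=True)
               for _ in range(num_waiters)]
    for t in threads:
        t.start()

    # The "flush" thread: wait until every writer is parked, then end
    # the flush and announce the status change exactly once.
    while True:
        with cond:
            if state["waiting"] == num_waiters:
                state["flushing"] = False
                if broadcast:
                    cond.notify_all()
                else:
                    cond.notify()
                break
        time.sleep(0.001)

    # Give the woken writers time to finish, then take the reading.
    for t in threads:
        t.join(timeout=settle)
    with cond:
        woken = state["woken"]
        cond.notify_all()            # release any writer still parked
    for t in threads:
        t.join()
    return woken
```

With CPython's current implementation, count_woken(3, broadcast=False) is expected to report a single woken writer even though the predicate is false for all three, which matches the gdb output showing NO_FLUSH_IN_PROGRESS on a thread that is still waiting. With broadcast=True all three should resume.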

Saturating everything with thundering broadcasts probably won't help.

Possible options:
1) Drop writer and invalidate result set if status is FLUSH_IN_PROGRESS or TABLE_FLUSH_IN_PROGRESS.
2) Introduce a new writer queue which can be broadcasted on a status change.

Option 1 is probably the easiest, but performance might suffer in certain cases. There is also the question of how we're going to invalidate the broken result set while a FLUSH is going on.
Option 2 adds more complexity to a component which is already too complex for its job.
Then there is also:
3) Cheat by inserting signals to the waiting writers on each successful store_query, since the probability of the hang seems to be low even on a loaded system.
4) Just wait on the mutex; the lock strategy for preventing a freeze during invalidation (bug#21074) was a bad one.
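
Option 1 could look roughly like the sketch below: instead of waiting on a condition while a flush runs, the writer gives up and discards its partial result. The class TinyCache, its fields and the insert() signature are hypothetical and exist only to illustrate the idea; they do not correspond to anything in sql_cache.cc.

```python
import threading


class TinyCache:
    """Toy cache in which writers never wait for a flush to finish."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.flush_in_progress = False
        self.entries = {}            # query text -> list of result rows

    def insert(self, query, rows):
        """Append rows to a cached result; return False if it was dropped."""
        with self.mutex:
            if self.flush_in_progress:
                # Drop the writer: throw away whatever was stored so far
                # so that no half-written result set survives the flush.
                self.entries.pop(query, None)
                return False
            self.entries.setdefault(query, []).extend(rows)
            return True
```

Because no thread ever parks on a condition variable here, there is no wakeup to lose. The cost is the one mentioned above: any result that is being written while a flush starts is not cached.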
[1 Apr 2009 15:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71099

2823 Kristofer Pettersson	2009-04-01
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      Early patch submitted for discussion.
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
      THD2: select ... => Got a writer in ::store_query
      THD3: select ... => Got a writer in ::store_query
      THD1: flush tables => qc status= FLUSH_IN_PROGRESS; 
                            new writers are blocked.
      THD2: select ... => Still got a writer and enters cond in
                          query_cache_insert
      THD3: select ... => Still got a writer and enters cond in
                          query_cache_insert
      THD1: flush tables => finished and signal status change.
      THD2: select ... => Wakes up and completes the insert.
      THD3: select ... => Happily waiting for better times. Why hurry?
      
      A solution to this situation is to use a broadcast on the thread
      group which contains result set writers.
      
      Note about the test case:
      * If the thread policy in Query_cache::wait_while_flush_is_in_progress
        is set to RELEASE_ONE_ON_EACH_SIGNAL then the test case will hang.
        This should show the existence of a problem before the patch.
      * Two thread groups are used to avoid using broadcast where it isn't needed.
        Is it worth having two groups? Is substituting signal for broadcast enough?
[1 Apr 2009 19:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71139

2822 Kristofer Pettersson	2009-04-01
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      Early patch submitted for discussion.
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
      THD2: select ... => Got a writer in ::store_query
      THD3: select ... => Got a writer in ::store_query
      THD1: flush tables => qc status= FLUSH_IN_PROGRESS; 
                            new writers are blocked.
      THD2: select ... => Still got a writer and enters cond in
                          query_cache_insert
      THD3: select ... => Still got a writer and enters cond in
                          query_cache_insert
      THD1: flush tables => finished and signal status change.
      THD2: select ... => Wakes up and completes the insert.
      THD3: select ... => Happily waiting for better times. Why hurry?
      
      A solution to this situation is to use a broadcast on the thread
      group which contains result set writers.
      
      Note about the test case:
      * If the thread policy in Query_cache::wait_while_flush_is_in_progress
        is set to RELEASE_ONE_ON_EACH_SIGNAL then the test case will hang.
        This should show the existence of a problem before the patch.
      * Two thread groups are used to avoid using broadcast where it isn't needed.
        Is it worth having two groups? Is substituting signal for broadcast enough?
[2 Apr 2009 9:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71182

2822 Kristofer Pettersson	2009-04-02
      Bug#43758 Query cache can lock up threads in 'freeing items' state
            
      Early patch submitted for discussion.
            
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
           
      Consider three threads THD1, THD2, THD3
          
        THD2: select ... => Got a writer in ::store_query
        THD3: select ... => Got a writer in ::store_query
        THD1: flush tables => qc status= FLUSH_IN_PROGRESS; 
                          new writers are blocked.
        THD2: select ... => Still got a writer and enters cond in
                            query_cache_insert
        THD3: select ... => Still got a writer and enters cond in
                            query_cache_insert
        THD1: flush tables => finished and signal status change.
        THD2: select ... => Wakes up and completes the insert.
        THD3: select ... => Happily waiting for better times. Why hurry?
            
      A solution to this situation is to use a broadcast on the thread
      group which contains result set writers.
        
      Note about the test case:
      * If the thread policy in Query_cache::wait_while_flush_is_in_progress
        is set to RELEASE_ONE_ON_EACH_SIGNAL then the test case will hang.
        This should show the existence of a problem before the patch.
      * Two thread groups are used to avoid using broadcast where it isn't needed.
        Is it worth having two groups? Is substituting signal for broadcast enough?
[2 Apr 2009 17:25] Harrison Fisk
I have run my test case with the last patch (commits 71182) for 3 hours and have not been able to repeat the hang, so the bug seems fixed as far as the test case goes.  I am running a longer test now, but suspect that it is fixed.

Previously, the hang would normally repeat within a minute or two.
[7 Apr 2009 8:37] Kristofer Pettersson
HarrisonF: Can you confirm that you can't reproduce the issue using the latest patch?
[7 Apr 2009 15:26] Harrison Fisk
I have been unable to repeat this despite long running test setups (involving many many restarts of the test program).

With vanilla 5.1.24 from bzr, it would run into the problem in a few minutes.  I have now run test cases for > 24 hours without seeing a lock up when this patch is applied.

From a test case point of view, this bug certainly seems fixed by the proposed patch.
[20 Apr 2009 10:15] Jerry Potokar
Is this patch added in 5.1.34?
[20 Apr 2009 12:00] Kristofer Pettersson
The patch is still waiting on a review and hasn't been merged into the main tree.
[21 Apr 2009 7:45] Jerry Potokar
Ohh, too bad. Unfortunately I am not expert enough to compile mysql with this patch myself :-(
[21 Apr 2009 9:10] Kristofer Pettersson
Jerry: If you're a Sun customer you could most likely get it done for you right away.

The review process is usually much faster than this, but sometimes we get very busy and things take time. The patch should probably be ready within the week.
[8 May 2009 15:37] Daniel Fischer
I'm hitting this bug too with 5.4 and will try out the patch.
[20 May 2009 12:40] Kristofer Pettersson
The feedback on this bug patch so far is that it works, but there is a problem with the architecture of the locking scheme. We want to change it to a simple "wait queue" and use only two functions to access the QC lock: lock_query_cache() and unlock_query_cache(). These functions will handle flushing as well, and in each unlock_query_cache() the wait queue will be checked for any waiting threads.

New bug patch will be developed.
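
One way to read the "wait queue" idea is sketched below. It is an assumption about the design, not the patch that was eventually written: the class name WaitQueueLock and the use of one Event per parked thread are illustrative, and only the method names lock_query_cache() and unlock_query_cache() are taken from the comment above.

```python
import threading
from collections import deque


class WaitQueueLock:
    """Lock that hands ownership to the longest-waiting thread on unlock."""

    def __init__(self):
        self._mutex = threading.Lock()   # protects the two fields below
        self._locked = False
        self._waiters = deque()          # one Event per parked thread, FIFO

    def lock_query_cache(self):
        with self._mutex:
            if not self._locked:
                self._locked = True
                return
            ticket = threading.Event()
            self._waiters.append(ticket)
        # Park outside the mutex. The Event stays set once signalled, so
        # a wakeup sent before this call is not lost.
        ticket.wait()

    def unlock_query_cache(self):
        with self._mutex:
            if self._waiters:
                # Hand the lock straight to the first waiter; _locked
                # stays True because ownership simply moves on.
                self._waiters.popleft().set()
            else:
                self._locked = False
```

Since every unlock either wakes exactly the next queued thread or marks the lock free, a thread cannot be left behind the way THD3 is in the earlier scenario.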
[20 May 2009 23:41] James Day
Kristofer,

This is a lockup bug in our current production server release and we have a working patch.

Unless a revised patch can be ready in time for the same MRU that the current patch can make, we should go with the current patch and open a new bug for the improved version. The problem is too common for us to just wait for a new patch version.
[21 May 2009 10:10] Kristofer Pettersson
James,

Sounds good to me. I don't mind taking it in iterative steps.

What do the reviewers say?

/K
[21 May 2009 11:43] Davi Arnaut
We can get a revised patch in place in time for the target MRU.
[26 May 2009 14:59] Jerry Potokar
Yes, PLEASE fix this bug, even in a non-optimal way, but soon, as it makes 5.1 unusable for me. I get a lockup in 1-2 hours :-(
[26 May 2009 23:23] James Day
Davi, that just about works but if there's any sign of slippage we should go with the existing patch. It's too frequently encountered to wait more than a month or so longer and it's on my list of reasons why those with stability as their top priority should currently delay upgrading to 5.1. It'd be really nice to get rid of it because 5.1 is better in so many other ways.
[4 Jun 2009 9:21] Hervé BRY
I am also experiencing this bug, which causes about 1-3 locks per day on a server with ~800 queries/sec and almost as many cache hits. I can't disable the query cache because our system relies heavily on it to achieve decent performance. Will the patch finally be included in the next release, or will I have to downgrade my server? (I can't wait more than one or two weeks for a fix.)
[10 Jun 2009 18:32] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76055

2924 Kristofer Pettersson	2009-06-10
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
       THD2: select ... => Got a writer in ::store_query
       THD3: select ... => Got a writer in ::store_query
       THD1: flush tables => qc status= FLUSH_IN_PROGRESS;
                           new writers are blocked.
       THD2: select ... => Still got a writer and enters cond in
                           query_cache_insert
       THD3: select ... => Still got a writer and enters cond in
                           query_cache_insert
       THD1: flush tables => finished and signal status change.
       THD2: select ... => Wakes up and completes the insert.
       THD3: select ... => Happily waiting for better times. Why hurry?
      
      This patch is a refactoring of this lock system. It introduces three new methods:
       try_lock_query_cache()
       lock_query_cache()
       unlock_query_cache()
      
      This change also deprecates wait_while_table_flush_is_in_progress().
      All threads are registered and put in a queue. On each unlock the first
      element in the queue is signalled. This resolves
      the issues with left over threads.
     @ mysql-test/r/query_cache_debug.result
        * Added test case for bug 43758
     @ mysql-test/t/query_cache_debug.test
        * Added test case for bug 43758
     @ sql/ilist.h
        * Introduced standard interface for a double linked intrusive list.
     @ sql/sql_cache.cc
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock_query_cache(), lock_query_cache() and unlock_query_cache().
        * Introduced new class Query_cache_queue_element.
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock_query_cache() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush or defragmentation.
     @ sql/sql_cache.h
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock_query_cache(), lock_query_cache() and unlock_query_cache().
        * Introduced new class Query_cache_queue_element.
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock_query_cache() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush or defragmentation.
[15 Jun 2009 15:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76301

2937 Kristofer Pettersson	2009-06-15
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      Early patch submitted for discussion.
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
         THD2: select ... => Got a writer in ::store_query
         THD3: select ... => Got a writer in ::store_query
         THD1: flush tables => qc status= FLUSH_IN_PROGRESS;
                            new writers are blocked.
         THD2: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD3: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD1: flush tables => finished and signal status change.
         THD2: select ... => Wakes up and completes the insert.
         THD3: select ... => Happily waiting for better times. Why hurry?
      
      This patch is a refactoring of this lock system. It introduces three new methods:
         try_lock_query_cache()
         lock_query_cache()
         unlock_query_cache()
      
      This change also deprecates wait_while_table_flush_is_in_progress(). All threads are
      queued and put on a conditional wait. On each unlock the queue is signalled. This resolves
      the issues with missing signals. To ensure that no threads spend unnecessary
      time waiting, a signal broadcast is issued every time a lock is taken before a full
      cache flush.
     @ mysql-test/r/query_cache_debug.result
        * Added test case for bug43758
     @ mysql-test/t/query_cache_debug.test
        * Added test case for bug43758
     @ sql/sql_cache.cc
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
     @ sql/sql_cache.h
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
[16 Jun 2009 11:05] Bugs System
Pushed into 5.1.36 (revid:joro@sun.com-20090616102155-3zhezogudt4uxdyn) (version source revid:kristofer.pettersson@sun.com-20090616084254-p6gq6r1gayjl9ftr) (merge vers: 5.1.36) (pib:6)
[17 Jun 2009 15:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76479

2950 Kristofer Pettersson	2009-06-17
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      This patch corrects a mistake in the test case for bug patch 43758.
      
      There was a race in the test case when the thread id was retrieved from the processlist.
      The result was that the same thread id was signalled twice and one thread id wasn't
      signalled at all.
      
      The affected platforms appear to be limited to Linux.
     @ mysql-test/r/query_cache_debug.result
        There was a race in the test case when the thread id was retrieved from the processlist.
        The result was that the same thread id was signalled twice and one thread id wasn't
        signalled at all.
     @ mysql-test/t/query_cache_debug.test
        There was a race in the test case when the thread id was retrieved from the processlist.
        The result was that the same thread id was signalled twice and one thread id wasn't
        signalled at all.
[24 Jun 2009 12:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77034

2937 Kristofer Pettersson	2009-06-15
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      Early patch submitted for discussion.
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
         THD2: select ... => Got a writer in ::store_query
         THD3: select ... => Got a writer in ::store_query
         THD1: flush tables => qc status= FLUSH_IN_PROGRESS;
                            new writers are blocked.
         THD2: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD3: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD1: flush tables => finished and signal status change.
         THD2: select ... => Wakes up and completes the insert.
         THD3: select ... => Happily waiting for better times. Why hurry?
      
      This patch is a refactoring of this lock system. It introduces three new methods:
         try_lock()
         lock_and_suspend()
         unlock()
      
      This change also deprecates wait_while_table_flush_is_in_progress(). All threads are
      queued and put on a conditional wait. On each unlock the queue is signalled. This resolves
      the issues with left over threads. To ensure that no threads spend unnecessary
      time waiting, a signal broadcast is issued every time a lock is taken before a full
      cache flush.
     @ mysql-test/r/query_cache_debug.result
        * Add test case for bug 43758
     @ mysql-test/t/query_cache_debug.test
        * Add test case for bug 43758
     @ sql/sql_cache.cc
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
     @ sql/sql_cache.h
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
[24 Jun 2009 12:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/77035

3369 Kristofer Pettersson	2009-06-17
      Bug#43758 Query cache can lock up threads in 'freeing items' state
      
      It is possible for more than one thread to enter the condition
      in query_cache_insert(), but the condition predicate is to
      signal one thread each time the cache status changes between
      the following states: {NO_FLUSH_IN_PROGRESS,FLUSH_IN_PROGRESS,
      TABLE_FLUSH_IN_PROGRESS}
      
      Consider three threads THD1, THD2, THD3
      
         THD2: select ... => Got a writer in ::store_query
         THD3: select ... => Got a writer in ::store_query
         THD1: flush tables => qc status= FLUSH_IN_PROGRESS;
                            new writers are blocked.
         THD2: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD3: select ... => Still got a writer and enters cond in
                             query_cache_insert
         THD1: flush tables => finished and signal status change.
         THD2: select ... => Wakes up and completes the insert.
         THD3: select ... => Happily waiting for better times. Why hurry?
      
      This patch is a refactoring of this lock system. It introduces four new methods:
         Query_cache::try_lock()
         Query_cache::lock()
         Query_cache::lock_and_suspend()
         Query_cache::unlock()
      
      This change also deprecates wait_while_table_flush_is_in_progress(). All threads are
      queued and put on a conditional wait. On each unlock the queue is signalled. This resolves
      the issues with left over threads. To ensure that no threads spend unnecessary
      time waiting, a signal broadcast is issued every time a lock is taken before a full
      cache flush.
     @ mysql-test/r/query_cache_debug.result
        * Added test case for bug43758
     @ mysql-test/t/query_cache_debug.test
        * Added test case for bug43758
     @ sql/sql_cache.cc
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
     @ sql/sql_cache.h
        * Replaced calls to wait_while_table_flush_is_in_progress() with
          calls to try_lock(), lock_and_suspend() and unlock().
        * Renamed enumeration Cache_status to Cache_lock_status.
        * Renamed enumeration items to UNLOCKED, LOCKED_NO_WAIT and LOCKED.
          If the LOCKED_NO_WAIT lock type is used to lock the query cache, other
          threads using try_lock() will fail to acquire the lock.
          This is useful if the query cache is temporarily disabled due to
          a full table flush.
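
The lock states described in this commit message can be sketched as follows. This is an illustration derived only from the text above, and the real code in sql_cache.cc may differ: the class name CacheLock is made up, threading.Condition replaces the pthread mutex and condition variable, and having try_lock() return True on success is a choice made for this sketch.

```python
import threading

UNLOCKED, LOCKED, LOCKED_NO_WAIT = range(3)


class CacheLock:
    """Three-state lock following the try_lock/lock/lock_and_suspend/unlock split."""

    def __init__(self):
        self._cond = threading.Condition()
        self.status = UNLOCKED

    def try_lock(self):
        """Wait for an ordinary lock, but give up if a full flush holds it."""
        with self._cond:
            while self.status == LOCKED:
                self._cond.wait()
            if self.status == LOCKED_NO_WAIT:
                return False         # cache is suspended, skip it
            self.status = LOCKED
            return True

    def lock(self):
        """Wait until the cache is free, whatever is holding it."""
        with self._cond:
            while self.status != UNLOCKED:
                self._cond.wait()
            self.status = LOCKED

    def lock_and_suspend(self):
        """Take the lock for a full flush and turn try_lock() callers away."""
        with self._cond:
            while self.status != UNLOCKED:
                self._cond.wait()
            self.status = LOCKED_NO_WAIT
            # Broadcast so that threads parked in try_lock() re-check
            # the status and return instead of waiting out the flush.
            self._cond.notify_all()

    def unlock(self):
        with self._cond:
            self.status = UNLOCKED
            self._cond.notify()      # signal the queue on each unlock
```

Every waiter re-checks the status in a loop, and a woken thread either takes the lock (and will signal again when it unlocks) or returns from try_lock() without needing it, so a single signal per unlock is enough to keep the queue moving in this sketch.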
[26 Jun 2009 15:33] Paul Dubois
Noted in 5.1.36 changelog.

Invalidation of query cache entries due to table modifications could
cause threads to hang inside the query cache with state "freeing
items". 

Setting report to NDI pending push into 5.4.x.
[10 Jul 2009 11:20] Bugs System
Pushed into 5.4.4-alpha (revid:anozdrin@bk-internal.mysql.com-20090710111017-bnh2cau84ug1hvei) (version source revid:kristofer.pettersson@sun.com-20090617180427-zxqutdhvhzdhoa8s) (merge vers: 5.4.4-alpha) (pib:11)
[13 Jul 2009 19:57] Paul Dubois
Noted in 5.4.4 changelog.
[17 Jul 2009 15:41] Mark Callaghan
Is this still broken in 5.0? The fix for http://bugs.mysql.com/bug.php?id=21074 is listed as the cause of this bug and that was pushed into 5.0.
[20 Jul 2009 7:50] Kristofer Pettersson
Note that two different patches were pushed into 5.1 and 5.0 for 21074. I don't think 5.0 suffers from issues caused by patch 21074. We need a more detailed report for 5.0 issues so we can address them separately.
[12 Aug 2009 22:39] Paul Dubois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[15 Aug 2009 1:55] Paul Dubois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:32] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[12 Sep 2009 0:45] Roel Van de Paar
Re-opening bug as it does not seem to be fully/correctly resolved.

Customer is seeing this issue in 5.1.37 on Windows 32 bit.

Issue disappears when query cache is disabled.
[16 Sep 2009 7:16] Kristofer Pettersson
Roel Van de Paar: It might be a good idea to try to get some back traces. This patch should indeed solve the issue as described in this bug report; however, there might be other reasons for a lock up.
[22 Sep 2009 18:57] Kristofer Pettersson
Roel: Also note Bug#47277
[25 Sep 2009 22:01] Ricardo Gomez
Hi, I have the same problem. I had MySQL 5.1.30, but today I upgraded to MySQL 5.1.39 and the problem persists:
| 366  | netadmin| emanagement2.e-solutionscolombia.com:52694 | emanagement | Query   | 280  | freeing items             | update network_devices set status=0 where id =58           |
As additional information, I've noticed this only with InnoDB tables, not with MyISAM tables.
What could it be?
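
A row like the one above can be picked out of the processlist automatically. The following is a minimal sketch, assuming the table-style output of the mysql client (columns Id, User, Host, db, Command, Time, State, Info separated by '|'); the function name stuck_freeing_items and the 60-second default threshold are hypothetical and not part of MySQL.

```shell
#!/bin/sh
# Hypothetical helper: reads table-format SHOW PROCESSLIST output on stdin
# and prints "<Id> <Time>" for every thread that has been in the
# 'freeing items' state for at least the given number of seconds.
stuck_freeing_items() {
    min=${1:-60}   # assumed default threshold in seconds
    awk -F'|' -v min="$min" '
        {
            # With "|" as separator, field 1 is empty, so Id is field 2,
            # Time is field 7 and State is field 8.
            id = $2; t = $7; state = $8
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", t)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (state == "freeing items" && t + 0 >= min + 0) print id, t
        }'
}
```

One possible way to use it is `mysql --table -e "SHOW FULL PROCESSLIST" | stuck_freeing_items 60`, where `--table` forces the boxed output format even when the client writes to a pipe.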
[25 Sep 2009 22:12] Roel Van de Paar
Hi Ricardo, 

Could you please generate a backtrace for us and upload it to this bug report?

If you need instructions on how to do this, please let me know your OS.
[25 Sep 2009 22:39] Ricardo Gomez
Hi, 
Yeah, I need instructions to do this ...
The system is Fedora Linux 2.6.27.5-117.fc10.x86_64
mysql  Ver 14.14 Distrib 5.1.39, for unknown-linux-gnu (x86_64)
thanks
[28 Sep 2009 14:02] Kristofer Pettersson
Stack traces seem to indicate a bug in the pthread_cond_timedwait() function call, which was supposedly patched in 6.0 but not in 5.1. A new bug will be opened to address this issue separately.
[28 Sep 2009 15:27] Ricardo Gomez
Thanks Kristofer, is this new bug already open?

I agree with opening a new bug, because for me this behavior occurs with or without the query cache enabled on InnoDB. I first saw this on our production server running 5.1.30. Last week I migrated to 5.1.39 hoping this would be solved, but the problem remains.
The only workaround available to me at the moment is to migrate tables to MyISAM, but I could only do this for approximately 2 of 150 tables.

Thanks in advance
[28 Sep 2009 15:41] Susanne Ebrecht
Bug #47277 might be a duplicate of this bug here.
[28 Sep 2009 15:54] Davi Arnaut
Susanne,

This one is about the query cache. In Bug#47277, the reporter mentions that the query cache is off.
The 'freeing items' state might indicate a wait somewhere else too.
[30 Sep 2009 12:23] Kristofer Pettersson
Ricardo: No, I will have to find a reliable test case first. However, there was supposedly an issue found earlier related to another bug. It was patched in what used to be the mysql-6.0 tree, but without a test case or bug report. The change made was in mysys/my_wincond.c:int pthread_cond_timedwait(..):

-if (cond->waiting == 0 && result == (WAIT_OBJECT_0+BROADCAST))
+if (cond->waiting == 0)

Once the impact of this change is fully understood we'll re-implement this fix. A bug would be visible if the timed wait times out.
[1 Oct 2009 17:27] Kristofer Pettersson
See also Bug#47768 for pthread bug.
[2 Oct 2009 3:21] Roel Van de Paar
Hi Ricardo,

> Yeah, I need instructions to do this ...
> The system is Fedora Linux 2.6.27.5-117.fc10.x86_64

Here are the instructions for running a back trace:

1. Install gdb (if not installed already)

shell> yum install gdb

2. Start the MySQL server

3. Obtain the pid of the mysqld (not mysqld_safe) process by using ps:

shell> ps -A | grep mysqld

An example output:

shell> ps -A | grep mysqld
 2265 pts/0 00:00:00 mysqld_safe
 2335 pts/0 00:00:14 mysqld

In this example, 2335 is the pid of mysqld.

4. Re-create the hang situation in MySQL

5. Run a backtrace:

shell> gdb -ex "set pagination 0" -ex "thread apply all bt" --batch --pid={enter your pid here} > /tmp/backtrace.txt

Replace '{enter your pid here}' with the pid obtained in step 3, for example:

shell> gdb -ex "set pagination 0" -ex "thread apply all bt" --batch --pid=2335 > /tmp/backtrace.txt

6. Create a symbol file:

shell> cd /your_mysql_installation_path/
shell> nm -D -n --demangle ./bin/mysqld > /tmp/mysqld.sym

Note that your_mysql_installation_path is the directory in which you installed MySQL, which contains 'bin' as a subfolder.

If you run into issues using nm with --demangle, please run it without --demangle.

7. Please send us the /tmp/mysqld.sym and /tmp/backtrace.txt files.

Some related information is listed here:
http://dev.mysql.com/doc/refman/5.1/en/using-stack-trace.html
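
Steps 3 and 5 above can be combined into a small script. This is only a sketch under stated assumptions: the function names mysqld_pid_from_ps and collect_backtrace are hypothetical, the output path defaults to /tmp/backtrace.txt as in step 5, and gdb must already be installed. Matching the command name exactly avoids picking up mysqld_safe, which a plain grep for "mysqld" also returns.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical and not part of MySQL or gdb.

# Read `ps -A` output on stdin and print the pid of the first process whose
# command name is exactly "mysqld" (this skips mysqld_safe).
mysqld_pid_from_ps() {
    awk '$NF == "mysqld" { print $1; exit }'
}

# Attach gdb to the given pid and write the backtraces of all threads
# to the given file (default: /tmp/backtrace.txt).
collect_backtrace() {
    pid=$1
    out=${2:-/tmp/backtrace.txt}
    gdb -ex "set pagination 0" -ex "thread apply all bt" --batch --pid="$pid" > "$out"
}

# Only act when called with "run", so the functions can be sourced safely.
if [ "${1:-}" = "run" ]; then
    pid=$(ps -A | mysqld_pid_from_ps)
    [ -n "$pid" ] || { echo "mysqld is not running" >&2; exit 1; }
    collect_backtrace "$pid"
fi
```

Saved under an arbitrary name such as collect_bt.sh, it would be started with `sh collect_bt.sh run` once the hang has been re-created (step 4). The symbol file from step 6 still has to be created separately.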
[5 Oct 2009 22:32] Ricardo Gomez
Hi Roel Van de Paar, thanks! These are the files requested. I hope they are useful. I keep waiting for more information and again th

Attachment: backtrace.txt (text/plain), 18.25 KiB.

[5 Oct 2009 22:39] Ricardo Gomez
This is the other (mysqld.sym)

Attachment: mysqld_sym_BUG43758.zip (application/zip, text), 169.81 KiB.

[6 Oct 2009 1:10] Roel Van de Paar
Resolved Stacktrace from Ricardo Gomez

Attachment: bug_43758_resolved_stacktrace_Ricardo_Gomez.txt (text/plain), 24.56 KiB.

[6 Oct 2009 2:15] Roel Van de Paar
Re-closing this bug after discussion with Davi. New/separate bug: bug #47768
[6 Oct 2009 2:18] Roel Van de Paar
Hi Ricardo,

Thank you for sending this in. I have resolved the stacktrace and uploaded it to this bug and the new bug. This bug was re-closed. Please track bug #47768 for future updates.
[8 Oct 2009 2:47] Paul Dubois
The 5.4 fix has been pushed to 5.4.2.