Bug #34235 Random failure of wait_condition.inc with information_schema.processlist
Submitted: 1 Feb 2008 15:26 Modified: 21 Jul 2008 16:31
Reporter: Chuck Bell Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Information schema Severity:S3 (Non-critical)
Version:6.0 OS:Any
Assigned to: Ingo Strüwing CPU Architecture:Any
Triage: D2 (Serious)

[1 Feb 2008 15:26] Chuck Bell
Description:
Some of the backup tests use the wait_condition.inc with a query against the information_schema.processlist. The form is similar to the following.

let $wait_condition = SELECT state = "debug_sync_point: backup_cs_locked"
                      FROM INFORMATION_SCHEMA.PROCESSLIST
                      WHERE info LIKE "BACKUP DATABASE %";
--source include/wait_condition.inc

The tests will eventually fail with one of two errors:

mysqltest: In included file ".\include\wait_condition.inc": At line 42:
Error running query 'SELECT state = "debug_sync_point: backup_cs_locked"
                      FROM INFORMATION_SCHEMA.PROCESSLIST WHERE info 
LIKE "BACKUP DATABASE %"': 2013 Lost connection to MySQL server during query

or

mysqltest: In included file ".\include\wait_condition.inc": At line 53:
Error running query 'SELECT state = "debug_sync_point: backup_cs_locked"
                      FROM INFORMATION_SCHEMA.PROCESSLIST WHERE info 
LIKE "BACKUP DATABASE %"': 2013 Lost connection to MySQL server during query

The failure is random and does not appear to associated with any one test as it has failed at least once in all of the backup tests that use the wait_condition include file. These tests are:

backup.test
backup_commit_blocker.test
backup_ddl_blocker.test
backup_progress.test
backup_security.test
backup_snapshot.test

How to repeat:
Run the affected backup tests repeatedly in a loop.

./mysql-test-run.pl backup_commit_blocker backup_ddl_blocker backup_progress backup_security backup_snapshot 

Suggested fix:
It is suspected that this is a race condition having to do with retrieving results from the information_schema.processlist table WRT the initialization of the mutex in sql_show.cc @ 1882.

Note: The debug trace includes no clues as to the problem on Windows and the interactive debugger (VS) on Windows reports the problem as a memory error and halts the debug session before the error occurs.

It's a different story on Linux. On Linux, we find this in the master.err log:

execute_backup_command: backup: == breakpoint on 'backup_command' ==
safe_mutex: Trying to lock unitialized mutex at sql_show.cc, line 1882
080201 18:17:58 - mysqld got signal 6;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=3
max_threads=151
threads_connected=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 59892 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x9423380
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xb4c4d3f8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8316550
0xffffe410
0xb7cef201
0x889f55f
0x848f958
0x8481a3a
0x83b3c98
0x83b07aa
0x83b5b36
0x8326fba
0x83285cf
0x8331820
0x8332374
0x833376b
0x831f750
0xb7f8a46b
0xb7d966de
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/refman/5.1/en/resolve-stack-dump.html
and follow instructions on how to resolve the stack trace.
Resolved stack trace is much more helpful in diagnosing the
problem, so please do resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x942d018 = SELECT state = "debug_sync_point: bp_starting_state"
                      FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE info LIKE "RESTORE FROM 'backup_progress_orig.bak'%"
thd->thread_id=2
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
[5 Feb 2008 16:23] Chuck Bell
Some more information...and a possible fix.

It appears this is a race condition between the show processlist code and the thread shutdown. Adding the following to THD::~THD() seems to treat the problem (but may not be a complete solution).

===== sql_class.cc 1.364 vs edited =====
--- 1.364/sql/sql_class.cc      2007-12-06 12:31:04 -05:00
+++ edited/sql_class.cc 2008-02-05 11:11:00 -05:00
@@ -737,7 +737,8 @@
 {
   THD_CHECK_SENTRY(this);
   DBUG_ENTER("~THD()");
-  /* Ensure that no one is using THD */
+  mysys_var= 0;                                 // Safety (shouldn't be needed)
+ /* Ensure that no one is using THD */
   pthread_mutex_lock(&LOCK_delete);
   pthread_mutex_unlock(&LOCK_delete);
   add_to_status(&global_status_var, &status_var);
[22 May 2008 13:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/46949

ChangeSet@1.2624, 2008-05-22 15:07:55+02:00, istruewing@stella.local +4 -0
  Bug#34235 - Random failure of wait_condition.inc
              with information_schema.processlist
  Small test case fixes.
[17 Jun 2008 7:30] Ingo Strüwing
We agreed on fixing this by converting the backup tests from BACKUP_BREAKPOINT to DEBUG_SYNC.
[20 Jun 2008 15:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/48254

2638 Ingo Struewing	2008-06-20
      Bug#34235 - Random failure of wait_condition.inc
                  with information_schema.processlist
      
      Changed synchronization point calls.
      Adapted backup test cases and re-enabled them.
      Removed all unused BACKUP_BREAKPOINTs. None is left.
      Removed sql/backup/debug.h as it is now obsolete.
      Removed inclusion of debug.h from many files.
      Cleaned mysql-test/t/disabled.def.
      
      Fixed some random failures of wait_condition.inc with
      information_schema.processlist.
      
      Small fixes of the Debug Sync Facility:
      Added extern "C" to improve linkability on some platforms.
      Fixed compiler warnings.
      Moved locking of the mutex out of the SIGNAL and WAIT_FOR branches.
      Made a more verbose proc_info.
[25 Jun 2008 13:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/48489

2640 Ingo Struewing	2008-06-25
      Bug#34235 - Random failure of wait_condition.inc
                  with information_schema.processlist
      
      Changed synchronization point calls.
      Adapted backup test cases and re-enabled them.
      Removed all unused BACKUP_BREAKPOINTs. None is left.
      Removed sql/backup/debug.h as it is now obsolete.
      Removed inclusion of debug.h from many files.
      Cleaned mysql-test/t/disabled.def.
      
      Fixed some random failures of wait_condition.inc with
      information_schema.processlist.
      
      Small fixes of the Debug Sync Facility:
      Added extern "C" to improve linkability on some platforms.
      Fixed compiler warnings.
      Moved locking of the mutex out of the SIGNAL and WAIT_FOR branches.
      Made a more verbose proc_info.
[25 Jun 2008 13:42] Ingo Strüwing
Patch has been approved by both reviewers. Set status accordingly.
[26 Jun 2008 16:27] Ingo Strüwing
Queued to mysql-6.0-backup.
[19 Jul 2008 16:01] Ingo Strüwing
Pushed to 6.0.7.
Mainly test case changes.
Conversion from BACKUP_BREAKPOINT to DEBUG_SYNC.
This is internal stuff. IMHO no need for documentation.
[21 Jul 2008 16:31] Paul Dubois
No changelog entry needed.
[14 Sep 2008 5:10] Bugs System
Pushed into 6.0.7-alpha  (revid:ingo.struewing@sun.com-20080625133904-bx56g35rg2exwpv9) (version source revid:john.embretsen@sun.com-20080724122511-9c0oudz1xrdrs6y6) (pib:3)
[8 Jan 2010 15:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/96399

3042 Ingo Struewing	2010-01-08
      WL#5101 - MySQL Backup back port
      Merged revid:ingo.struewing@sun.com-20080625133904-bx56g35rg2exwpv9
        Bug#34235 - Random failure of wait_condition.inc
                    with information_schema.processlist
        
        Changed synchronization point calls.
        Adapted backup test cases and re-enabled them.
        Removed all unused BACKUP_BREAKPOINTs. None is left.
        Removed sql/backup/debug.h as it is now obsolete.
        Removed inclusion of debug.h from many files.
        Cleaned mysql-test/t/disabled.def.
        
        Fixed some random failures of wait_condition.inc with
        information_schema.processlist.
        
        Small fixes of the Debug Sync Facility:
        Added extern "C" to improve linkability on some platforms.
        Fixed compiler warnings.
        Moved locking of the mutex out of the SIGNAL and WAIT_FOR branches.
        Made a more verbose proc_info.
      
      original changeset: 2599.12.3
     @ sql/lock.cc
        WL#5101 - MySQL Backup back port
            Bug#34235 - Random failure of wait_condition.inc
                        with information_schema.processlist
            Added new synchronization points.
            Added DBUG_PRINTs.
     @ sql/sql_insert.cc
        WL#5101 - MySQL Backup back port
            Bug#34235 - Random failure of wait_condition.inc
                        with information_schema.processlist
            Removed inclusion of obsolete header file.
            Changed a synchronization point call.
[12 Jan 2010 22:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/96679

3059 Ingo Struewing	2010-01-12
      WL#5101 - MySQL Backup back port
      Merged revid:ingo.struewing@sun.com-20080625133904-bx56g35rg2exwpv9
      again - this time included the sync point "within_ha_commit_trans"
        Bug#34235 - Random failure of wait_condition.inc
                    with information_schema.processlist
        
        Changed synchronization point calls.
        Adapted backup test cases and re-enabled them.
        Removed all unused BACKUP_BREAKPOINTs. None is left.
        Removed sql/backup/debug.h as it is now obsolete.
        Removed inclusion of debug.h from many files.
        Cleaned mysql-test/t/disabled.def.
        
        Fixed some random failures of wait_condition.inc with
        information_schema.processlist.
        
        Small fixes of the Debug Sync Facility:
        Added extern "C" to improve linkability on some platforms.
        Fixed compiler warnings.
        Moved locking of the mutex out of the SIGNAL and WAIT_FOR branches.
        Made a more verbose proc_info.
      
      original changeset: 2599.12.3
     @ sql/handler.cc
        WL#5101 - MySQL Backup back port
        Included sync point "within_ha_commit_trans".
        Removed a compiler warning.