| Bug #36929 | crash in kill_zombie_dump_threads-> THD::awake() with replication tests | ||
|---|---|---|---|
| Submitted: | 23 May 2008 18:59 | Modified: | 1 Feb 2009 12:53 | 
| Reporter: | Andrei Elkin | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) | 
| Version: | 6.0 | OS: | Any | 
| Assigned to: | Andrei Elkin | CPU Architecture: | Any | 
| Tags: | pushbuild, sporadic, test failure | ||
   [23 May 2008 19:08]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47006

  ChangeSet@1.2639, 2008-05-23 22:07:27+03:00, aelkin@mysql1000.dsl.inet.fi +2 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () at attempt to access concurrently resetable by the host thread THD::mysys_var.
    Fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
   [23 May 2008 19:48]
   Andrei Elkin        
  Bug#35714 complains about the same issue. It also has a prototype of the fix.
   [26 May 2008 10:10]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47043

  ChangeSet@1.2639, 2008-05-26 13:10:36+03:00, aelkin@mysql1000.dsl.inet.fi +2 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () at attempt to access concurrently resetable by the host thread THD::mysys_var.
    Fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
   [26 May 2008 11:40]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47049

  ChangeSet@1.2639, 2008-05-26 14:40:26+03:00, aelkin@mysql1000.dsl.inet.fi +2 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () at attempt to access concurrently resetable by the host thread THD::mysys_var.
    Fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
   [27 May 2008 15:01]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47091

  ChangeSet@1.2639, 2008-05-27 18:00:54+03:00, aelkin@mysql1000.dsl.inet.fi +4 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () at attempt to access concurrently resetable by the host thread THD::mysys_var.
    The immediate issue is fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
    The same guarding is deployed to avoid potential race conditions between the host and
    - the show-process-list executing threads (mysqld_list_processes());
    - shutdown thread (close_connections());

    THD::store_globals() starts acquiring LOCK_delete mutex with this patch, although this is a slight overkill: mysys_var could change without mutex protection from NULL to a non NULL safely enough for the current logics of the killer (THD::awake).
   [29 May 2008 18:21]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47221

  ChangeSet@1.2639, 2008-05-29 21:21:07+03:00, aelkin@mysql1000.dsl.inet.fi +4 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () at attempt to access concurrently resetable by the host thread THD::mysys_var.
    The immediate issue is fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
    The same guarding is deployed to avoid potential race conditions between the host and
    - the show-process-list executing threads (mysqld_list_processes());
    - shutdown thread (close_connections());

    THD::store_globals() does not acquire LOCK_delete as mysys_var could change without mutex protection from NULL to a non NULL safely for the current logics of threads executing THD::awake, close_connections(), mysqld_list_processes().
   [29 May 2008 18:34]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47222

  ChangeSet@1.2639, 2008-05-29 21:33:35+03:00, aelkin@mysql1000.dsl.inet.fi +4 -0
    Bug #36929 crash in kill_zombie_dump_threads-> THD::awake() with replication tests

    There was a crash in THD::awake () when killer attempted to access a concurrently resettable by the host thread THD::mysys_var.
    The immediate issue is fixed with forcing the host thread to reset only after acquiring LOCK_delete mutex.
    The same guarding is deployed to avoid potential race conditions between the host and
    - the show-process-list executing threads (mysqld_list_processes());
    - shutdown thread (close_connections());

    THD::store_globals() does not acquire LOCK_delete as mysys_var could change without mutex protection from NULL to a non NULL safely for the current logics of threads executing THD::awake, close_connections(), mysqld_list_processes().
   [7 Jun 2008 9:40]
   Andrei Elkin        
  Pushed to the bzr 6.0-rpl.
   [12 Jun 2008 8:57]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/47770

  2669 Andrei Elkin 2008-06-12
    bug#36929 fix post-pushing.
    Correcting an assert that does not hold in embedded.
   [27 Jun 2008 19:06]
   Bugs System        
  A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:
  http://lists.mysql.com/commits/48672

  2662 Konstantin Osipov 2008-06-27
    Apply a short version of the fix for BUG#36929 before pushing to the main tree to fix numerous test failures in pool-of-threads mode.
   [25 Aug 2008 21:03]
   Chuck Bell        
  Released in 6.0.7
   [26 Aug 2008 10:14]
   Andrei Elkin        
  Actually, the patch is still in 6.0-rpl and has not been pushed to the main trees.
   [27 Aug 2008 1:13]
   Paul DuBois        
  Resetting to Patch Queued status.
   [30 Jan 2009 13:30]
   Bugs System        
  Pushed into 6.0.10-alpha (revid:luis.soares@sun.com-20090129165607-wiskabxm948yx463) (version source revid:luis.soares@sun.com-20090129163120-e2ntks4wgpqde6zt) (merge vers: 6.0.10-alpha) (pib:6)
   [1 Feb 2009 12:53]
   Jon Stephens        
  Documented in the 6.0.10 changelog as follows:
        A slave compiled using --with-libevent and run with
        --thread-handling=pool-of-threads could sometimes crash.
 
   [3 Dec 2009 13:46]
   Jon Stephens        
  Also documented in the 5.6.0 changelog. See BUG#48463.
   [7 Mar 2010 1:48]
   Paul DuBois        
  Moved 5.6.0 changelog entry to 5.5.3.

Description:
Happened on mysql-6.0 pushbuild (pb) when the threadpool feature is compiled in and activated. The following tests

  rpl.rpl_stm_until 'stmt'
  rpl.rpl_truncate_2myisam 'stmt'
  rpl.rpl_packet 'stmt'

experienced a crash:

  #4 <signal handler called>
  #5 0x20000000000b38f0 in __pthread_mutex_unlock_usercnt () from /lib/libpthread.so.0
  #6 0x40000000003027d0 in THD::awake ()
  #7 0x4000000000639d30 in kill_zombie_dump_threads ()
  #8 0x4000000000372f60 in dispatch_command ()
  #9 0x4000000000373b90 in do_command ()

The cause of the crash appeared to be the unguarded resetting of THD->mysys_var by the owner thread while it is concurrently accessible to the killer thread.

How to repeat:
Run xref.pl rpl.rpl_stm_until etc., or look at the mysql-6.0 pb logs, or build mysql-6.0 with --with-libevent and run mtr on one of the tests (it can take a number of repeats).
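
The race named in the description, and the LOCK_delete guard named in the patch comments, can be illustrated with a small standalone C++ sketch. This is not the server code: Session, ThreadVar, reset_globals() and run_race_once() are hypothetical stand-ins, and std::mutex replaces the server's own mutex wrappers. Unlike the final patch comment, the sketch also takes the lock in store_globals(), purely so that the standalone example has no data race under the C++ memory model.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the per-thread mysys state that
// THD::mysys_var points to in the server.
struct ThreadVar {
    std::mutex mutex;     // protects abort
    bool abort = false;   // set by a killer to tell the owner to stop waiting
};

// Hypothetical, heavily reduced stand-in for THD.
struct Session {
    std::mutex lock_delete;          // plays the role of LOCK_delete
    ThreadVar* mysys_var = nullptr;  // attached and detached by the owner thread
    bool killed = false;

    // Owner thread attaches its state. Assumption: locking here is stricter
    // than what the final patch comment describes; it keeps this sketch race-free.
    void store_globals(ThreadVar* var) {
        std::lock_guard<std::mutex> guard(lock_delete);
        mysys_var = var;
    }

    // Owner thread detaches its state, but only while holding lock_delete,
    // so a killer can never see the pointer change halfway through awake().
    void reset_globals() {
        std::lock_guard<std::mutex> guard(lock_delete);
        mysys_var = nullptr;
    }

    // Killer thread, in the role of THD::awake(). Returns true if the owner's
    // state was still attached and has been signalled.
    bool awake() {
        std::lock_guard<std::mutex> guard(lock_delete);
        killed = true;
        if (mysys_var == nullptr) {
            return false;  // owner already detached; nothing to signal
        }
        // The pointer stays valid until lock_delete is released, so the
        // lock and the unlock below act on the same object.
        std::lock_guard<std::mutex> var_guard(mysys_var->mutex);
        mysys_var->abort = true;
        return true;
    }
};

// Runs one owner/killer race and reports whether the outcome is consistent:
// the session ends up detached and killed, and abort is set exactly when
// awake() reported that it signalled the owner.
bool run_race_once() {
    ThreadVar var;
    Session session;
    session.store_globals(&var);

    bool signalled = false;
    std::thread killer([&] { signalled = session.awake(); });
    std::thread owner([&] { session.reset_globals(); });
    killer.join();
    owner.join();

    return session.mysys_var == nullptr && session.killed &&
           var.abort == signalled;
}
```

Calling run_race_once() in a loop exercises both interleavings. Without the guard in reset_globals(), the killer could lock a mutex through mysys_var and then try to unlock through a pointer that the owner has meanwhile reset, which is consistent with the crash inside the mutex unlock call shown in the stack trace above.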