Bug #40043 | Maria: waiting_threads-t unit test hangs in "kill strategy: LATEST" on Windows | ||
---|---|---|---|
Submitted: | 15 Oct 2008 12:11 | Modified: | 8 Jan 2009 10:42 |
Reporter: | Guilhem Bichot | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Maria storage engine | Severity: | S3 (Non-critical) |
Version: | 5.1-maria | OS: | Windows |
Assigned to: | Sergei Golubchik | CPU Architecture: | Any |
[15 Oct 2008 12:11]
Guilhem Bichot
[15 Oct 2008 12:14]
Guilhem Bichot
I was able to repeat this problem on my machine; I tested this patch from Sergei, which lets read-lockers go before write-lockers: === modified file 'mysys/thr_rwlock.c' --- mysys/thr_rwlock.c 2006-12-23 19:17:15 +0000 +++ mysys/thr_rwlock.c 2008-10-15 09:53:24 +0000 @@ -89,7 +89,7 @@ pthread_mutex_lock(&rwp->lock); /* active or queued writers */ - while (( rwp->state < 0 ) || rwp->waiters) + while ( rwp->state < 0 ) pthread_cond_wait( &rwp->readers, &rwp->lock); rwp->state++; @@ -101,7 +101,7 @@ { int res; pthread_mutex_lock(&rwp->lock); - if ((rwp->state < 0 ) || rwp->waiters) + if (rwp->state < 0 ) res= EBUSY; /* Can't get lock */ else { But it didn't change anything.
[15 Oct 2008 12:23]
Guilhem Bichot
The main thread is waiting for termination of the test, and all other 29 threads are either: - trying to get the "lock" global mutex (defined in waiting_threads-t.c) in one of the pthread_mutex_lock in waiting_threads-t.c - or trying to get this same mutex, at my_wincond.c:142 , in pthread_cond_timedwait() inside wt_thd_cond_timedwait() - or (one single thread in this case) at my_wincond.c:116, waiting for something, in pthread_cond_timedwait() inside wt_thd_cond_timedwait()
[15 Oct 2008 12:25]
Guilhem Bichot
No problem in 6.0-main and 6.0-maria, relevant code of 5.1-maria hasn't been merged there yet.
[15 Oct 2008 17:15]
Guilhem Bichot
After commenting out the line at my_wincond.c:116 (and keeping Serg's patch), LATEST finishes after 120 seconds (5 seconds under Linux). I haven't tested comment-out without Serg's patch.
[15 Oct 2008 18:10]
Guilhem Bichot
Results from Windows (with code changes as described in previous post): # kill strategy: LATEST # Testing waiting_threads with 30 threads, 3000 iterations... ok 9 - tested waiting_threads in 121.726 secs (0) # deadlock cycles of length 1: 2866 9 34.60 % # deadlock cycles of length 2: 2196 15 61.20 % # deadlock cycles of length 3: 1519 8 79.58 % # deadlock cycles of length 4: 920 5 90.71 % # deadlock cycles of length 5: 478 3 96.50 % # deadlock cycles of length 6: 239 0 99.37 % # deadlock cycles of length 7: 0 31 99.75 % # deadlock cycles of length 8: 0 9 99.86 % # deadlock cycles of length 9: 0 2 99.88 % # deadlock cycles of length 10: 0 1 99.89 % # deadlock cycles of length 12: 0 2 99.92 % # deadlock cycles of length 13: 0 1 99.93 % # deadlock cycles of length 15: 0 2 99.95 % # deadlock cycles of length 16: 0 4 100.00 % # depth exceeded: 16245 87 # deadlock waits up to 22 us: 0 # deadlock waits up to 49 us: 0 # deadlock waits up to 106 us: 0 # deadlock waits up to 232 us: 0 # deadlock waits up to 507 us: 0 # deadlock waits up to 1105 us: 0 # deadlock waits up to 2408 us: 0 # deadlock waits up to 5247 us: 0 # deadlock waits up to 11433 us: 0 # deadlock waits up to 24908 us: 0 # deadlock waits up to 54265 us: 0 # deadlock waits up to 118224 us: 0 # timed out: 60604 # successes: 21086 The "waits up" stats look wrong ("0"). I'll post Linux output (same machine) later.
[15 Oct 2008 19:17]
Guilhem Bichot
Here's Linux, same machine as previous post: # kill strategy: LATEST # Testing waiting_threads with 30 threads, 3000 iterations... ok 9 - tested waiting_threads in 4.54795 secs (0) # deadlock cycles of length 1: 6868 6 19.25 % # deadlock cycles of length 2: 7602 3 40.55 % # deadlock cycles of length 3: 7236 7 60.83 % # deadlock cycles of length 4: 5993 6 77.63 % # deadlock cycles of length 5: 4412 1 89.99 % # deadlock cycles of length 6: 3388 2 99.48 % # deadlock cycles of length 7: 0 103 99.77 % # deadlock cycles of length 8: 0 43 99.89 % # deadlock cycles of length 9: 0 17 99.94 % # deadlock cycles of length 10: 0 8 99.96 % # deadlock cycles of length 11: 0 3 99.97 % # deadlock cycles of length 12: 0 3 99.97 % # deadlock cycles of length 13: 0 2 99.98 % # deadlock cycles of length 14: 0 2 99.99 % # deadlock cycles of length 15: 0 2 99.99 % # deadlock cycles of length 16: 0 3 100.00 % # depth exceeded: 38766 182 # deadlock waits up to 1 us: 460 # deadlock waits up to 2 us: 4 # deadlock waits up to 4 us: 11 # deadlock waits up to 10 us: 1 # deadlock waits up to 22 us: 55 # deadlock waits up to 49 us: 284 # deadlock waits up to 106 us: 585 # deadlock waits up to 232 us: 1721 # deadlock waits up to 507 us: 5298 # deadlock waits up to 1105 us: 19938 # deadlock waits up to 2408 us: 18588 # deadlock waits up to 5247 us: 246 # deadlock waits up to 11433 us: 1166 # deadlock waits up to 24908 us: 5 # timed out: 6139 # successes: 48151 Note that it has much less timeouts, much more successes, twice more "depth exceeded", and "waits up" stats are not looking stupid (unlike zeros under Windows). By the way, I wonder how the "waits up" can be zeros in Windows, given that the code which prints them is: for (cnt=0; cnt < WT_WAIT_STATS; cnt++) if (wt_wait_stats[cnt]>0) diag("deadlock waits up to %7llu us: %5u", wt_wait_table[cnt], wt_wait_stats[cnt]); so is supposed to diag() only if the statistic is >0.
[16 Oct 2008 20:02]
Guilhem Bichot
Note that the line which I commented out, my_wincond.c, is from a relatively new implementation: it was introduced, together with a significant rewrite of my_wincond.c, in revision sp1r-Kristofer.Pettersson@naruto.-20070215130821-52591 . It may be a correct implementation, or we may have found a bug in it, which is not impossible as this unit test uses rwlocks a lot (more than the average Server code).
[21 Oct 2008 19:26]
Sergei Golubchik
committed: http://lists.mysql.com/maria/246 approved on irc
[7 Nov 2008 15:25]
Guilhem Bichot
fixed in 6.0.8
[8 Jan 2009 10:42]
MC Brown
Test case failure. No documentation needed.