Bug #57546 rwlock prefer reader in mdl.cc
Submitted: 19 Oct 2010 3:09
Modified: 11 Nov 2010 5:15
Reporter: davidxu xu
Status: Can't repeat
Category: MySQL Server: Locking
Severity: S3 (Non-critical)
Version: 5.5.4
OS: FreeBSD
Assigned to: Dmitry Lenev
CPU Architecture: Any
Tags: reader, rwlock

[19 Oct 2010 3:09] davidxu xu
Description:
I read the discussion of rwlock deadlocks in the MySQL 5.5.4 source file mdl.cc
and found that it uses prefer-reader mode for its rwlocks. That may not be
necessary on FreeBSD. I recently added a Linux-compatible shim, similar to
PTHREAD_RWLOCK_PREFER_READER, because MySQL is otherwise using its own rwlock implementation, which is not as efficient as the native FreeBSD rwlock.
On FreeBSD, once a thread holds a read lock on any rwlock, its later read-lock
requests on any rwlock prefer the reader and ignore pending writers.
This is a very simple but effective way to handle deadlocks on rwlocks.
I hope MySQL will fix this issue on FreeBSD, as I may back out the Linux
compatibility shim.
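
For readers unfamiliar with the Linux flag discussed in this report, here is a minimal sketch of how a reader-preferring rwlock is typically requested through glibc's non-portable attribute call. The helper name init_prefer_reader_rwlock is hypothetical, and the __GLIBC__ guard is an assumption of this sketch (other platforms simply get their default policy). It is not taken from mdl.cc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialise *rw with reader preference where glibc's
   non-portable attribute is available; elsewhere use the default policy.
   Returns 0 on success or an error number on failure. */
int init_prefer_reader_rwlock(pthread_rwlock_t *rw)
{
    pthread_rwlockattr_t attr;
    int rc;

    assert(rw != NULL);

    rc = pthread_rwlockattr_init(&attr);
    if (rc != 0)
        return rc;

#if defined(__GLIBC__)
    /* Ask that new read-lock requests are not held back by pending writers. */
    rc = pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_READER_NP);
#endif

    if (rc == 0)
        rc = pthread_rwlock_init(rw, &attr);

    pthread_rwlockattr_destroy(&attr);
    return rc;
}
```

With reader preference, a thread that already holds a read lock can take another read lock even while a writer is waiting, which is the property the deadlock discussion depends on.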

Regards,
David Xu

How to repeat:
See mdl.cc, which discusses the rwlock deadlock.

Suggested fix:
Just use the native pthread_rwlock_t on FreeBSD; don't reinvent it.
[21 Oct 2010 15:07] Sveta Smirnova
Thank you for the report.

Verified as described.
[10 Nov 2010 15:18] Dmitry Lenev
Hello David!

First of all thanks a lot for your feedback!

Starting with server version 5.5.7-rc, we no longer use system rwlocks in the mdl.cc subsystem on any platform. Instead, on all platforms this subsystem uses our own implementation of a special kind of rwlock that is heavily optimized for this particular scenario.

There are two reasons for this:

1) We have found that even Linux's PTHREAD_RWLOCK_PREFER_READER_NP rwlocks do not provide strong enough guarantees about the order in which locks are acquired. As a result, bug #56405 popped up. Our own special kind of rwlock provides the needed guarantees and solves this problem.

2) The way rwlocks are used in mdl.cc is unusual for rwlocks (for example, write-lock requests are frequent and read-lock requests are rare). By coding our own special kind of rwlock, heavily optimized for this particular usage, we were able to get a single implementation that performs well on different platforms. In particular, this step allowed us to solve bug #56585.

Note that server code other than mdl.cc still uses normal system rwlocks if available, but this code doesn't rely on the PTHREAD_RWLOCK_PREFER_READER_NP flag. So you can now safely back out the Linux compatibility shim.

Since the problems with the PTHREAD_RWLOCK_PREFER_READER_NP flag and with the inefficient custom rwlock implementation should no longer be repeatable with 5.5.7, I'm changing the status of this request to "Can't repeat".

Please let us know, or feel free to re-open this request, if you experience performance problems on FreeBSD with the new implementation of mdl.cc's rw_pr_lock_t.

Thanks once again!
[11 Nov 2010 5:15] davidxu xu
The pthread rwlock in FreeBSD 8.0 and later is lockless; it uses atomic operations.
If you implement your own rwlock with a mutex and a condition variable, you will have lock contention: the faster the code protected by your rwlock executes, the more lock contention there will be. In severe cases, such as the following code:

int temp;
pthread_rwlock_rdlock(&rw);
temp = global_value;
pthread_rwlock_unlock(&rw);

The above code will encounter severe lock contention if it is executed
frequently and you have used pthread_mutex and a condition variable to
implement your own rwlock.
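
To make the concern concrete, here is a minimal sketch of a textbook rwlock built from a mutex and a condition variable. All names (simple_rwlock_t and the simple_* functions) are hypothetical and this is not MySQL's code. Note that every read lock and every read unlock takes the internal mutex, so concurrent readers of a very short critical section end up serialising on that mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical textbook rwlock: one mutex, one condition variable. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  changed;  /* broadcast whenever the lock may be free */
    int             readers;  /* threads currently holding a read lock */
    bool            writer;   /* true while a thread holds the write lock */
} simple_rwlock_t;

void simple_rwlock_init(simple_rwlock_t *l)
{
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->changed, NULL);
    l->readers = 0;
    l->writer = false;
}

void simple_rdlock(simple_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);        /* every reader contends here ... */
    while (l->writer)
        pthread_cond_wait(&l->changed, &l->mu);
    l->readers++;
    pthread_mutex_unlock(&l->mu);
}

void simple_rdunlock(simple_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);        /* ... and again here */
    assert(l->readers > 0);
    if (--l->readers == 0)
        pthread_cond_broadcast(&l->changed);
    pthread_mutex_unlock(&l->mu);
}

void simple_wrlock(simple_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);
    while (l->writer || l->readers > 0)
        pthread_cond_wait(&l->changed, &l->mu);
    l->writer = true;
    pthread_mutex_unlock(&l->mu);
}

void simple_wrunlock(simple_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);
    l->writer = false;
    pthread_cond_broadcast(&l->changed);
    pthread_mutex_unlock(&l->mu);
}
```

Wrapping the four-line read of global_value in simple_rdlock/simple_rdunlock costs two mutex acquisitions per read, which is where the contention would come from.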

Regards,
David Xu
[12 Nov 2010 7:33] Dmitry Lenev
Hello David!

Indeed, in the general case a typical rwlock implementation based on a mutex plus a condition variable will perform worse than any reasonable system implementation of rwlock.

But in mdl.cc rwlocks are used in a rather special way. The most common operations on them are write-lock/unlock; read-lock/unlock are not supposed to happen often. Contention-wise, in such a scenario it would probably make more sense to use mutexes rather than rwlocks, but unfortunately the code in mdl.cc needs rwlocks in order to avoid deadlocks.

So in mdl.cc we now use our custom implementation of rwlock, which is specifically optimised for this particular case. For example, in this implementation the wr-lock operation is equivalent to a mutex lock
in the most common case (even in the most frequent contention scenario), and the unlock operation is equivalent to a mutex unlock.
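
One way such a write-optimised, reader-preferring lock could be structured is sketched below. This is an illustration under stated assumptions, not the actual rw_pr_lock_t code: the names pr_rwlock_t and pr_* are hypothetical, and separate unlock functions for readers and writers are a simplification. The writer keeps the internal mutex for the whole write-locked period, so an uncontended or writer-only workload pays exactly one mutex lock and one mutex unlock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical reader-preferring lock tuned for frequent write locks. */
typedef struct {
    pthread_mutex_t mu;               /* held while the write lock is held */
    pthread_cond_t  no_readers;       /* broadcast when the last reader leaves */
    unsigned        active_readers;   /* threads holding a read lock */
    unsigned        writers_waiting;  /* writers blocked on active readers */
} pr_rwlock_t;

void pr_init(pr_rwlock_t *l)
{
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->no_readers, NULL);
    l->active_readers = 0;
    l->writers_waiting = 0;
}

void pr_wrlock(pr_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);
    if (l->active_readers != 0) {     /* rare path: wait for readers to leave */
        l->writers_waiting++;
        while (l->active_readers != 0)
            pthread_cond_wait(&l->no_readers, &l->mu);
        l->writers_waiting--;
    }
    /* Return with mu still held: this blocks other writers and new readers. */
}

void pr_wrunlock(pr_rwlock_t *l)
{
    pthread_mutex_unlock(&l->mu);     /* write unlock is just a mutex unlock */
}

void pr_rdlock(pr_rwlock_t *l)
{
    /* Getting the mutex proves no writer is active; waiting writers have
       released it inside pthread_cond_wait, so readers go first. */
    pthread_mutex_lock(&l->mu);
    l->active_readers++;
    pthread_mutex_unlock(&l->mu);
}

void pr_rdunlock(pr_rwlock_t *l)
{
    pthread_mutex_lock(&l->mu);
    assert(l->active_readers > 0);
    if (--l->active_readers == 0 && l->writers_waiting > 0)
        pthread_cond_broadcast(&l->no_readers);
    pthread_mutex_unlock(&l->mu);
}
```

The trade-off in this sketch is that read locks are no cheaper than in a plain mutex-based design; the benefit appears only when write locks dominate, which is the usage pattern described for mdl.cc.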

We did some testing (running sysbench tests, which usually expose contention problems quite well) and found that this special implementation performs comparably to system rwlocks on Linux and Windows (in our rather special case). So we have decided to use it for mdl.cc on all platforms.

Please let us know if you encounter any specific problems with it on FreeBSD. Once again, thank you for your feedback!