Bug #83532 InnoDB mutex bs: memory barriers
Submitted: 25 Oct 2016 14:11 Modified: 28 Nov 2016 5:28
Reporter: Sergey Vojtovich Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:ARM

[25 Oct 2016 14:11] Sergey Vojtovich
Description:
Decent mutex implementation in uncontested scenario should do:
- on mutex acquisition 1 atomic read-modify-write operation with acquire memory barrier
- on mutex release 1 atomic read-modify-write operation with release memory barrier

See http://en.cppreference.com/w/c/atomic/memory_order#Release-Acquire_ordering for Release-Acquire ordering explanation.

The above is platform independent, consistent and performance wise solution.

What does InnoDB TTASEventMutex do is quite hard to follow due to numerous inline methods and ifdefs, but still:
- on mutex acquisition
  newer gcc on PPC and ARM: full memory barrier (not performance wise, acquire is enough!)
  newer gcc all but PPC and ARM: release memory barrier (bug!!!)
  older gcc, all platforms: release memory barrier (bug!!!)
- on mutex release
  newer gcc on PPC and ARM: full memory barrier (not performance wise, release is enough!)
  newer gcc all but PPC and ARM: release memory barrier (good!)
  older gcc, all platforms: release memory barrier (good!)

Traces:
enter() / try_lock() / tas_lock() / TAS() / os_atomic_test_and_set() / __atomic_exchange(ptr, &new_val,  &ret, __ATOMIC_SEQ_CST)
enter() / try_lock() / tas_lock() / TAS() / os_atomic_test_and_set() / __atomic_exchange(ptr, &new_val,  &ret, __ATOMIC_RELEASE)
enter() / try_lock() / tas_lock() / TAS() / os_atomic_test_and_set() / __sync_lock_test_and_set(ptr, new_val)

exit() / tas_unlock() / TAS() / os_atomic_test_and_set() / __atomic_exchange(ptr, &new_val,  &ret, __ATOMIC_SEQ_CST)
exit() / tas_unlock() / TAS() / os_atomic_test_and_set() / __atomic_exchange(ptr, &new_val,  &ret, __ATOMIC_RELEASE)
exit() / tas_unlock() / TAS() / os_atomic_test_and_set() / __sync_lock_test_and_set(ptr, new_val)

Even though it may work somehow, this implementation is prone to performance issues and race conditions which sometimes take months to fix.

How to repeat:
Code analysis.

Suggested fix:
Complexity and frailty of this can be greatly reduced by:
- reducing number of intermediate calls, specifically tas_[un]lock(), TAS(), os_atomic_test_and_set() are misleading
- combining m_waiters and m_lock_word
- since MySQL is moving towards C11, use C11 atomics API (can be simulated by gcc atomic, or more heavyweight sync builtins/windows Interlocked ops)
- issue proper memory barriers
[25 Oct 2016 14:26] Sergey Vojtovich
Apparently there's a mistake in description of what older gcc does: in fact it does acquire (not release) both on mutex acquisition and release. And thus overall situation is even worse:
- on mutex acquisition
  newer gcc on PPC and ARM: full memory barrier (not performance wise, acquire is enough!)
  newer gcc all but PPC and ARM: release memory barrier (bug!!!)
  older gcc, all platforms: acquire memory barrier (good!)
- on mutex release
  newer gcc on PPC and ARM: full memory barrier (not performance wise, release is enough!)
  newer gcc all but PPC and ARM: release memory barrier (good!)
  older gcc, all platforms: acquire memory barrier (bug!!!)
[25 Oct 2016 14:33] Sergey Vojtovich
There's also a comment in front of tas_unlock():

        /** In theory __sync_lock_release should be used to release the lock.
        Unfortunately, it does not work properly alone. The workaround is
        that more conservative __sync_lock_test_and_set is used instead. */

I won't dive into details why assumption that __sync_lock_test_and_set is more conservative than __sync_lock_release is plain wrong.

This comment is now misleading because tas_unlock() may issue either acquire or release or full memory barrier depending on platform.
[25 Oct 2016 19:20] Mark Callaghan
I reported possible perf problems in the new code, but I don't work on InnoDB perf any more.
https://bugs.mysql.com/bug.php?id=73763
[28 Nov 2016 5:28] Umesh Shastry
Hello Sergey Vojtovich,

Thank you for your bug report. At this time we don't see any correctness issues. Our testing on Intel HW doesn't demonstrate any issues. We will triage and analyse the suggested changes as part of our process.

Thanks,
Umesh