MySQL Bugs: #56666: Race condition between the server main thread and the kill server thread

Bug #56666	Race condition between the server main thread and the kill server thread
Submitted:	8 Sep 2010 21:24	Modified:	12 Nov 2013 2:19
Reporter:	Marc ALFF
Status:	Closed
Category:	MySQL Server	Severity:	S2 (Serious)
Version:		OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any

Description:
Found by analysis and code review.

During server shutdown, main() and kill_server_thread() execute in parallel.

The shutdown itself is orderly, with some synchronization between these two threads, up to the point where the kill server thread sets the variable "ready_to_exit" to 1, which is the event main() waits for.

After that point, however, both threads can continue to execute cleanup code concurrently, which causes race conditions.

In particular, code such as mysqld_exit() in 5.5, and similar code in 5.1 and earlier, attempts to destroy the same mutexes and other shared resources from both threads.
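
For illustration only, here is a minimal sketch of the pattern described above and of one common way to make the shared teardown safe. It is not the server code and not necessarily the fix that was applied. Apart from "ready_to_exit", every name (lock_exit, cond_exit, LOCK_sketch, clean_up, kill_server_thread_sketch, run_shutdown_sketch) is hypothetical. Both threads reach clean_up() after ready_to_exit is set, and std::call_once ensures the mutex is destroyed by only one of them.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the shutdown handshake described in the report.
static std::mutex lock_exit;
static std::condition_variable cond_exit;
static bool ready_to_exit = false;

// Hypothetical shared resource that both threads would try to destroy.
static std::mutex *LOCK_sketch = new std::mutex;

static std::once_flag cleanup_once;
static std::atomic<int> cleanup_runs{0};

// Shared teardown. Without the once-guard, two concurrent callers could
// both delete LOCK_sketch, which is the kind of race the report describes.
static void clean_up() {
    std::call_once(cleanup_once, [] {
        assert(LOCK_sketch != nullptr);  // a second teardown would trip here
        delete LOCK_sketch;
        LOCK_sketch = nullptr;
        ++cleanup_runs;
    });
}

// Plays the role of the kill server thread: signal main, then clean up.
static void kill_server_thread_sketch() {
    {
        std::lock_guard<std::mutex> guard(lock_exit);
        ready_to_exit = true;
    }
    cond_exit.notify_all();
    clean_up();  // runs concurrently with the clean_up() call in main's path
}

// Plays the role of main(): wait for ready_to_exit, then clean up as well.
// Returns how many times the teardown body actually executed.
int run_shutdown_sketch() {
    std::thread killer(kill_server_thread_sketch);
    {
        std::unique_lock<std::mutex> lk(lock_exit);
        cond_exit.wait(lk, [] { return ready_to_exit; });
    }
    clean_up();
    killer.join();  // do not leave the process while the other thread runs
    return cleanup_runs.load();
}
```

Calling run_shutdown_sketch() from a test program should report that the teardown body ran exactly once, whichever thread got there first. An alternative under the same assumptions would be to let only one thread own the cleanup and have the other simply wait for it to finish.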

That specific area of the code is not heavily stressed because, after all, servers in production are supposed to stay up for a long time and not be shut down frequently.
Stress on this area of the code (and the way the bug was found) comes from the MTR test suite, which starts and shuts down servers so many times that it increases the probability of exposing the bug.

This bug in the server shutdown is believed to be the root cause of some unexplained spurious failures in automated tests.

How to repeat:
Read the code
Follow the code after ready_to_exit=1

Suggested fix:
N/A

This is a possible root cause of bug#29650, which was never reproduced.

Found during the analysis of bug#56324.

See also bug#56760, which is caused by bug#56666.

Bug#55740 and Bug#58707 have been marked as duplicates of this one.

Increasing priority because it's causing a lot of valgrind warnings.

Noted in 5.7.3 changelog.

At server shutdown, a race condition between the main thread and
the shutdown thread could cause server failure.