Bug #79510 mysql_real_connect is not thread safe
Submitted: 3 Dec 2015 16:41 Modified: 25 May 2016 17:59
Reporter: Chris Giard Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: C API (client library) Severity:S2 (Serious)
Version:5.6 and 5.7 OS:Any
Assigned to: CPU Architecture:Any

[3 Dec 2015 16:41] Chris Giard
Description:
When calling mysql_real_connect() after setting MYSQL_READ_DEFAULT_FILE or MYSQL_READ_DEFAULT_GROUP, mysql_real_connect will call my_load_defaults(), located in mysys_ssl/my_default.cc.  my_load_defaults() will then toggle is_login_file, a global static variable in that file, as it calls my_search_option_files(), which has conditional logic based on that global variable.  This is inherently thread-unsafe, and in usage causes ~12% of 200 parallel threads making > 1000 connections to not properly load options from ~/.my.cnf.

Additionally, my_search_option_files() uses multiple other global static variables in that file (listed below), none of which are protected by a mutex.
defaults_already_read
found_no_defaults
my_defaults_extra_file
my_defaults_extra_file_buffer
my_defaults_file
my_defaults_file_buffer
my_defaults_group_suffix
my_login_file
my_login_path

This is a regression from 5.5, which didn't have the is_login_file flag, but still had the races with the other named variables.

How to repeat:
Recreation is non-deterministic because it is a thread race condition.  Generally:
1) Create a .my.cnf with a user and password
2) Using the C API, create a multi-threaded client.
3) Create a thread pool of 200 threads.
4) In each thread, set MYSQL_READ_DEFAULT_GROUP as appropriate.
5) Connect to a server (all connections can go to the same server)
6) Some percentage will fail because they won't load the defaults file.

Note, this was encountered on Cent6.5 x86_64, discovered in 5.6.27.

Suggested fix:
A combination of the following is probably needed to do this in a performant fashion:
1) Rewrite the functions to not use a global variable for is_login_file and instead pass the option to my_search_option_files().
2) Reduce the other global variables used to a minimum.
3) Use mutexes as appropriate to avoid races with the remaining global variables.
[4 Dec 2015 16:34] MySQL Verification Team
Hi,

Thank you for your bug report.

Some notes .... It is inherently safe to access scalar variables from different threads simultaneously. These are integer and integer-based scalar types. Those do not require mutex protection. What could possibly require protection are array types, like strings.

Also, thread-unsafe would be a code that is not protecting blocks of code operating on inter-dependent global variables.

We have no yet encountered such a case, so we would make need a fully repeatable test case. In this case it is a multi-threaded code that produces the same buggy behavior. In this case, buggy behavior would mean not reading properly .my.cnf file. 

Also, we require to know whether these are different users , each reading its own configuration file, or something else. 

The repeatable test case, in this bug it is example code, would prove that libmysqlclient is not thread-safe, by displaying some values differently then others.
[4 Dec 2015 16:56] Chris Giard
Parallel MySQL, original code by Domas, updated to support SSL connections

Attachment: pmysql.cc (application/octet-stream, text), 13.55 KiB.

[4 Dec 2015 16:56] Chris Giard
Makefile for pmysql.cc

Attachment: Makefile (application/octet-stream, text), 274 bytes.

[4 Dec 2015 17:13] Chris Giard
The code is thread-unsafe because of is_login_file, which is a global variable toggled between FALSE and TRUE in my_load_defaults() that affects the code path in my_search_options_file() (as explained in my original message).

Software required to reproduce is attached.  You will also need to setup a mysql server as follows:
1) CREATE DATABASE pmysql;
2) GRANT ALL PRIVILEGES ON pmysql.* TO pmysql@'%' IDENTIFIED BY 'pmysql';

Create a .my.cnf with the following content:
[client]
user=pmysql
password=pmysql

Then do the following:
for i in $( seq 1 10000 ); do echo HOSTNAME_OF_MYSQL_SERVER; done | pmysql -B pmysql -Q "select version()"

You will get a percentage of failures, mostly at the beginning of the run.  The errors will indicate that access was denied for your unix user name, NOT the pmysql user, indicating the mysql_real_connect didn't properly load the .my.cnf.
[4 Dec 2015 17:32] MySQL Verification Team
I am a bit sceptic after myy rough analysis of the code. I doubt that it is is_login_file per se. If there is a bug at all, it is somewhere else.
[4 Dec 2015 17:44] Chris Giard
Logic required to see the error:

T1: calls mysql_real_connect()
T1: sub-call to my_load_defaults()
T1: sub-call to my_search_options_file() (is_login_file is default value, supposed to be FALSE)
T1: sub-call returns from my_search_options_file()
T1: sets is_login_file to TRUE
T2: calls mysql_real_connect()
T2: sub-call to my_load_defaults()
T1: sub-call to my_search_options_file()
T2: sub-call to my_search_options_file() (is_login_file is TRUE, but should be default of FALSE)
T1: sub-call returns from my_search_options_file()
T2: sub-call returns from my_search_options_file()
T2: sets is_login_file to TRUE
T1: sets is_login_file to FALSE
T1: sub-call returns from my_load_defaults()
T2: sub-call to my_search_options_file() (is_login_file is FALSE, but should be TRUE)
T2: sub-call returns from my_search_options_file()
T2: sets is_login_file to FALSE
T2: sub-call returns from my_load_defaults()
[4 Dec 2015 17:49] Chris Giard
Also, a performance hurting "FIX" is to add a mutex_lock in my_load_defaults() immediately before the first call to my_search_options_file() and an unlock after the "is_login_file = FALSE" line.  This is not the proper fix, as it does hurt performance, but it does make it thread safe and removes all the cases I have seen where it doesn't load the .my.cnf properly.
[7 Dec 2015 16:02] MySQL Verification Team
Hi,

It turned out that it was mistake to make 

is_login_file

as a global file. It makes it thread-unsafe. This solution will be changed in one of two available thread-safe option.

Thank you for your report.
[8 Dec 2015 5:52] MySQL Verification Team
Just to share my experience on this.  My load tester app has been unstable after the server crashes.  I didn't debug it yet (easier to put it in a loop in a batch file instead). 

When server crashed and connection is lost,  it ends up like this:
Please note I do not use the my.cnf reader function that client uses. I use boost's one instead....
One must wonder why is mysql_real_connect getting confused and thinking .mylogin.config is a my.cnf at all!!

--------
#server died!!!: 2013 (Lost connection to MySQL server during query)

  select a . c0002 as c from t0011 as a left join t0011 as
 b on a . c0001 > b . c0002 left join t0011 as c on b. c0000
 <> c. c0001
 ;

#server died!!!: 2013 (Lost connection to MySQL server during query)

  select sql_buffer_result distinct c0005 from t0018 into @f

 ;

[ERROR] Found option without preceding group in config file C:\Users\anon\AppData\Roaming\MySQL\.mylogin.cnf at line 2!
[ERROR] Found option without preceding group in config file C:\Users\anon\AppData\Roaming\MySQL\.mylogin.cnf at line 2![ERROR] Fatal error in defaults handling. Program aborted!
[ERROR] Found option without preceding group in config file C:\Users\anon\AppData\Roaming\MySQL\.mylogin.cnf at line 2!

[ERROR] Fatal error in defaults handling. Program aborted!
[ERROR] Fatal error in defaults handling. Program aborted!

C:\code\GypsyII\GypsyII\x64\Release>
--------
[8 Dec 2015 12:12] MySQL Verification Team
This was reported internally two years ago!
Bug 18017023 - UNPROTECTED STATIC GLOBAL DISPLAYS DIFFERENT STATE ACROSS THREADS
[1 Mar 2016 13:52] MySQL Verification Team
http://bugs.mysql.com/bug.php?id=80574 marked as duplicate of this one.
[1 Mar 2016 16:02] MySQL Verification Team
Checked and it is still there.
[25 May 2016 17:59] Paul DuBois
Posted by developer:
 
Noted in 5.6.32, 5.7.14 changelogs.

mysql_real_connect() was not thread-safe when invoked with the
MYSQL_READ_DEFAULT_FILE or MYSQL_READ_DEFAULT_GROUP option enabled.