Bug #77862 data race on safe mutex mp->file
Submitted: 28 Jul 2015 18:10 Modified: 11 Aug 2015 15:44
Reporter: Richard Prohaska Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.7.7, 5.7.9 OS:Linux
Assigned to: CPU Architecture:Any

[28 Jul 2015 18:10] Richard Prohaska
Description:
The safe mutex is the implementation of a MySQL mutex that includes extra data that is useful for debugging locking problems.  For example, the file name and line number of the code that locks a safe mutex is stored in the safe mutex object.

The safemutex data race occurs in 'safe_mutex_lock' function because 'mp->file' is read without holding a mutex but is written later while holding a mutex.

The fix to the safe mutex data race is simple:  just hold the 'mp->global_mutex' when reading mp->file'.

==31411== Possible data race during read of size 8 at 0x2AC2510 by thread #1
==31411== Locks held: none
==31411==    at 0x1868755: safe_mutex_lock (thr_mutex.c:54)
==31411==    by 0x1470B6C: my_mutex_lock (thr_mutex.h:195)
==31411==    by 0x1470DCA: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==31411==    by 0x1476C5B: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1867)
==31411==    by 0x1476D9C: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1892)
==31411==    by 0x147705B: sql_print_information(char const*, ...) (log.cc:1941)
==31411==    by 0x1A899F5: ib::info::~info() (ut0ut.cc:890)
==31411==    by 0x1A6C015: trx_sys_file_format_max_check(unsigned long) (trx0sys.cc:646)
==31411==    by 0x1A38432: innobase_start_or_create_for_mysql() (srv0start.cc:2107)
==31411==    by 0x1899DF3: innobase_init(void*) (ha_innodb.cc:3464)
==31411==    by 0xF6B243: ha_initialize_handlerton(st_plugin_int*) (handler.cc:778)
==31411==    by 0x158B1E2: plugin_initialize(st_plugin_int*) (sql_plugin.cc:1203)

==31411== This conflicts with a previous write of size 8 by thread #9
==31411== Locks held: 2, at addresses 0x2AC24C0 0x2AC24E8
==31411==    at 0x1868962: safe_mutex_lock (thr_mutex.c:123)
==31411==    by 0x1470B6C: my_mutex_lock (thr_mutex.h:195)
==31411==    by 0x1470DCA: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==31411==    by 0x1476C5B: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1867)
==31411==    by 0x1476D9C: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1892)
==31411==    by 0x147705B: sql_print_information(char const*, ...) (log.cc:1941)
==31411==    by 0x1A899F5: ib::info::~info() (ut0ut.cc:890)
==31411==    by 0x1AFE5CC: buf_flush_page_cleaner_coordinator (buf0flu.cc:3015)
==31411==  Address 0x2ac2510 is 80 bytes inside data symbol "LOCK_error_log"

How to repeat:
add helgrind support to 'mtr, create a mtr test that does nothing and run mtr with helgrind on mysqld.

See http://prohaska7.blogspot.com/2015/07/helgrind-and-mysql-57.html

Suggested fix:
The fix to the safe mutex data race is simple:  just hold the 'mp->global_mutex' when reading mp->file'.

https://github.com/prohaska7/mysql-server/commit/91d8f2aa6d572fb58773a6ee8414d3a965d7d506
[29 Jul 2015 11:21] MySQL Verification Team
Hello Richard,

Thank you for the report.
Observed this with 5.7.7 builds.

Thanks,
Umesh
[29 Jul 2015 11:21] MySQL Verification Team
##  Source pulled from http://dev.mysql.com/downloads/mysql/

[umshastr@hod03]/export/umesh/server/source: md5sum mysql-5.7.7-rc.tar.gz
36a96021a93ca236e76f5c90ae19cd44  mysql-5.7.7-rc.tar.gz

(patched mtr with https://github.com/prohaska7/mysql-server/commit/62ba491a8fe463453281d0b766f3fc85fd1c6cb5)

[umshastr@hod03]/export/umesh/server/source/mysql-5.7.7-rc: cmake . -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/tmp -DCMAKE_BUILD_TYPE=Debug
[umshastr@hod03]/export/umesh/server/source/mysql-5.7.7-rc: make -j 16

[umshastr@hod03]/export/umesh/server/source/mysql-5.7.7-rc/mysql-test: ./mtr --debug --helgrind donothing
Logging: ./mtr  --debug --helgrind donothing
MySQL Version 5.7.7
Turning on valgrind with helgrind for mysqld(s)
Checking supported features...
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
Checking leftover processes...
Removing old var directory...
Creating var directory '/export/umesh/server/source/mysql-5.7.7-rc/mysql-test/var'...
Installing system database...

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
worker[1] mysql-test-run: WARNING: Waited 60 seconds for /export/umesh/server/source/mysql-5.7.7-rc/mysql-test/var/mysqld.1/data/auto.cnf to be created, still waiting for 120 seconds...
worker[1] mysql-test-run: WARNING: Check-testcase failed, this could also be caused by the previous test run by this worker thread
main.donothing                           [ fail ]
        Test ended at 2015-07-29 11:45:59

CURRENT_TEST: main.donothing

Timeout for 'check-testcase' expired after 90 secondsmysqltest failed but provided no output

 - the logfile can be found in '/export/umesh/server/source/mysql-5.7.7-rc/mysql-test/var/log/main.donothing/donothing.log'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 188 seconds executing testcases

Completed: Failed 1/1 tests, 0.00% were successful.

Failing test(s): main.donothing

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

mysql-test-run: *** ERROR: there were failing test cases
[umshastr@hod03]/export/umesh/server/source/mysql-5.7.7-rc/mysql-test:

==11503== Possible data race during read of size 8 at 0x2BBED10 by thread #1
==11503== Locks held: none
==11503==    at 0x185537C: safe_mutex_lock (thr_mutex.c:54)
==11503==    by 0x147647E: my_mutex_lock (thr_mutex.h:195)
==11503==    by 0x14766DC: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==11503==    by 0x147C026: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1867)
==11503==    by 0x147C148: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1892)
==11503==    by 0x147C411: sql_print_information(char const*, ...) (log.cc:1941)
==11503==    by 0x1A5FD9B: ib::info::~info() (ut0ut.cc:890)
==11503==    by 0x1A42677: trx_sys_file_format_max_check(unsigned long) (trx0sys.cc:646)
==11503==    by 0x1A0F222: innobase_start_or_create_for_mysql() (srv0start.cc:2107)
==11503==    by 0x1878303: innobase_init(void*) (ha_innodb.cc:3464)
==11503==    by 0xF6AD83: ha_initialize_handlerton(st_plugin_int*) (handler.cc:778)
==11503==    by 0x158A9CD: plugin_initialize(st_plugin_int*) (sql_plugin.cc:1203)
==11503==
==11503== This conflicts with a previous write of size 8 by thread #9
==11503== Locks held: 2, at addresses 0x2BBECC0 0x2BBECE8
==11503==    at 0x1855589: safe_mutex_lock (thr_mutex.c:123)
==11503==    by 0x147647E: my_mutex_lock (thr_mutex.h:195)
==11503==    by 0x14766DC: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==11503==    by 0x147C026: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1867)
==11503==    by 0x147C148: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1892)
==11503==    by 0x147C411: sql_print_information(char const*, ...) (log.cc:1941)
==11503==    by 0x1A5FD9B: ib::info::~info() (ut0ut.cc:890)
==11503==    by 0x1AD1FB6: buf_flush_page_cleaner_coordinator (buf0flu.cc:3015)
[29 Jul 2015 11:49] MySQL Verification Team
// 5.7.9

[umshastr@hod03]/export/umesh/server/source: md5sum 16026794.mysql-5.7-export-8266721.tar.gz
f99c1265b5785850683e002b4a40d798  16026794.mysql-5.7-export-8266721.tar.gz

[umshastr@hod03]/export/umesh/server/source/mysql-5.7-export-8266721: cmake . -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/tmp -DCMAKE_BUILD_TYPE=Debug
[umshastr@hod03]/export/umesh/server/source/mysql-5.7-export-8266721: make -j 16

[umshastr@hod03]/export/umesh/server/source/mysql-5.7-export-8266721/mysql-test: ./mtr --debug --helgrind donothing
Logging: ./mtr  --debug --helgrind donothing
MySQL Version 5.7.9
Turning on valgrind with helgrind for mysqld(s)
Checking supported features...
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
Removing old var directory...
Creating var directory '/export/umesh/server/source/mysql-5.7-export-8266721/mysql-test/var'...
Installing system database...

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
worker[1] mysql-test-run: WARNING: Check-testcase failed, this could also be caused by the previous test run by this worker thread
main.donothing                           [ fail ]
        Test ended at 2015-07-29 13:40:36

CURRENT_TEST: main.donothing

Timeout for 'check-testcase' expired after 90 secondsmysqltest failed but provided no output

 - the logfile can be found in '/export/umesh/server/source/mysql-5.7-export-8266721/mysql-test/var/log/main.donothing/donothing.log'
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 132 seconds executing testcases

Completed: Failed 1/1 tests, 0.00% were successful.

Failing test(s): main.donothing

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

mysql-test-run: *** ERROR: there were failing test cases

==2223== Possible data race during read of size 8 at 0x2A69EB0 by thread #1
==2223== Locks held: none
==2223==    at 0x16F63F8: safe_mutex_lock (thr_mutex.c:54)
==2223==    by 0x1316B62: my_mutex_lock (thr_mutex.h:180)
==2223==    by 0x1316D23: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==2223==    by 0x131BF80: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1917)
==2223==    by 0x131C0AD: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1943)
==2223==    by 0x131C376: sql_print_information(char const*, ...) (log.cc:1992)
==2223==    by 0x194F2D3: ib::info::~info() (ut0ut.cc:900)
==2223==    by 0x1930CFC: trx_sys_file_format_max_check(unsigned long) (trx0sys.cc:681)
==2223==    by 0x18EFBC5: innobase_start_or_create_for_mysql() (srv0start.cc:2135)
==2223==    by 0x173A9A3: innobase_init(void*) (ha_innodb.cc:3652)
==2223==    by 0xE2AC10: ha_initialize_handlerton(st_plugin_int*) (handler.cc:840)
==2223==    by 0x142B596: plugin_initialize(st_plugin_int*) (sql_plugin.cc:1171)
==2223==
==2223== This conflicts with a previous write of size 8 by thread #9
==2223== Locks held: 2, at addresses 0x2A69E88 0x2A69E60
==2223==    at 0x16F6605: safe_mutex_lock (thr_mutex.c:123)
==2223==    by 0x1316B62: my_mutex_lock (thr_mutex.h:180)
==2223==    by 0x1316D23: inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int) (mysql_thread.h:714)
==2223==    by 0x131BF80: print_buffer_to_file(loglevel, char const*, unsigned long) (log.cc:1917)
==2223==    by 0x131C0AD: error_log_print(loglevel, char const*, __va_list_tag*) (log.cc:1943)
==2223==    by 0x131C376: sql_print_information(char const*, ...) (log.cc:1992)
==2223==    by 0x194F2D3: ib::info::~info() (ut0ut.cc:900)
==2223==    by 0x19C42DD: buf_flush_page_cleaner_coordinator (buf0flu.cc:2996)
==2223==
2015-07-29T11:38:40.911205Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
[11 Aug 2015 15:44] Paul DuBois
Noted in 5.7.9, 5.8.0 changelogs.

A potential race condition for the safe mutex implementation was
corrected. This implementation is enabled by default only for debug
builds.