Bug #73168 locking/unlocking overhead for THD::LOCK_thd_data
Submitted: 2 Jul 2014 4:14 Modified: 2 Jul 2014 11:15
Reporter: hui liu (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.6/5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: LOCK_thd_data, set_query

[2 Jul 2014 4:14] hui liu
Description:
For LOCK_thd_data mutex, it's major used to protect thd->query and thd->query_length, accessed from other SHOW related threads:

  /**
    Protects THD data accessed from other threads:
    - thd->query and thd->query_length (used by SHOW ENGINE
      INNODB STATUS and SHOW PROCESSLIST
    - thd->mysys_var (used by KILL statement and shutdown).
    Is locked when THD is deleted.
  */
  mysql_mutex_t LOCK_thd_data;

While seen from the PS, it's called too much for normal workload, and major in THD::set_query_xx():

root@performance_schema 05:35:59>select e.EVENT_NAME, e.SOURCE, e.OPERATION, count(*) as cnt from events_waits_current e join threads t using(thread_id) where type='foreground' and processlist_id not in (1, @@pseudo_thread_id) group by e.EVENT_NAME, e.SOURCE, e.OPERATION order by cnt desc;
+-----------------------------------------+-------------------+-------------+------+
| EVENT_NAME | SOURCE | OPERATION | cnt |
+-----------------------------------------+-------------------+-------------+------+
| wait/synch/mutex/sql/THD::LOCK_thd_data | sql_class.cc:4662 | lock | 3992 |
| wait/io/file/innodb/innodb_log_file | fil0fil.cc:5895 | sync | 2 |
| wait/io/table/sql/handler | handler.cc:7347 | insert | 2 |
| wait/synch/mutex/sql/THD::LOCK_thd_data | sql_base.cc:1492 | lock | 1 |
| wait/lock/table/sql/handler | thr_lock.c:556 | read normal | 1 |
| wait/io/table/sql/handler | handler.cc:2868 | fetch | 1 |
| wait/io/table/sql/handler | handler.cc:2833 | fetch | 1 |
+-----------------------------------------+-------------------+-------------+------+

I added some counter stats for THD::set_query_xxx() and found THD::set_query() is executed  87173.2 per second!

$gdb -p 113129 -ex " set pagination 0" -ex "print nq" -ex "print nqi" -ex "print ni" -batch && sleep 5 && gdb -p 113129 -ex " set pagination 0" -ex "print nq" -ex "print nqi" -ex "print ni" -batch

$1 = 4972652
$2 = 0
$3 = 2486397

$1 = 5408518
$2 = 0
$3 = 2704340

After remove the locking/unlocking in THD::set_query_xxx(), and re-run the load,  this mutex has gone in PS:

root@performance_schema 07:14:16>select e.EVENT_NAME, e.SOURCE, e.OPERATION, count(*) as cnt from events_waits_current e join threads t using(thread_id) where type='foreground' and processlist_id not in (1, @@pseudo_thread_id) group by e.EVENT_NAME, e.SOURCE, e.OPERATION order by cnt desc;
+-------------------------------------+-------------------+-------------------+------+
| EVENT_NAME | SOURCE | OPERATION | cnt |
+-------------------------------------+-------------------+-------------------+------+
| idle | mysqld.cc:952 | idle | 3415 |
| wait/io/file/sql/binlog | mf_iocache.c:1788 | write | 525 |
| wait/io/file/innodb/innodb_log_file | fil0fil.cc:5701 | write | 34 |
| wait/io/table/sql/handler | handler.cc:2833 | fetch | 8 |
| wait/io/table/sql/handler | handler.cc:2868 | fetch | 8 |
| wait/io/table/sql/handler | handler.cc:7347 | insert | 5 |
| wait/lock/table/sql/handler | thr_lock.c:556 | write allow write | 3 |
| wait/io/table/sql/handler | handler.cc:2992 | fetch | 1 |
| wait/io/file/sql/binlog | binlog.cc:6655 | sync | 1 |
+-------------------------------------+-------------------+-------------------+------+
9 rows in set (2.14 sec)

How to repeat:
PS monitor

Suggested fix:
As the THD::set_query_xxx() is only conflicts with mysqld_list_processes(SHOW PROCESSLIST) and fill_schema_processlist (SHOW ENGINE INNODB STATUS), we can benefit with less overhead if replace mutex with lock-free atomic operations.
[2 Jul 2014 11:15] MySQL Verification Team
Hello Liu,

Thank you for the report.
I observed this while using sysbench

// 5.6.20 with defaults settings
// mysqld with --performance_schema_instrument='wait/synch/mutex/%=on'
** set up proper consumers

// Sysbench

[root@cluster-repo mysql-advanced-5.6.20]# sysbench --test=oltp --oltp-table-size=1000000 --mysql-db=dbtest --mysql-user=root --mysql-socket=/tmp/72720M/72720M.sock prepare
sysbench 0.4.12:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Creating table 'sbtest'...
Creating 1000000 records in table 'sbtest'...
[root@cluster-repo mysql-advanced-5.6.20]# sysbench --test=oltp --oltp-test-mode=complex --oltp-read-only=off --oltp-table-size=1000000 --num-threads=900 --max-time=360 --max-requests=0 --mysql-db=dbtest --mysql-user=root --mysql-socket=/tmp/72720M/72720M.sock run
sysbench 0.4.12:  multi-threaded system evaluation benchmark

No DB drivers specified, using mysql
Running the test with following options:
Number of threads: 900

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!

// 
mysql> select version();
+---------------------------------------+
| version()                             |
+---------------------------------------+
| 5.6.20-enterprise-commercial-advanced |
+---------------------------------------+
1 row in set (0.00 sec)

mysql> use performance_schema;select e.EVENT_NAME, e.SOURCE, e.OPERATION, count(*) as cnt from events_waits_current e join threads t using(thread_id) where type='foreground' and processlist_id not in (1, @@pseudo_thread_id) group by e.EVENT_NAME, e.SOURCE, e.OPERATION order by cnt desc;
Database changed
+-----------------------------------------+-------------------+-----------+-----+
| EVENT_NAME                              | SOURCE            | OPERATION | cnt |
+-----------------------------------------+-------------------+-----------+-----+
| wait/synch/mutex/sql/THD::LOCK_thd_data | sql_class.cc:4467 | lock      | 250 |
| idle                                    | mysqld.cc:935     | idle      |  89 |
| wait/synch/mutex/sql/LOCK_open          | sql_base.cc:2936  | lock      |  36 |
| wait/synch/mutex/sql/THD::LOCK_thd_data | sql_class.h:3983  | lock      |   4 |
| wait/synch/mutex/innodb/dict_sys_mutex  | dict0dict.cc:1026 | lock      |   3 |
| wait/io/table/sql/handler               | handler.cc:2788   | fetch     |   1 |
+-----------------------------------------+-------------------+-----------+-----+
6 rows in set (0.08 sec)

Thanks,
Umesh
[25 Feb 2015 10:51] MySQL Verification Team
Hello Liu,

Could you please provide more details about real performance impact? Do you have any test case/ real time stats which confirms the performance impact etc? 

Thanks,
Umesh