Bug #71708 70768 fix perf regression: high rate of RW lock creation and destruction
Submitted: 13 Feb 2014 15:19 Modified: 9 Oct 2014 7:28
Reporter: Laurynas Biveinis (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.6+ OS:Any
Assigned to: CPU Architecture:Any
Tags: dict, performance, regression, rwlock

[13 Feb 2014 15:19] Laurynas Biveinis
Description:
5.6.16 fixed bug 70768 by changing dict_table_t stat-related member locking. Instead of a static array of 64 shared latches each dict_table_t object got its own dynamically created and destroyed latch.

But the short-lived private dict_table_t objects such as those used in ibuf merge, got these dynamic rwlatches too. Each dict_table_t::stats_latch creation performs heap memory allocations, acquires rw_lock_list_mutex, creates two OS events, which in turn acquire os_sync_mutex.

The same also applies to dict_table_t::autoinc_mutex, which is not a 5.6.16 regression, but previously existing wasted work.

How to repeat:
Sysbench OLTP RW at high concurrency. We saw a TPS regression on Percona Server in the I/O-bound case. It should be visible on CPU-bound workloads even more.

The TPS regression of Percona Server is not reproducible on MySQL. But perf profile shows that the issue is there.

Suggested fix:
Do not create and initialize rwlocks for dummy table objects.
[13 Feb 2014 15:45] Laurynas Biveinis
Parts of Perf report on MySQL 5.6.16:

-  16,76%  mysqld  mysqld               [.] ut_delay(unsigned long)
   - ut_delay(unsigned long)
      - 70,50% mutex_spin_wait(ib_mutex_t*, char const*, unsigned long)
         + 25,68% pfs_mutex_enter_func(ib_mutex_t*, char const*, unsigned long)
         - 24,70% rw_lock_create_func(rw_lock_t*, char const*, unsigned long)
              dict_mem_table_create(char const*, unsigned long, unsigned long, unsigned long, unsigned long)
            + ibuf_build_entry_from_ibuf_rec_func(unsigned char const*, mem_block_info_t*, dict_index_t**)
         - 22,26% rw_lock_free_func(rw_lock_t*)
            + dict_mem_table_free(dict_table_t*)
[13 Feb 2014 15:46] Laurynas Biveinis
Credit shared with Alexey Stroganov
[14 Feb 2014 17:01] MySQL Verification Team
Hi Laurynas,

First of all, give my regards to Alexey !!!

Next, we truly need to be able to reproduce a regression on MySQL server in order to treat this as a regression bug. This does not have to be some huge regression; it only needs to be a repeatable one. The fact that profiling structure is changed is not sufficient to treat this as a verified bug, id est a bug that will be always repeated.

I have no doubts that a solution for short-lived private dict_table_t objects and dict_table_t::autoinc_mutex looks as a potential regression, especially under high concurrency. 

We would like to have the way in which to repeat this bug. When you manage to come to such a test case, we would be also very much obliged if you could upload exactly the scripts that you used. SQL scripts and mysqlslap arguments would suffice.

Thank you very much in advance !!
[4 Mar 2014 11:39] Laurynas Biveinis
It is possible that we won't be able to show any regression with Oracle MySQL, or that regression appears in 5.7 only, which then we won't be able to show due to lack of a recent 5.7 public push, as it needs an I/O-bound setup; Oracle MySQL may simply hit another bottleneck first. If that turns out to be the case, consider verifying as a non-regression bug.

Until Alexey replies, I am uploading a patch. Tested in our CI farm as a part of Percona Server and lightly tested on Oracle MySQL 5.6.16 (debug build passes innodb,innodb_zip,innodb_fts MTR suites).

(this comment is here mainly for the purpose of resetting the bug to Open so that I can upload the patch. Feel free to reset to Needs Feedback and consider fixing bug 69963).
[4 Mar 2014 11:40] Laurynas Biveinis
Bug 71708 patch

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug71708.patch (text/x-diff), 10.64 KiB.

[4 Mar 2014 16:23] MySQL Verification Team
I am verifying this as a non-regression bug related to the improvement of the performance.
[10 Mar 2014 6:33] zhai weixiang
I have observed a similar mutex contention when CPU capacity  was limited (by  cgroup) :

     1 ut_delay(ut0ut.cc:408),mutex_spin_wait(sync0sync.cc:505),pfs_mutex_enter_func(sync0sync.ic:251),rw_lock_free_func(sync0sync.ic:251),pfs_rw_lock_free_func(sync0rw.ic:650),dict_mem_table_free(sync0rw.ic:650),ibuf_get_volume_buffered_count_func(ibuf0ibuf.cc:3065),ibuf_get_volume_buffered(ibuf0ibuf.cc:3136),ibuf_insert_low(ibuf0ibuf.cc:3136),ibuf_insert(ibuf0ibuf.cc:3912),btr_cur_search_to_nth_level(btr0cur.cc:634),row_ins_sec_index_entry_low(row0ins.cc:2653),row_ins_sec_index_entry(row0ins.cc:2936),row_upd_sec_index_entry(row0upd.cc:1840),row_upd_sec_step(row0upd.cc:1867),row_upd(row0upd.cc:1867),row_upd_step(row0upd.cc:1867),row_update_for_mysql(row0mysql.cc:1728),ha_innobase::update_row(ha_innodb.cc:7078),handler::ha_update_row(handler.cc:7297),mysql_update(sql_update.cc:818),mysql_execute_command(sql_parse.cc:3298),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),pfs_spawn_thread(pfs.cc:1858),start_thread(libpthread.so.0),clone(libc.so.6)
      1 ut_delay(libc.so.6),mutex_spin_wait(sync0sync.cc:505),pfs_mutex_enter_func(sync0sync.ic:251),rw_lock_create_func(sync0sync.ic:251),pfs_rw_lock_create_func(sync0rw.ic:561),dict_mem_table_create(sync0rw.ic:561),ibuf_dummy_index_create(ibuf0ibuf.cc:1534),ibuf_build_entry_from_ibuf_rec_func(ibuf0ibuf.cc:1534),ibuf_get_volume_buffered_count_func(ibuf0ibuf.cc:3061),ibuf_get_volume_buffered(ibuf0ibuf.cc:3136),ibuf_insert_low(ibuf0ibuf.cc:3136),ibuf_insert(ibuf0ibuf.cc:3912),btr_cur_search_to_nth_level(btr0cur.cc:647),btr_pcur_open_low(btr0pcur.ic:440),row_search_index_entry(btr0pcur.ic:440),row_upd_sec_index_entry(row0upd.cc:1752),row_upd_sec_step(row0upd.cc:1867),row_upd(row0upd.cc:1867),row_upd_step(row0upd.cc:1867),row_update_for_mysql(row0mysql.cc:1728),ha_innobase::update_row(ha_innodb.cc:7078),handler::ha_update_row(handler.cc:7297),mysql_update(sql_update.cc:818),mysql_execute_command(sql_parse.cc:3298),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),pfs_spawn_thread(pfs.cc:1858),start_thread(libpthread.so.0),clone(libc.so.6)
      1 ut_delay(libc.so.6),mutex_spin_wait(sync0sync.cc:505),pfs_mutex_enter_func(sync0sync.ic:251),rw_lock_create_func(sync0sync.ic:251),pfs_rw_lock_create_func(sync0rw.ic:561),dict_mem_table_create(sync0rw.ic:561),ibuf_dummy_index_create(ibuf0ibuf.cc:1534),ibuf_build_entry_from_ibuf_rec_func(ibuf0ibuf.cc:1534),ibuf_get_volume_buffered_count_func(ibuf0ibuf.cc:3061),ibuf_get_volume_buffered(ibuf0ibuf.cc:3136),ibuf_insert_low(ibuf0ibuf.cc:3136),ibuf_insert(ibuf0ibuf.cc:3912),btr_cur_search_to_nth_level(btr0cur.cc:634),row_ins_sec_index_entry_low(row0ins.cc:2653),row_ins_sec_index_entry(row0ins.cc:2936),row_upd_sec_index_entry(row0upd.cc:1840),row_upd_sec_step(row0upd.cc:1867),row_upd(row0upd.cc:1867),row_upd_step(row0upd.cc:1867),row_update_for_mysql(row0mysql.cc:1728),ha_innobase::update_row(ha_innodb.cc:7078),handler::ha_update_row(handler.cc:7297),mysql_update(sql_update.cc:818),mysql_execute_command(sql_parse.cc:3298),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),pfs_spawn_thread(pfs.cc:1858),start_thread(libpthread.so.0),clone(libc.so.6)

      1 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),os_fast_mutex_lock_func(os0sync.cc:873),pfs_os_fast_mutex_lock(os0sync.cc:873),os_mutex_enter(os0sync.cc:873),os_fast_mutex_init_func(os0sync.cc:852),pfs_os_fast_mutex_init(os0sync.cc:852),os_event_create(os0sync.cc:852),mutex_create_func(sync0sync.cc:282),pfs_mutex_create_func(sync0sync.ic:341),dict_mem_table_create(sync0sync.ic:341),ibuf_dummy_index_create(ibuf0ibuf.cc:1534),ibuf_build_entry_from_ibuf_rec_func(ibuf0ibuf.cc:1534),ibuf_get_volume_buffered_count_func(ibuf0ibuf.cc:3061),ibuf_get_volume_buffered(ibuf0ibuf.cc:3136),ibuf_insert_low(ibuf0ibuf.cc:3136),ibuf_insert(ibuf0ibuf.cc:3912),btr_cur_search_to_nth_level(btr0cur.cc:647),btr_pcur_open_low(btr0pcur.ic:440),row_search_index_entry(btr0pcur.ic:440),row_upd_sec_index_entry(row0upd.cc:1752),row_upd_sec_step(row0upd.cc:1867),row_upd(row0upd.cc:1867),row_upd_step(row0upd.cc:1867),row_update_for_mysql(row0mysql.cc:1728),ha_innobase::update_row(ha_innodb.cc:7078),handler::ha_update_row(handler.cc:7297),mysql_update(sql_update.cc:818),mysql_execute_command(sql_parse.cc:3298),mysql_parse(sql_parse.cc:6235),dispatch_command(sql_parse.cc:1334),do_handle_one_connection(sql_connect.cc:982),handle_one_connection(sql_connect.cc:898),pfs_spawn_thread(pfs.cc:1858),start_thread(libpthread.so.0),clone(libc.so.6)
[1 Apr 2014 12:05] Vasil Dimov
An internally developed patch is being worked upon. I will post the revision id here once it gets pushed. Stay tuned.
[2 Apr 2014 7:31] Vasil Dimov
A patch for this bug has been pushed to mysql-trunk under vasil.dimov@oracle.com-20140401162003-zfhuxtu710846jd9

"Lazily create dict_table_t::stats_latch the first time it is used.
It may not be used at all in the lifetime of some dict_table_t objects."

It will be merged into mysql-5.6 soon.
[2 Apr 2014 12:48] Daniel Price
Fixed as of 5.7.5, and here's the changelog entry:

The fix for Bug#17699331 resulted a performance regression.
[2 Apr 2014 20:43] Laurynas Biveinis
Thanks Vasil. The lazy initialisation looks like a good idea, I'll wait for the next public push to review your patch. But note that my fix fixed it for autoinc_mutex too, and your release note entry covers only the dict latch. What is the status of autoinc_mutex then?
[10 Apr 2014 9:05] Vasil Dimov
This was fixed in 5.6.19 under vasil.dimov@oracle.com-20140409172709-qt44dvpqxrkry0pn.

The autoinc mutex issue will be fixed in mysql-trunk (5.7) only since it is not a regression in 5.6.
[2 May 2014 13:03] Mark Callaghan
From my vague memory that is probably the third big change to this code. The Facebook patch had a reasonably performant fix that could have been the second big change. The InnoDB team had their own second big change and now this? So we are waiting for the fourth big change, or maybe just a small tweak? Perhaps a more comprehensive benchmark test suite would help us get a real fix with fewer iterations.
[5 May 2014 12:57] MySQL Verification Team
Mark,

Thank you very much for your comment. We would like very much to answer your query, but we are not cognizant exactly what is the Facebook that you are writing about. Can you post an URL to your patch ???

Many thanks in advance !!!
[2 Jun 2014 14:11] Laurynas Biveinis
$ bzr log -r 5887
------------------------------------------------------------
revno: 5887
committer: Vasil Dimov <vasil.dimov@oracle.com>
branch nick: mysql-5.6
timestamp: Wed 2014-04-09 20:27:09 +0300
message:
  Backport the following changeset from mysql-trunk to mysql-5.6:
  
    ** revision-id: vasil.dimov@oracle.com-20140401162003-zfhuxtu710846jd9
    ** committer: Vasil Dimov <vasil.dimov@oracle.com>
    ** branch nick: mysql-trunk
    ** timestamp: Tue 2014-04-01 19:20:03 +0300
    ** message:
    **   Fix Bug#71708 70768 fix perf regression: high rate of RW lock creation
    **   and destruction
    **
    **   Lazily create dict_table_t::stats_latch the first time it is used.
    **   It may not be used at all in the lifetime of some dict_table_t objects.
    **
    **   Approved by:	Bin (rb:4739)
[6 Jun 2014 12:28] Daniel Price
Updated changelog entry which has been backported to 5.6.19:

Fixed as of the upcoming 5.6.19, 5.7.5 release, and here's the changelog entry:

The fix for Bug#17699331 caused a high rate of read/write lock creation
and destruction which resulted in a performance regression. 

Thank you for the bug report.
[8 Oct 2014 15:16] Laurynas Biveinis
5.7.5 still has autoinc_mutex (and index->zip_pad.mutex as in bug 73361) created unconditionally.

Is this tracked somewhere or should I file a new bug for autoinc_mutex?
[8 Oct 2014 16:28] MySQL Verification Team
Laurynas,

This bug is about RW locks, so for auto_inc mutex create a new bug.
[9 Oct 2014 7:28] Laurynas Biveinis
Actually, bug 73361 looks like it's about both autoinc_mutex and index->zip_pad.mutex.

So I won't open a new one unless you decide to address only half of 73361 for some reason.