Bug #54538 use of exclusive innodb dictionary lock limits performance
Submitted: 16 Jun 2010 1:23 Modified: 14 Dec 2010 20:13
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB Plugin storage engine Severity:S3 (Non-critical)
Version:5.1.47 OS:Any
Assigned to: Sunny Bains CPU Architecture:Any
Tags: dictionary, innodb, lock, purge, undo
Triage: Triaged: D2 (Serious) / R3 (Medium) / E3 (Medium)

[16 Jun 2010 1:23] Mark Callaghan
Description:
Innodb purge gets an S dictionary lock via row_mysql_freeze_data_dictionary.
Undo gets an X dictionary lock via row_mysql_lock_data_dictionary.

These conflict and cause many performance problems. Our current problem is a server that replicates many REPLACE statements for which the insert-then-undo execution strategy does:
* insert, fail
* undo insert
* update

In this case the undo requires the X dictionary lock. When the slave also does a lot of purge, then this conflicts with the S lock needed by purge and the server has more replication lag than needed.

Why does row_undo need an X lock?
Why does ha_innobase::info need an X lock?

This is the code for row_undo.
Can it use row_mysql_freeze_data_dictionary?

        /* Prevent DROP TABLE etc. while we are rolling back this row.
        If we are doing a TABLE CREATE or some other dictionary operation,
        then we already have dict_operation_lock locked in x-mode. Do not
        try to lock again, because that would cause a hang. */

        locked_data_dict = (trx->dict_operation_lock_mode == 0);

        if (locked_data_dict) {

                row_mysql_lock_data_dictionary(trx);
        }

Things that get an X lock:
dict_create_or_check_foreign_constraint_table         dict0crea.c
                        create        ha_innodb.cc
         innobase_rename_table        ha_innodb.cc
                          info        ha_innodb.cc
            can_switch_engines        ha_innodb.cc
foreign_key_column_is_being_renamed        ha_innodb.cc
       innobase_create_key_def    handler0alter.cc
                     add_index    handler0alter.cc
                     add_index    handler0alter.cc
                     add_index    handler0alter.cc
            prepare_drop_index    handler0alter.cc
              final_drop_index    handler0alter.cc
   row_merge_drop_temp_indexes         row0merge.c
row_discard_tablespace_for_mysql         row0mysql.c
row_import_tablespace_for_mysql         row0mysql.c
row_import_tablespace_for_mysql         row0mysql.c
  row_truncate_table_for_mysql         row0mysql.c
      row_drop_table_for_mysql         row0mysql.c
    row_mysql_drop_temp_tables         row0mysql.c
   row_drop_database_for_mysql         row0mysql.c
                      row_undo          row0undo.c
      srv_suspend_mysql_thread           srv0srv.c
           trx_rollback_active          trx0roll.c

Things that get an S lock:
row_ins_foreign_check_on_constraint           row0ins.c
row_ins_check_foreign_constraints           row0ins.c
      row_purge_parse_undo_rec         row0purge.c
   row_upd_index_is_referenced           row0upd.c
row_upd_check_references_constraints           row0upd.c
      srv_suspend_mysql_thread           srv0srv.c

How to repeat:
run on a busy slave that does:
* lots of DELETE
* lots of REPLACE for which the insert step fails.
[16 Jun 2010 14:02] Mikhail Izioumtchenko
coincides exactly with what I observed last week looking at some stack backtraces.
I tried a patch (against 5.5 codebase)
that uses freeze/unfreeze in row_undo, ::info
and one or two other places and it doesn't seem to make any harm.
[22 Jun 2010 8:10] Marko Mäkelä
I made a mistake in r1350 in the 5.1 tree in our svn repository. It was also merged to 5.0 in r1351.

It seems that the correct fix would have been to acquire the dict_sys->mutex for the dict_table_get_on_id() lookup, in addition to the dict_operation_lock in S mode.

The fix in r1350 was that I changed the S lock request (row_mysql_freeze_data_dictionary(trx)) to an X lock request and the acquisition of the dict_sys->mutex (row_mysql_lock_data_dictionary(trx)). This unnecessarily hurts rollback performance on 5.0+.
[24 Jun 2010 21:23] Mark Callaghan
Can someone update this with the URL for a commit assuming the commit has been done?
[24 Jun 2010 21:27] Sunny Bains
Mark,

Running some last minute tests on 5.1. Once I push the fix you should see the commit email.

Regards,
-sunny
[25 Jun 2010 3:31] Sunny Bains
The change committed only addresses the row0undo.c issue. The second issue
raised is why ha_innobase::info() requires an X lock. We acquire it to avoid a race
condition as the comment suggests:

                       /* lock the data dictionary to avoid races with
                        ibd_file_missing and tablespace_discarded */
                        row_mysql_lock_data_dictionary(prebuilt->trx);

                        /* ib_table->space must be an existent tablespace */
                        if (!ib_table->ibd_file_missing
                            && !ib_table->tablespace_discarded) {

I think acquiring only the dict_sys->mutex here should be sufficient. This change
will not block other threads (particularly the purge thread) and will avoid the potential
race because in all places where we update ibd_file_missing and tablespace_discarded
we always acquire an X lock, which implies that the updater also holds the dict_sys->mutex.
[5 Jul 2010 2:38] Sunny Bains
Marko has a better solution for ha_innobase::info():

"As far as I can tell, after dict_load_table() ibd_file_missing can become TRUE in row0mysql.c, in row_truncate_table_for_mysql() and row_discard_tablespace_for_mysql(). The flag is read without protection in many places. The field tablespace_discarded is only set TRUE by row_discard_tablespace_for_mysql() while holding the dict_sys->mutex.

It seems to me that even this acquisition of the dict_sys->mutex is an overkill. It would be simpler to make fsp_get_available_space_in_free_extents() tolerate a missing tablespace. Just call it, no matter what the ib_table flags are, and complain if it fails."
[21 Jul 2010 23:52] Sunny Bains
Upon investigation the problem is not as simple as Marko makes it out to be and it reveals the
reason why the dict lock solution was (probably) implemented in the first place. The problem is:

  1. The table space is freed in fil_space_free() and is covered only by fil_system->mutex.

  2. In fsp_get_available_space_in_free_extents() we need to hold the above mutex
      for the duration of the mtr().

  3. We can't just check for a valid tablespace on space id and then do the work in
      the mtr. We have to ensure that the tablespace can't be deleted after we've
      checked that it exists and is valid.
[26 Aug 2010 22:32] Mark Callaghan
I don't mind the lock, I just want to do it in share mode. Is there a diff for that?
[28 Sep 2010 8:49] Bugs System
Pushed into mysql-5.1 5.1.52 (revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (version source revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (merge vers: 5.1.52) (pib:21)
[28 Sep 2010 15:40] Bugs System
Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100928153607-tdsxkdm5cmuym5sq) (version source revid:alik@sun.com-20100928153508-0saa6v93dinqx1u7) (merge vers: 5.6.1-m4) (pib:21)
[28 Sep 2010 15:42] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100928153646-pqp8o1a92mxtuj3h) (version source revid:alik@sun.com-20100928153532-lr3gtvnyp2en4y75) (pib:21)
[28 Sep 2010 15:45] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:alik@sun.com-20100928153459-4nudf4zgzlou4s7q) (version source revid:alik@sun.com-20100928153459-4nudf4zgzlou4s7q) (merge vers: 5.5.7-rc) (pib:21)
[1 Oct 2010 18:15] Mark Callaghan
Can you update this with a reference to the original commit? 5.5, 5.1 and mysql-next-mr histories on launchpad are mostly merges from other branches that have the original commits.
[6 Oct 2010 18:05] John Russell
Added to change log:

Changed the locking mechanism for the InnoDB data dictionary during
ROLLBACK operations, to improve concurrency for REPLACE statements.
[14 Oct 2010 8:31] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (merge vers: 5.1.51-ndb-7.0.20) (pib:21)
[14 Oct 2010 8:47] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (merge vers: 5.1.51-ndb-6.3.39) (pib:21)
[14 Oct 2010 9:01] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (merge vers: 5.1.51-ndb-6.2.19) (pib:21)
[5 Dec 2010 12:40] Bugs System
Pushed into mysql-trunk 5.6.1 (revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (version source revid:alexander.nozdrin@oracle.com-20101205122447-6x94l4fmslpbttxj) (merge vers: 5.6.1) (pib:23)
[15 Dec 2010 5:53] Bugs System
Pushed into mysql-5.1 5.1.55 (revid:sunanda.menon@oracle.com-20101215054055-vgwki317xg1wphhh) (version source revid:sunanda.menon@oracle.com-20101215054055-vgwki317xg1wphhh) (merge vers: 5.1.55) (pib:23)
[16 Dec 2010 22:29] Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)