Bug #51557 LOCK_open and kernel_mutex are not happy together
Submitted: 26 Feb 2010 15:41 Modified: 8 Jul 2010 17:38
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.0.84 OS:Any
Assigned to: Konstantin Osipov
Tags: Contribution, handler, innodb, kernel_mutex, LOCK_open, open, performance, table
Triage: Triaged: D2 (Serious)

[26 Feb 2010 15:41] Mark Callaghan
Description:
This thread:
1) gets LOCK_open while opening a table handler 
2) starts to gather per-table stats for InnoDB table
3) blocks on kernel_mutex

Another thread has kernel_mutex doing InnoDB deadlock detection.

Hundreds of threads then get stuck on LOCK_open. It is a bad idea to gather per-table InnoDB stats when LOCK_open is held. That should be deferred or LOCK_open should be released.

#3  mutex_spin_wait (mutex=0x2aaaac3232b8,
        file_name=0x8b3bf7 "trx0trx.c", line=220) at sync0sync.c:565
#4  trx_allocate_for_mysql
#5  ha_innobase::allocate_trx
#6  check_trx_exists
#7  ha_innobase::info
#8  ha_innobase::open
#9  handler::ha_open
#10 openfrm
#11 open_unireg_entry
#12 open_table
#13 open_tables
#14 open_and_lock_tables
#15 mysql_insert

How to repeat:
open table handler instances on a busy server
watch for pileups

Suggested fix:
Defer gathering of table stats until after LOCK_open is released. A simple fix might be to add ::info call later in the open_table/open_tables code
[26 Feb 2010 15:42] Mark Callaghan
See http://mituzas.lt/2010/02/26/opening-tables-v2 and http://mituzas.lt/2009/12/26/opening-tables/
[26 Feb 2010 18:14] Domas Mituzas
Antony says:

"There is an old patch (you can search for it on the bk commits list) which safely releases the LOCK_open mutex before calling the storage engines ::open method. The patch was primarily for fixing DOS scenerio when using the Federated storage engine but it would be equally applicable in this case."

May be useful!
[26 Feb 2010 19:22] Davi Arnaut
WL#3983
[27 Mar 2010 12:13] Konstantin Osipov
http://lists.mysql.com/commits/103808
[27 Mar 2010 15:09] Domas Mituzas
Kostja, can that behavior be merged into 5.1 too?
[27 Mar 2010 15:11] Domas Mituzas
(I mean, is it safe, if we merge it ourselves? :)
[28 Mar 2010 8:49] Konstantin Osipov
No it depends on the architecture changes done by the new MDL in 5.5.
If you promise to never use DDL under LOCK TABLES (inc. REPAIR) it can work in 5.1, otherwise it'll crash.
[28 Mar 2010 8:51] Konstantin Osipov
I was merely discussing the approach, the patch won't apply to 5.1.
[28 Mar 2010 9:01] Domas Mituzas
ah, thanks.
[21 Apr 2010 0:36] Jimmy Yang
I agree with Mark in this aspect. A more reasonable fix for 5.1 will be calling ha_innobase::info() with HA_STATUS_TIME, HA_STATUS_NO_LOCK, HA_STATUS_VARIABLE, HA_STATUS_CONST after the table opening and releasing of the LOCK_open lock. And we will avoid index stats collection and other info activity in ha_innobase::open itself.
[23 Apr 2010 20:46] Mark Callaghan
This is probably a duplicate of http://bugs.mysql.com/bug.php?id=49463
I don't know why I reported it twice.
[23 Apr 2010 22:17] Mark Callaghan
Patch for 5.1 at
http://bazaar.launchpad.net/~mysqlatfacebook/mysqlatfacebook/5.1/revision/3470

We can contribute this. I don't know if it is relevant given the changes in 5.5
[15 Jun 2010 8:19] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:36] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)
[15 Jun 2010 15:15] Konstantin Osipov
Queued into 5.5.5 runtime, tagged 5.5.5
3054 Konstantin Osipov	2010-06-11
      WL#5419 "LOCK_open scalability: make tdc_refresh_version 
      an atomic counter"
      
      Split the large LOCK_open section in open_table(). 
      Do not call open_table_from_share() under LOCK_open.
      Remove thd->version.
      
      This fixes
      Bug#50589 "Server hang on a query evaluated using a temporary 
      table"
      Bug#51557 "LOCK_open and kernel_mutex are not happy together"
      Bug#49463 "LOCK_table and innodb are not nice when handler 
      instances are created".
      
      This patch has effect on storage engines that rely on
      ha_open() PSEA method being called under LOCK_open.
      In particular:
      
      1) NDB is broken and left unfixed. NDB relies on LOCK_open
      being kept as part of ha_open(), since it uses auto-discovery.
      While previously the NDB open code was race-prone, now
      it simply fails on asserts.
      
      2) HEAP engine had a race in ha_heap::open() when
      a share for the same table could be added twice
      to the list of shares, or a dangling reference to a share
      stored in HEAP handler. This patch aims to address this
      problem by 'pinning' the newly created share in the 
      internal HEAP engine share list until at least one
      handler instance is created using that share.
     @ include/heap.h
        Add members to HP_CREATE_INFO.
        Declare heap_release_share().
     @ sql/lock.cc
        Remove thd->version, use thd->open_tables->s->version instead.
     @ sql/repl_failsafe.cc
        Remove thd->version.
     @ sql/sql_base.cc
        - close_thread_table(): move handler cleanup code outside the critical section
protected by LOCK_open.
        - remove thd->version
        - split the large critical section in open_table() that
        opens a new table from share and is protected by LOCK_open
        into 2 critical sections, thus reducing the critical path.
        - make check_if_table_exists() acquire LOCK_open internally.
        - use thd->open_tables->s->version instead of thd->refresh_version to make sure
that all tables in
        thd->open_tables are in the same refresh series.
     @ sql/sql_base.h
        Add declaration for check_if_table_exists().
     @ sql/sql_class.cc
        Remove init_open_tables_state(), it's now equal to
        reset_open_tables_state().
     @ sql/sql_class.h
        Remove thd->version, THD::init_open_tables_state().
     @ sql/sql_plugin.cc
        Use table->m_needs_reopen to mark the table as stale
        rather than manipulate with thd->version, which is no more.
     @ sql/sql_udf.cc
        Use table->m_needs_reopen to mark the table as stale
        rather than manipulate with thd->version, which is no more.
     @ sql/table.h
        Remove an unused variable.
     @ sql/tztime.cc
        Use table->m_needs_reopen to mark the table as stale
        rather than manipulate with thd->version, which is no more.
     @ storage/heap/CMakeLists.txt
        Add heap tests to cmake build files.
     @ storage/heap/ha_heap.cc
        Fix a race when ha_heap::ha_open() could insert two 
        HP_SHARE objects into the internal share list or store
        a dangling reference to a share in ha_heap instance,
        or wrongly set implicit_emptied.
     @ storage/heap/hp_create.c
        Optionally pin a newly created share in the list of shares
        by increasing its open_count. This is necessary to 
        make sure that a newly created share doesn't disappear while
        a HP_INFO object is being created to reference it.
     @ storage/heap/hp_open.c
        When adding a new HP_INFO object to the list of objects
        in the heap share, make sure the open_count is not increased
        twice.
     @ storage/heap/hp_test1.c
        Adjust the test to new function signatures.
     @ storage/heap/hp_test2.c
        Adjust the test to new function signatures.
[22 Jun 2010 13:08] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100622130139-u05awgya93zvbsop) (version source revid:marko.makela@oracle.com-20100603095032-v5ptkkzt1bhz0m1d) (merge vers: 5.1.48) (pib:16)
[22 Jun 2010 13:10] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100622130623-r7yhm89fz9n5t9nb) (version source revid:alik@sun.com-20100622130528-187gd949sa9b6pa6) (pib:16)
[8 Jul 2010 16:11] Paul Dubois
Noted in 5.5.5 changelog.

Previously, the server held a global mutex while performing file
operations such as deleting an .frm or data file, or reading table
metadata from an .frm file or index statistics from a data file. Now
the mutex is not held for these operations. Instead, the server uses
metadata locks.
[8 Jul 2010 17:38] Mark Callaghan
=D>

The applause emoticon according to http://www.muller-godschalk.com/yahoo.html
[13 May 2011 13:31] Paul Dubois
Correction to changelog entry: Remove "reading table metadata from an .frm file"