Bug #97825 dd_mdl_acquire in dd_table_open with dict_sys->mutex hold may cause deadlock
Submitted: 28 Nov 2019 9:27 Modified: 9 Mar 2020 17:07
Reporter: dave do Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.18 OS:Any
Assigned to: CPU Architecture:Any

[28 Nov 2019 9:27] dave do
Description:
dict_table_t *dd_table_open_on_name(THD *thd, MDL_ticket **mdl,
                                    const char *name, bool dict_locked,
                                    ulint ignore_err) {
...

  if (!skip_mdl && dd_mdl_acquire(thd, mdl, db_buf, tbl_buf)) {
    return nullptr;
  }

  if (!dict_locked) {
    mutex_enter(&dict_sys->mutex);
  }

We should not acquire MDL if we hold dict_sys->mutex(dict_locked is true), because of MDL which belongs to server layer is more high-level than dict_sys->mutex.

If there is a concurrent thread hold an exclusive MDL with this table then acquire dict_sys->mutex, there will be a deadlock.

Compare to dd_table_open_on_id, dict_sys->mutex released before dd_table_open_on_id_low called, because MDL may acquire in dd_table_open_on_id_low.

How to repeat:
ditto.

Suggested fix:
Release dict_sys->mutex before dd_mdl_acquire and lock again after that.
[28 Nov 2019 13:45] MySQL Verification Team
Hello Mr. do,

Thank you for your bug report.

However, we do not think that this is a bug. 

MDL works at one level, regardless of which storage engine i used, while InnoDB locks work only on storage engine level. Hence, as long as our developers stick to the discipline of the order in which each lock is taken, there should be no deadlocks between the two.

If you manage to reproduce a deadlock between MDL and mutex locks, we would then reconsider your report.
[28 Nov 2019 14:21] dave do
Hi Sinisa,

As you mentioned, the code should stick to the discipline of the order in which each lock is taken. So let's say if a table is to be opened, first the MDL on this table should be taken, then it dives into InnoDB, and InnoDB probably take the dict_sys mutex. This is a correct locking order. However, in above dd_table_open_on_name(), if dict_locked is true, which means dict_sys mutex is now being held, it is really dangerous to call dd_mdl_acquire(), which acquires the MDL. This indeed break the locking order. If we go through other calls to dd_mdl_acquire() in dict0dd.cc, we will find out that this function always gets called after releasing the dict_sys mutex. Please re-consider this bug, and hopefully fix it if it is confirmed.

Thanks!
[28 Nov 2019 16:24] MySQL Verification Team
I see no problem in what you report.

The outcome of the your scenario is that MDL will be taken and mutex lock will be free. That is expected behaviour.

If you want to know why MDLs have been introduced, read carefully bug #989.
[2 Dec 2019 4:28] dave do
Hi Sinisa,

Maybe i didn't make the issue clear enough, let me try again.

First, my concern is the lock order in dd_table_open_on_name is not correct, or let`s say it is different from other similar functions in the same source file.

Second, there is some examples and lock order analysis of the function similar with dd_table_open_on_name in dict0dd.cc.

eg1)

4278 static void dd_open_table_one_on_name(const char *name, bool dict_locked,
4279                                       dict_names_t &fk_list, THD *thd) {
4280   dict_table_t *table = nullptr;
4281   const dd::Table *dd_table = nullptr;
4282   MDL_ticket *mdl = nullptr;
4283
4284   if (!dict_locked) {
4285     mutex_enter(&dict_sys->mutex);		==========> make the dict_sys->mutex locked anyway.
4286   }
4287
4288   table = dict_table_check_if_in_cache_low(name);
4289
4290   if (table != NULL) {
4291     /* If the table is in cached already, do nothing. */
4292     if (!dict_locked) {
4293       mutex_exit(&dict_sys->mutex);
4294     }
4295
4296     return;
4297   } else {
4298     /* Otherwise, open it by dd obj. */
4299     char db_buf[MAX_DATABASE_NAME_LEN + 1];
4300     char tbl_buf[MAX_TABLE_NAME_LEN + 1];
4301
4302     /* Exit sys mutex to access server info */
4303     mutex_exit(&dict_sys->mutex);		==========> make the dict_sys->mutex unlocked before dd_mdl_acquire.
4304
4305     if (!dd_parse_tbl_name(name, db_buf, tbl_buf, nullptr, nullptr, nullptr)) {
4306       goto func_exit;
4307     }
4308
4309     if (dd_mdl_acquire(thd, &mdl, db_buf, tbl_buf)) {
4310       goto func_exit;
4311     }
4312
4313     dd::cache::Dictionary_client *client = dd::get_dd_client(thd);
4314     dd::cache::Dictionary_client::Auto_releaser releaser(client);
4315
4316     if (client->acquire(db_buf, tbl_buf, &dd_table) || dd_table == nullptr) {
4317       goto func_exit;
4318     }
	...
4350 func_exit:
4351   if (table != NULL) {
4352     dd_table_close(table, thd, &mdl, false);
4353   } else {
4354     dd_mdl_release(thd, &mdl);
4355   }
4356
4357   if (dict_locked) {
4358     mutex_enter(&dict_sys->mutex);		==========> make the dict_sys->mutex re-locked if it is locked before call this function.
4359   }
4360 }

eg2)
 611 dict_table_t *dd_table_open_on_id(table_id_t table_id, THD *thd,
 612                                   MDL_ticket **mdl, bool dict_locked,
 613                                   bool check_corruption) {
 614   dict_table_t *ib_table;
 615   const ulint fold = ut_fold_ull(table_id);
 616   char db_buf[MAX_DATABASE_NAME_LEN + 1];
 617   char tbl_buf[MAX_TABLE_NAME_LEN + 1];
 618   char full_name[MAX_FULL_NAME_LEN + 1];
 619
 620   if (!dict_locked) {
 621     mutex_enter(&dict_sys->mutex);		==========> make the dict_sys->mutex locked anyway.
 622   }
 623
	...
 627 reopen:
 628   if (ib_table == nullptr) {
 629 #ifndef UNIV_HOTBACKUP
 630     if (dict_table_is_sdi(table_id)) {

 653     } else {
 654       mutex_exit(&dict_sys->mutex);		==========> make the dict_sys->mutex unlocked because dd_mdl_acquire will called in dd_table_open_on_id_low (L446 dict0dd.cc).
 655
 656       ib_table = dd_table_open_on_id_low(thd, mdl, table_id);
 657
 658       if (dict_locked) {
 659         mutex_enter(&dict_sys->mutex);
 660       }
 661     }
	...
 678   } else {
 679     for (;;) {
 680 #ifndef UNIV_HOTBACKUP
 681       bool ret = dd_parse_tbl_name(ib_table->name.m_name, db_buf, tbl_buf,
 682                                    nullptr, nullptr, nullptr);
 683 #endif /* !UNIV_HOTBACKUP */
 684
 685       memset(full_name, 0, MAX_FULL_NAME_LEN + 1);
 686
 687       strcpy(full_name, ib_table->name.m_name);
 688
 689       ut_ad(!ib_table->is_temporary());
 690
 691       mutex_exit(&dict_sys->mutex);		==========> make the dict_sys->mutex unlocked before dd_mdl_acquire.
 692
 693 #ifndef UNIV_HOTBACKUP
 694       if (ret == false) {
 695         if (dict_locked) {
 696           mutex_enter(&dict_sys->mutex);
 697         }
 698         return (nullptr);
 699       }
 700
 701       if (dd_mdl_acquire(thd, mdl, db_buf, tbl_buf)) {
 702         if (dict_locked) {
 703           mutex_enter(&dict_sys->mutex);		==========> make the dict_sys->mutex re-locked if it is locked before call this function.
 704         }
 705         return (nullptr);
 706       }
 707 #endif /* !UNIV_HOTBACKUP */
 708
 709       /* Re-lookup the table after acquiring MDL. */
 710       mutex_enter(&dict_sys->mutex);		==========> make the dict_sys->mutex re-locked if it is locked before call this function.
 711
 712       HASH_SEARCH(id_hash, dict_sys->table_id_hash, fold, dict_table_t *,
 713                   ib_table, ut_ad(ib_table->cached), ib_table->id == table_id);
 714

We can find that all of them will release dict_sys->mutex before dd_mdl_acquire.
And from the comments of the code ( /* Exit sys mutex to access server info */, /* Re-lookup the table after acquiring MDL. */ ), we can see some clue of the author`s thinking about the lock order between MDL(server layer lock) and dict_sys->mutex(storage engine layer lock).

Since the locking order in above examples is always MDL->dict_sys, and this is consistent with how server access a table as I mentioned in previous comment, it looks the locking order in dd_table_open_on_name violates the orders in other functions, which can be obviously and easily observed by reading the code. From the design point of view, this is somehow odd, and in practice, we ran into a deadlock which is caused by this unexpected locking order(please to my comment in Bug #95743). I would really appreciate if this bug can be re-evaluated.
[2 Dec 2019 13:34] MySQL Verification Team
Hi Mr. do,

Lock order could be different, but it is irrelevant, since these are locks on totally different levels and can't, in themselves, cause any deadlock.
[20 Jan 2020 8:57] Marc ALFF
Bug report verified by the LOCK ORDER tool.

There is indeed a problematic dependency, detected as:
  ARC FROM "mutex/innodb/dict_sys_mutex" TO "prlock/sql/MDL_lock::rwlock" OP "W"

This report can not be closed as "not a bug" by claiming that locks are independent and taken in independent layers, because this is precisely what the report is about: there is a dependency from the innodb layer back to the sql layer, that breaks the general flow of sql -> innodb.
[9 Mar 2020 17:07] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 8.0.20 release, and here's the proposed changelog entry from the documentation team:

A data dictionary table open function was implemented with incorrect lock
ordering.
[10 Mar 2020 13:01] MySQL Verification Team
Thank you, Daniel.