Bug #75513 DROP TABLE creating a stall when foreign keys in use on other tables
Submitted: 15 Jan 2015 10:20 Modified: 8 Jan 2019 12:48
Reporter: Riccardo Pizzi Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.5.31 OS:Linux
Assigned to: CPU Architecture:Any

[15 Jan 2015 10:20] Riccardo Pizzi
Description:
We are facing severe performance problems when running DROP TABLE. Stalls going from few seconds to 30+ seconds have been experienced and after some digging it seems related to a shared dictionary lock needed for foreign keys constraint check.

During a drop table we see the following:

-Thread 139711064704768 has waited at row0ins.c line 1558 for 4.0000 seconds the semaphore:
S-lock on RW-latch at 0x1056520 '&dict_operation_lock'
a writer (thread id 139694324492032) has reserved it in mode  exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file row0ins.c line 1558
Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos6-64/target/BUILD/Percona-Server-5.5.31-rel30.3/storage/innobase/row/row0mysql.c line 3283

Thread id 139694324492032 is the thread dunning  drop table.

Looking at the source code, I see that an exclusive lock on data dictionary is taken when running  DROP TABLE.  At the same time, a shared lock on  data dictionary is needed by the foreign check routine. 

From row0ins.cc:

row_ins_check_foreign_constraints(
{
        dict_foreign_t* foreign;
        ulint           err;
        trx_t*          trx;
        ibool           got_s_lock      = FALSE;

        trx = thr_get_trx(thr);

        foreign = UT_LIST_GET_FIRST(table->foreign_list);

        while (foreign) {
                if (foreign->foreign_index == index) {

                        if (foreign->referenced_table == NULL) {
                                dict_table_get(foreign->referenced_table_name_lookup,
                                               FALSE);
                        }

                        if (0 == trx->dict_operation_lock_mode) {
                                got_s_lock = TRUE;

                                row_mysql_freeze_data_dictionary(trx);
                        }

As a result,  any row operation on any table using foreign keys stalls for the entire time of the DROP TABLE operation due to not being able to get this shared lock. 

I have checked code for 5.6.22 and looks like the issue is also there.

I know that most installations have dropped all foreign keys for performance reasons, however,  I don't understand why a dict_operation_lock is needed when checking constraints in row operations.

Unfortunately our large buffer pool causes DROP TABLE to take quite some time, and this issue around dict_operation_lock makes safely dropping a table impossible in production as it creates downtime on the application side.

How to repeat:
Create tables using foreign keys, load them with some data on a machine with 80G buffer pool. Ensure buffer pool is warm, then drop a table and look for stalls in innodb status.
[11 Dec 2017 0:52] haochen he
there is comment in src-5.7.17:
/* NOTE that if the thread ends up waiting for a lock
    we will release dict_operation_lock temporarily!
    But the counter on the table protects the referenced
    table from being dropped while the check is running. */
i dont know why waiting is necessary but it seems that it does.
[23 Nov 2018 12:56] MySQL Verification Team
Hi,

We are very well aware of the fact that exclusive lock is taken for the table during DROP TABLE and shared lock is taken for the checks being done for foreign keys. Hence, this is the only known solution that will maintain the integrity of both DML and DDL operations. Let us not forget that InnoDB SE is a fully ACID compliant engine.

Hence, we can not verify this bug, unless you come with a solution that would perform much faster for these situations.
[24 Dec 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".