Bug #101809 "DROP TRIGGER IF EXISTS” in concurrent sessions leads to an error
Submitted: 1 Dec 2020 2:11 Modified: 8 Dec 2020 15:38
Reporter: Akshay Suryawanshi (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DDL Severity:S3 (Non-critical)
Version:5.7, 5.7.32 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution, Drop, IF EXISTS, trigger

[1 Dec 2020 2:11] Akshay Suryawanshi
Description:
If more than one `DROP TRIGGER IF EXISTS` statements are executed at the same time for the same table, one or more of the statements may return a MySQL error “ERROR 1360 (HY000): Trigger does not exist” instead of succeeding with a warning as usual.
 
Code examination reveals that usage of the “IF EXISTS” clause and existence of the actual trigger is only checked without holding a lock through the whole DROP TRIGGER command, following the add_table_for_trigger call around line 143 of sql/sql_trigger.cc, in mysql_create_or_drop_trigger:

123:   if (!create)
124:   {
125:     bool if_exists= thd->lex->drop_if_exists;
...
143:     if (add_table_for_trigger(thd,
144:                               thd->lex->spname->m_db,
145:                               thd->lex->spname->m_name,
146:                               if_exists, & tables))
147:       goto end;
148: 
149:     if (!tables)
150:     {
151:       DBUG_ASSERT(if_exists);
152:       /*
153:         Since the trigger does not exist, there is no associated table,
154:         and therefore :
155:         - no TRIGGER privileges to check,
156:         - no trigger to drop,
157:         - no table to lock/modify,
158:         so the drop statement is successful.
159:       */
160:       result= FALSE;
161:       /* Still, we need to log the query ... */
162:       stmt_query.append(thd->query().str, thd->query().length);
163:       goto end;
164:     }
165:   }
166: 

However, if more than one concurrent thread executes a “DROP TRIGGER IF EXISTS” statement for the same trigger because no lock is held before checking the existence of the trigger, multiple threads may pass this condition. They will then race to acquire the MDL lock, where the first thread to successfully acquire it deletes the trigger, and any other threads will find that the trigger is already deleted, causing an ER_TRG_DOES_NOT_EXIST error to be generated around line 235 of sql/sql_trigger.cc:

File: sql_trigger.cc
231:   if (!table->triggers)
232:   {
233:     if (!create)
234:     {
235:       my_error(ER_TRG_DOES_NOT_EXIST, MYF(0));
236:       goto end;
237:     }
238: 

This problem is more dangerous when using Multi-threaded Slave replication as the chances of multiple parallel worker threads executing the “DROP TRIGGER IF EXISTS” statements concurrently are more likely, and when this occurs it leads to replication being broken.

Note: There is a possibility that similar racy code paths exist in other places where the “IF EXISTS” clause applies to DDLs, but we have not examined the other code paths yet.

How to repeat:
First, start by creating a test trigger, so that it can be dropped:

Session 1
root@mysql> CREATE TRIGGER ins_employees BEFORE INSERT ON employees for each row set new.emp_no = emp_no + 1;
Query OK, 0 rows affected (0.01 sec)

Second, we acquire the MDL lock using LOCK TABLES:

Session 1
root@mysql> LOCK TABLES employees READ;
Query OK, 0 rows affected (0.00 sec)

Next, in another two sessions, execute the same DROP TRIGGER statement, each will wait to check that the trigger does exist and then wait on the MDL lock:

Session 2
root@mysql> DROP TRIGGER IF EXISTS ins_employees;

Session 3
root@mysql> DROP TRIGGER IF EXISTS ins_employees;

Finally, release the MDL lock acquired in session 1:

Session 1
root@mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)

Immediately,  Session 2 and 3 will be released and the statement in Session 3 returns an error, even though it should have been OK with a warning just as session 2:

Session 2
root@mysql> DROP TRIGGER IF EXISTS ins_employees;
Query OK, 0 rows affected (3.52 sec)

Session 3
root@mysql> DROP TRIGGER IF EXISTS ins_employees;
ERROR 1360 (HY000): Trigger does not exist

Suggested fix:
A simple fix would be to add another check for the “IF EXISTS” clause after acquiring the MDL lock, before returning the error with my_error(). The warning which should be returned has at this point already been pushed, so this will not introduce any behavioral changes.
[1 Dec 2020 2:12] Akshay Suryawanshi
Patch for this bug request

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: 0001-Drop-trigger-if-exists-error-fix.patch (application/octet-stream, text), 1.31 KiB.

[1 Dec 2020 8:07] MySQL Verification Team
Hello Akshay,

Thank you for the report and contribution.
Verified as described with 5.7.32 build.

regards,
Umesh
[7 Dec 2020 10:40] Dmitry Lenev
Posted by developer:
 
Hello!

The problem is not repeatable in 8.0 branch of server because it was
fixed as part of New Data Dictionary work by introducing proper locking
on trigger names. This was done by the following patch:

commit 7d09dd19cc8119d42862bfe7a7e0b5f17735cdce
Author: Dmitry Shulga <dmitry.shulga@oracle.com>
Date:   Tue Aug 9 21:37:52 2016 +0600

    This is implementation of WL#7896 - Use DD API to work with triggers.
    
    Within this WL the following changes were made:
      * the statements CREATE/DROP TRIGGER use the new dd api to store/retrieve
        triggers metadata from the data dictionary;
        ...
      * added support for MDL locking while handling CREATE/DROP TRIGGER;
        To support it the new standalone function acquire_mdl_for_trigger
        was added that is called right after the method open_and_lock_subj_table
        returns success. Also this function called indirectly (over call of
        the new function lock_trigger_names) as part of dropping a table;
        ...

which was pushed to 8.0.1 Development Milestone Release.

While the contributed patch for 5.7 alleviates the problem in the specific
case mentioned in bug report it still doesn't fully fix broader class of
similar scenarios. For example, proper fix should ensure that we have correct
order of events in binary log which requires some form of locking for trigger
names.

Taking into account the above we choose not to apply the contributed patch to
5.7 branch and close the bug as fixed in 8.0.1.
[8 Dec 2020 15:38] Paul DuBois
Posted by developer:
 
Fixed in 8.0.1 by the switch to the Data Dictionary.