Description:
When filter rules are in effect on a slave, there is an inflation of time accounted for within the "System Lock" stage, which is purely supposed to be for locking tables. This can clearly be seen within Performance Schema, for instance:
mysql> select event_name, timer_wait from events_stages_history_long where thread_id = 30;
+--------------------------------------------+------------+
| EVENT_NAME | TIMER_WAIT |
+--------------------------------------------+------------+
| stage/sql/init | 22066000 |
| stage/sql/System lock | 92186000 | <<-- normal time
| stage/sql/updating | 107064000 |
| stage/sql/end | 4824000 |
| stage/sql/query end | 2302000 |
| stage/sql/closing tables | 6024000 |
| stage/sql/freeing items | 5429000 |
| stage/sql/Reading event from the relay log | 14015000 |
| stage/sql/Opening tables | 9299000 |
| stage/sql/System lock | 650642000 | <<-- inflated time
| stage/sql/Reading event from the relay log | 15994000 |
...
| stage/sql/Reading event from the relay log | 9982000 |
| stage/sql/Opening tables | 9900000 |
| stage/sql/System lock | 1923939000 | <<-- inflated time
| stage/sql/Reading event from the relay log | 15857000 |
...
| stage/sql/Reading event from the relay log | 6355000 |
| stage/sql/init | 37904000 |
| stage/sql/checking permissions | 2275000 |
| stage/sql/Opening tables | 11326000 |
| stage/sql/init | 15904000 |
| stage/sql/System lock | 30497000 | <<-- normal time
| stage/sql/updating | 81414000 |
| stage/sql/end | 4791000 |
| stage/sql/query end | 2149000 |
| stage/sql/closing tables | 5753000 |
| stage/sql/freeing items | 5001000 |
| stage/sql/Reading event from the relay log | 10638000 |
| stage/sql/Opening tables | 7668000 |
| stage/sql/System lock | 697293000 | <<-- inflated time
| stage/sql/Reading event from the relay log | 16016000 |
...
Note that every time there is an inflated time against the System lock stage, the thread immediately goes back to reading an event from the relay log (i.e, the current event was ignored and skipped).
How to repeat:
* Set up replication with filtering on a specific table
* Run replication traffic for both the filtered, and some non filtered tables
* Monitor events_stages_history_long for the replication SQL thread specifically
* See, as above, the inflated times for what should be filtered events
Suggested fix:
Set the thread state to something else after the System lock state, before checking filtering rules (suggest adding a new "Checking replication filters" stage).