Bug #91548 LOCK_grant and LOCK_open can deadlock on a gtid slave
Submitted: 4 Jul 2018 15:39 Modified: 30 Nov 2018 3:54
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Locking Severity:S1 (Critical)
Version:5.7.22 OS:Any
Assigned to: CPU Architecture:Any

[4 Jul 2018 15:39] Shane Bester
Description:
Two threads deadlock on the slave, their stacks are :

1. Slave SQL Thread running a CREATE USER command and trying to save gtid:

 # Call Site
00 ntdll!ZwWaitForSingleObject+0xa
01 ntdll!RtlpWaitOnCriticalSection+0xe8
02 ntdll!RtlEnterCriticalSection+0xd1
03 mysqld!native_mutex_lock+0x10 [.\include\thr_mutex.h @ 81]
04 mysqld!my_mutex_lock+0x10 [.\include\thr_mutex.h @ 182]
05 mysqld!inline_mysql_mutex_lock+0x3e [.\include\mysql\psi\mysql_thread.h @ 711]
06 mysqld!check_if_table_exists+0x66 [.\sql\sql_base.cc @ 2655]
07 mysqld!open_table+0x4a0 [.\sql\sql_base.cc @ 3238]
08 mysqld!open_and_process_table+0x1dd [.\sql\sql_base.cc @ 5144]
09 mysqld!open_tables+0x221 [.\sql\sql_base.cc @ 5753]
0a mysqld!open_and_lock_tables+0x44 [.\sql\sql_base.cc @ 6475]
0b mysqld!open_n_lock_single_table+0x3a [.\sql\sql_base.cc @ 6317]
0c mysqld!open_n_lock_single_table+0x27 [.\sql\sql_base.h @ 469]
0d mysqld!System_table_access::open_table+0x145 [.\sql\rpl_table_access.cc @ 57]
0e mysqld!Gtid_table_access_context::init+0xd5 [.\sql\rpl_gtid_persist.cc @ 192]
0f mysqld!Gtid_table_persistor::save+0xf4 [.\sql\rpl_gtid_persist.cc @ 406]
10 mysqld!Gtid_state::save+0x22 [.\sql\rpl_gtid_state.cc @ 714]
11 mysqld!commit_owned_gtids+0x61 [.\sql\handler.cc @ 1592]
12 mysqld!commit_owned_gtid_by_partial_command+0x1a [.\sql\handler.cc @ 1622]
13 mysqld!mysql_create_user+0x571 [.\sql\auth\sql_user.cc @ 1491]
14 mysqld!mysql_execute_command+0x21ed [.\sql\sql_parse.cc @ 3950]
15 mysqld!mysql_parse+0x296 [.\sql\sql_parse.cc @ 5560]
16 mysqld!Query_log_event::do_apply_event+0x966 [.\sql\log_event.cc @ 4790]
17 mysqld!Log_event::apply_event+0x2a3 [.\sql\log_event.cc @ 3473]
18 mysqld!apply_event_and_update_pos+0x17e [.\sql\rpl_slave.cc @ 4719]
19 mysqld!exec_relay_log_event+0x1f1 [.\sql\rpl_slave.cc @ 5268]
1a mysqld!handle_slave_sql+0x975 [.\sql\rpl_slave.cc @ 7425]
1b mysqld!pfs_spawn_thread+0x132 [.\storage\perfschema\pfs.cc @ 2193]
1c mysqld!win_thread_start+0x1c [.\mysys\my_thread.c @ 38]

2.  end-user connection querying i_s:

 # Call Site
00 ntdll!ZwWaitForKeyedEvent+0xa
01 ntdll! ?? ::FNODOBFM::`string'+0x9837
02 mysqld!native_rw_rdlock+0xc [.\include\thr_rwlock.h @ 76]
03 mysqld!inline_mysql_rwlock_rdlock+0x37 [.\include\mysql\psi\mysql_thread.h @ 895]
04 mysqld!Partitioned_rwlock::rdlock+0x4f [.\sql\auth\partitioned_rwlock.h @ 79]
05 mysqld!Partitioned_rwlock_read_guard::{ctor}+0x15 [.\sql\auth\partitioned_rwlock.h @ 113]
06 mysqld!LOCK_grant_read_guard::{ctor}+0x1b [.\sql\auth\sql_authorization.cc @ 548]
07 mysqld!get_column_grant+0x48 [.\sql\auth\sql_authorization.cc @ 2705]
08 mysqld!get_schema_column_record+0x238 [.\sql\sql_show.cc @ 5372]
09 mysqld!fill_schema_table_from_frm+0x466 [.\sql\sql_show.cc @ 4408]
0a mysqld!get_all_tables+0x4da [.\sql\sql_show.cc @ 4713]
0b mysqld!do_fill_table+0x89 [.\sql\sql_show.cc @ 8148]
0c mysqld!get_schema_tables_result+0x240 [.\sql\sql_show.cc @ 8279]
0d mysqld!JOIN::prepare_result+0x91 [.\sql\sql_select.cc @ 908]
0e mysqld!JOIN::exec+0x15c [.\sql\sql_executor.cc @ 124]
0f mysqld!handle_query+0x202 [.\sql\sql_select.cc @ 185]
10 mysqld!execute_sqlcom_select+0x226 [.\sql\sql_parse.cc @ 5128]
11 mysqld!mysql_execute_command+0x707 [.\sql\sql_parse.cc @ 2814]
12 mysqld!mysql_parse+0x296 [.\sql\sql_parse.cc @ 5560]
13 mysqld!dispatch_command+0x6e4 [.\sql\sql_parse.cc @ 1486]
14 mysqld!do_command+0x1ea [.\sql\sql_parse.cc @ 1026]
15 mysqld!handle_connection+0x12c [.\sql\conn_handler\connection_handler_per_thread.cc @ 300]
16 mysqld!pfs_spawn_thread+0x132 [.\storage\perfschema\pfs.cc @ 2193]
17 mysqld!win_thread_start+0x1c [.\mysys\my_thread.c @ 38]
18 ucrtbase!crt_at_quick_exit+0x7d

Here's a sample cut processlist, also showing new connections are stalling ....

| Time | State                | Info
+------+----------------------+----------------------------------------
|  496 |                      | NULL
|  867 | Opening tables       | NULL
|  519 | Opening tables       | CREATE USER IF NOT EXISTS 'myuser'@'%'
|  519 | checking permissions | select * from information_schema.column
|  488 | starting             | SET NAMES utf8
|  488 | starting             | SET NAMES utf8
|  488 | starting             | SET NAMES utf8
|  488 | starting             | SET NAMES utf8
|  458 | starting             | SET NAMES utf8
|  458 | starting             | SET NAMES utf8
|  458 | starting             | SET NAMES utf8
|  427 | starting             | SET NAMES utf8

How to repeat:
Configuration
-------------
Setup ordinary gtid master.
Slave runs with options:
 --sync_master_info=1 --sync_relay_log=1 --sync_relay_log_info=1
 --loose-enforce-gtid-consistency=ON --loose-master_info_repository=TABLE
 --loose-relay_log_info_repository=TABLE  --loose-gtid-mode=ON

To repeat:
----------
Execute on master in a loop:
  drop user if exists myuser;
  create user if not exists myuser;

Create some fake tables as needed.. the more the better.

On slave, connect and query the i_s in a loop, multiple threads if needed:
  select * from information_schema.columns order by table_schema,table_name;
[5 Jul 2018 12:26] Shane Bester
I couldn't seem to reproduce this on 8.0.11.
[30 Nov 2018 3:54] Paul Dubois
Posted by developer:
 
Fixed in 5.7.25, 8.0.14.

On a GTID-enabled server, concurrent statements on the
INFORMATION_SCHEMA.COLUMNS table could deadlock.