Bug #119553 Mutex deadlock between THR_LOCK_servers and federated_mutex
Submitted: 15 Dec 14:00
Reporter: Marcos Albe (OCA)
Status: Open
Category: MySQL Server: Federated storage engine
Severity: S2 (Serious)
Version: 8.4.6
OS: Linux
CPU Architecture: x86
Assigned to:

[15 Dec 14:00] Marcos Albe
Description:
Lock/mutex acquisition order allows a circular dependency between THR_LOCK_servers and federated_mutex.  We obtained a core dump while the instance was stalled, from which we produced the following analysis:

### Thread 7665 ###

## Switch to thread 7665

(gdb) t 7665
[Switching to thread 7665 (Thread 0x7edd50d9d640 (LWP 433700))]
#0  futex_wait (private=0, expected=2, futex_word=0x3bb3940 <LOCK_open>) at ../sysdeps/nptl/futex-internal.h:146                                                          
146      int err = lll_futex_timed_wait (futex_word, expected, NULL, private);

## Get backtrace

(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x3bb3940 <LOCK_open>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x3bb3940 <LOCK_open>, private=0) at lowlevellock.c:50
#2  0x00007f123ca8d704 in ___pthread_mutex_lock (mutex=0x3bb3940 <LOCK_open>) at pthread_mutex_lock.c:154
#3  0x00000000009e633f in native_mutex_lock (mutex=0x3bb3940 <LOCK_open>) at ../../../include/thr_mutex.h:95
#4  my_mutex_lock (mp=0x3bb3940 <LOCK_open>) at ../../../include/thr_mutex.h:191
#5  inline_mysql_mutex_lock (that=0x3bb3940 <LOCK_open>, src_file=<optimized out>, src_line=<optimized out>)  at ../../../include/mysql/psi/mysql_mutex.h:257
#6  0x0000000000c65820 in open_table (thd=thd@entry=0x7ee474535950, table_list=table_list@entry=0x7ee47454b210, ot_ctx=ot_ctx@entry=0x7edd50d9a450)
…snip…

## Switch to frame 4 within the thread, where it is attempting to acquire the mutex

(gdb) f 4
#4  my_mutex_lock (mp=0x3bb3940 <LOCK_open>) at ../../../include/thr_mutex.h:191
191      return native_mutex_lock(&mp->m_u.m_native);

 
## Print the mutex to obtain the current owner

(gdb) p *mp
$1 = {m_u = {m_native = {__data = {__lock = 2, __count = 0, __owner = 291993, __nusers = 1, __kind = 3, __spins = 100, __elision = 0, __list = {__prev = 0x0,  __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\231t\004\000\001\000\000\000\003\000\000\000d", '\000' <repeats 18 times>, __align = 2},    m_safe_ptr = 0x2}}

 
## Locate the owner

(gdb) thread find 291993
Thread 290 has target id 'Thread 0x7ef84834e640 (LWP 291993)'

### Thread 290 ###

## Switch to thread 290, owner of LOCK_open

(gdb) t 290
[Switching to thread 290 (Thread 0x7ef84834e640 (LWP 291993))]
#0  futex_wait (private=0, expected=2, futex_word=0x3c1e2e0 <federated_mutex>) at ../sysdeps/nptl/futex-internal.h:146
146      int err = lll_futex_timed_wait (futex_word, expected, NULL, private);

 
## Obtain backtrace for this thread

(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0x3c1e2e0 <federated_mutex>) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0x3c1e2e0 <federated_mutex>, private=0) at lowlevellock.c:50
#2  0x00007f123ca8d704 in ___pthread_mutex_lock (mutex=0x3c1e2e0 <federated_mutex>) at pthread_mutex_lock.c:154
#3  0x0000000001ab1db4 in native_mutex_lock (mutex=0x3c1e2e0 <federated_mutex>) at ../../../include/thr_mutex.h:95
#4  my_mutex_lock (mp=0x3c1e2e0 <federated_mutex>) at ../../../include/thr_mutex.h:191
#5  inline_mysql_mutex_lock (src_line=1531, src_file=0x2dfc958 "../../../../storage/federated/ha_federated.cc", that=0x3c1e2e0 <federated_mutex>)    at ../../../include/mysql/psi/mysql_mutex.h:257
#6  free_share (share=0x7ee5180d9280) at ../storage/federated/../../../../storage/federated/ha_federated.cc:1531
#7  ha_federated::close (this=<optimized out>) at ../storage/federated/../../../../storage/federated/ha_federated.cc:1629
...snip...

 
## Move to frame 4, where the thread is attempting to acquire the mutex:

(gdb) f 4
#4  my_mutex_lock (mp=0x3c1e2e0 <federated_mutex>) at ../../../include/thr_mutex.h:191
191      return native_mutex_lock(&mp->m_u.m_native);

## Print the mutex to obtain the current owner

(gdb) p *mp
$2 = {m_u = {m_native = {__data = {__lock = 2, __count = 0, __owner = 304040, __nusers = 1, __kind = 3, __spins = 0, __elision = 0, __list = {__prev = 0x0,    __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\250\243\004\000\001\000\000\000\003", '\000' <repeats 22 times>, __align = 2}, m_safe_ptr = 0x2}}

## Locate the owner

(gdb) thread find 304040
Thread 347 has target id 'Thread 0x7ee53f7f1640 (LWP 304040)'

### Thread 347 ### 

## Switch to thread 347

(gdb) t 347
[Switching to thread 347 (Thread 0x7ee53f7f1640 (LWP 304040))]
#0  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x3bb4ea8 <THR_LOCK_servers+8>, expected=expected@entry=3, clockid=clockid@entry=0,                          abstime=abstime@entry=0x0, private=<optimized out>, cancel=cancel@entry=false) at futex-internal.c:103
103      switch (err)

## Get backtrace

(gdb) bt
#0  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x3bb4ea8 <THR_LOCK_servers+8>, expected=expected@entry=3, clockid=clockid@entry=0,  abstime=abstime@entry=0x0, private=<optimized out>, cancel=cancel@entry=false) at futex-internal.c:103
#1  0x00007f123ca8727c in __GI___futex_abstimed_wait64 (futex_word=futex_word@entry=0x3bb4ea8 <THR_LOCK_servers+8>, expected=expected@entry=3, clockid=clockid@entry=0,  abstime=abstime@entry=0x0, private=<optimized out>) at futex-internal.c:128
#2  0x00007f123ca8fcbb in __pthread_rwlock_rdlock_full64 (abstime=0x0, clockid=0, rwlock=0x3bb4ea0 <THR_LOCK_servers>) at /usr/src/debug/glibc-2.34-168.el9_6.23.x86_64/nptl/pthread_rwlock_common.c:460
#3  ___pthread_rwlock_rdlock (rwlock=0x3bb4ea0 <THR_LOCK_servers>) at pthread_rwlock_rdlock.c:27
#4  0x0000000000d2d32e in native_rw_rdlock (rwp=0x3bb4ea0 <THR_LOCK_servers>) at ../../../include/thr_rwlock.h:83
#5  inline_mysql_rwlock_rdlock (src_line=958, src_file=0x2c8ef38 "../../../sql/sql_servers.cc", that=0x3bb4ea0 <THR_LOCK_servers>)    at ../../../include/mysql/psi/mysql_rwlock.h:328
#6  get_server_by_name (mem=0x7ee53f7ed9d0, server_name=<optimized out>, buff=0x7ee53f7ed8b0) at ../../../sql/sql_servers.cc:958
#7  0x0000000001ab09ee in get_connection (share=0x7ee53f7edbd0, mem_root=0x7ee53f7ed9d0)    at ../storage/federated/../../../../storage/federated/ha_federated.cc:618
#8  parse_url (mem_root=mem_root@entry=0x7ee53f7ed9d0, share=share@entry=0x7ee53f7edbd0, table=table@entry=0x7ee1cc039480, table_create_flag=table_create_flag@entry=0)    at ../storage/federated/../../../../storage/federated/ha_federated.cc:788
#9  0x0000000001ab934f in get_share (table_name=0x7ee5180d56d8 "./lists/user_federated", table=0x7ee1cc039480)    at ../storage/federated/../../../../storage/federated/ha_federated.cc:1471
#10 0x0000000001ab9a6c in ha_federated::open (this=0x7ee1cc039e80, name=<optimized out>)
...snip...

 
## Move to frame 4, where the thread is attempting to acquire the lock:

(gdb) f 4
#4  0x0000000000d2d32e in native_rw_rdlock (rwp=0x3bb4ea0 <THR_LOCK_servers>) at ../../../include/thr_rwlock.h:83
83      return pthread_rwlock_rdlock(rwp);

## Print the rwlock to obtain the current writer

(gdb) p *rwp
$4 = {__data = {__readers = 11, __writers = 0, __wrphase_futex = 3, __writers_futex = 1, __pad3 = 0, __pad4 = 0, __cur_writer = 291993, __shared = 0,  __rwelision = 0 '\000', __pad1 = "\000\000\000\000\000\000", __pad2 = 0, __flags = 0},   __size = "\v\000\000\000\000\000\000\000\003\000\000\000\001", '\000' <repeats 11 times>, "\231t\004", '\000' <repeats 28 times>, __align = 11}

 
So, here we found that thread 290 is waiting for federated_mutex, which is owned by thread 347 (LWP 304040), while thread 347 is waiting for THR_LOCK_servers, whose current writer is thread 290 (LWP 291993). That is a circular wait between those two threads, and it cascades further: thread 7665 (like any other thread needing LOCK_open, which thread 290 also holds) stalls behind it.
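
For illustration only (this is not the actual server code, and the function names below are just stand-ins for the paths seen in the backtraces), the core of the problem is two code paths taking the same two locks in opposite orders, roughly:

// Minimal self-contained sketch of the inversion, using std::shared_mutex /
// std::mutex as stand-ins for mysql_rwlock_t / mysql_mutex_t. Depending on
// interleaving it may or may not hang on a given run; the point is only the
// opposite acquisition order of the two locks.
#include <mutex>
#include <shared_mutex>
#include <thread>

std::shared_mutex servers_lock;   // stand-in for THR_LOCK_servers (an rwlock)
std::mutex        fed_mutex;      // stand-in for federated_mutex

// Roughly the order thread 290 used: it already holds THR_LOCK_servers for
// write (and LOCK_open) and then blocks in free_share() on federated_mutex.
void closing_path() {
  std::unique_lock<std::shared_mutex> servers(servers_lock);
  std::lock_guard<std::mutex> fed(fed_mutex);                  // free_share()
}

// Roughly the order thread 347 used: get_share() holds federated_mutex while
// get_server_by_name() takes a read lock on THR_LOCK_servers.
void opening_path() {
  std::lock_guard<std::mutex> fed(fed_mutex);                  // get_share()
  std::shared_lock<std::shared_mutex> servers(servers_lock);   // get_server_by_name()
}

int main() {
  std::thread a(closing_path), b(opening_path);
  a.join();
  b.join();
  return 0;
}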

How to repeat:
Use several FEDERATED tables on 8.4.6 pointing at different remote servers, a few of which have slow networking/IO. We could not identify a specific workload pattern that triggers the deadlock.

Suggested fix:
Correct the acquisition order for the mutexes/locks involved in the above deadlock... or, if this won't be fixed (as appears to be the case with most other FEDERATED bugs), then pretty please: mark the engine as deprecated so people will stop using it :-)
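
To be concrete about the first option: the conventional shape of such a fix is to agree on a single acquisition order for the two locks and follow it on every path. A minimal sketch of that idea, again with stand-in locks rather than the actual server code (whether this order is feasible inside ha_federated/sql_servers is for the devs to judge):

// Both paths now take the locks in the same order (servers lock first,
// federated_mutex second), so neither path can hold the second lock while
// waiting for the first one.
#include <mutex>
#include <shared_mutex>
#include <thread>

std::shared_mutex servers_lock;   // stand-in for THR_LOCK_servers
std::mutex        fed_mutex;      // stand-in for federated_mutex

void closing_path_fixed() {
  std::unique_lock<std::shared_mutex> servers(servers_lock);
  std::lock_guard<std::mutex> fed(fed_mutex);
  // ... free_share() work ...
}

void opening_path_fixed() {
  // Take the servers lock before federated_mutex, instead of inside
  // get_server_by_name() while federated_mutex is already held.
  std::shared_lock<std::shared_mutex> servers(servers_lock);
  std::lock_guard<std::mutex> fed(fed_mutex);
  // ... get_share()/parse_url() work ...
}

int main() {
  std::thread a(closing_path_fixed), b(opening_path_fixed);
  a.join();
  b.join();
  return 0;
}

An alternative with the same effect would be to release federated_mutex around the get_server_by_name() call; the essential point is that no thread should wait on one of these locks while holding the other.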