Bug #103275 MySQL startup hangs when used with memcached
Submitted: 12 Apr 2021 5:41
Modified: 21 Apr 2021 12:02
Reporter: Mershad Irani
Status: Verified
Category: MySQL Server: Memcached
Severity: S2 (Serious)
Version: 8.0.23, 8.0.24
OS: Any
Assigned to:
CPU Architecture: Any

[12 Apr 2021 5:41] Mershad Irani
Description:
When used with the memcached plugin, the mysqld 8.0 process hangs during startup when there is a large number of entries in innodb_memcache.containers.

I understand that memcached is deprecated from 8.0.22 onwards, but I thought it would be worth reporting this bug, since it is easily reproducible and affects production.

On further investigation, we found that this happens when there are a large number of tables and the tables are present in innodb_memcache.containers table and not in the mysql/innodb data dictionary. 

This could be a matter of timing. With fewer entries in innodb_memcache.containers, the race condition between the mysqld_main and daemon_memcached_main threads becomes less likely; for example, only 1 in 5 startups will hang.

The probability of the race condition increases with the number of entries in innodb_memcache.containers.

Here is my high-level understanding of the issue from analyzing the stack trace (I will attach it to the bug):

T1 is mysqld_main
T2 is daemon_memcached_main

T2 needs to run innodb_initialize, for which it holds an MDL lock on innodb_memcache.containers.
T1 needs to acquire the MDL lock on the same table for data dictionary initialization.

T2 now waits for a mutex on the InnoDB DD cache to validate testdb1/sbtest1036.
T1 is holding a mutex on the DD cache to clear innodb_memcache.containers from the DD cache (reset_tables_and_tablespaces).

T1 waits on T2 to release the MDL lock on innodb_memcache.containers.
T2 waits on T1 to release the mutex on the InnoDB DD cache.

This leads to a deadlock.
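
The lock-order inversion described above can be modelled with a small Python sketch. This is not MySQL code: the two `threading.Lock` objects are hypothetical stand-ins for the MDL lock and the DD cache mutex, and the acquisition timeout is an assumption added only so that the demonstration terminates. In mysqld the waits have no such timeout, so both threads would block indefinitely.

```python
import threading


def run(first, second, barrier, results, name, timeout):
    """Hold `first`, wait until the peer holds its own lock, then try `second`."""
    with first:
        barrier.wait()  # both threads now hold exactly one lock each
        got = second.acquire(timeout=timeout)
        results[name] = got
        barrier.wait()  # keep holding `first` until the peer has also tried
        if got:
            second.release()


def simulate(timeout=0.5):
    # Hypothetical stand-ins for the two resources named in the report.
    mdl_containers = threading.Lock()  # MDL lock on innodb_memcache.containers
    dd_cache_mutex = threading.Lock()  # mutex on the InnoDB DD cache
    results = {}
    barrier = threading.Barrier(2)
    # T1 (mysqld_main): holds the DD cache mutex, then wants the MDL lock.
    t1 = threading.Thread(
        target=run,
        args=(dd_cache_mutex, mdl_containers, barrier, results, "T1", timeout),
    )
    # T2 (daemon_memcached_main): holds the MDL lock, then wants the DD cache mutex.
    t2 = threading.Thread(
        target=run,
        args=(mdl_containers, dd_cache_mutex, barrier, results, "T2", timeout),
    )
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


if __name__ == "__main__":
    print(simulate())
```

Neither thread obtains its second lock in this model: each acquisition times out because the other thread still holds the lock it needs. Acquiring the two resources in the same order in both threads would remove the cycle in the sketch.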

How to repeat:
1. Create 10000 tables. At times it is also reproducible with 1000 tables.

sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/insert.lua --mysql-port=3306 --db-driver=mysql --mysql-table-engine=innodb --oltp-tables-count=10000 --oltp-table-size=1 --threads=100 --mysql-user=aaa --mysql-db=testdb1 --mysql-password=<password>     --mysql-host=<host> prepare

2. Insert the entries into innodb_memcache.containers

INSERT INTO `innodb_memcache`.`containers` select concat(table_name,table_schema), table_schema, table_name,'id','k','flags','cas','expiry','PRIMARY' from information_schema.tables where table_schema like 'testdb1' ;

3. Now restart mysqld. You will see that it does not come up.
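
When repeating the restart many times (for example, to estimate how often the hang occurs), a scripted check may help. The following is a minimal sketch that assumes a hung startup never reaches the point of accepting TCP connections on port 3306 (the port used in the sysbench command above). The function name `wait_for_port` and the deadline values are hypothetical and not part of MySQL.

```python
import socket
import time


def wait_for_port(host, port, deadline_s, interval_s=1.0):
    """Return True once a TCP connection to host:port succeeds,
    or False if deadline_s seconds elapse first."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        try:
            # A successful connect is treated as "server has come up".
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: wait and retry.
            time.sleep(interval_s)
    return False


if __name__ == "__main__":
    # Assumed values: adjust host, port and deadline for your environment.
    if wait_for_port("127.0.0.1", 3306, deadline_s=300):
        print("mysqld is accepting connections")
    else:
        print("mysqld did not come up within the deadline (possible hang)")
```

A server with this many tables can legitimately take a while to start, so the deadline should be generous enough to avoid counting a slow startup as a hang.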
[20 Apr 2021 7:30] MySQL Verification Team
Hello Mershad Irani,

Thank you for the report and feedback.
Could you please provide the configuration (my.cnf) details from your environment so that we can reproduce this issue at our end? You may mark it as private after posting here if you wish. Thank you.

regards,
Umesh
[20 Apr 2021 14:41] Mershad Irani
Hi Umesh, 

Thank you for getting back. 

I am able to reproduce this with minimal configuration and no configuration specific to memcached. 

innodb_data_file_path = ibdata1:12M:autoextend
innodb-file-per-table = 1
innodb_log_files_in_group = 2
innodb_log_file_size = 134217728
pid-file = <path>
basedir = <path>
datadir = <path>
innodb-data-home-dir = <path>
innodb-log-group-home-dir = <path>
tmpdir = <path>
[21 Apr 2021 8:43] MySQL Verification Team
Thank you for the details.
Verified as described with 8.0.24 build.

regards,
Umesh
[21 Apr 2021 12:02] Mershad Irani
Correction to the description: 

On further investigation, we found that this happens when there are a large number of tables and the tables are present in innodb_memcache.containers table and not in the mysql/innodb data dictionary. 

-- changes to --  

On further investigation, we found that this happens when there is a large number of entries present in the innodb_memcache.containers table.

The issue is reproducible even if there are no stale entries in the innodb_memcache.containers table.
[30 Oct 2023 14:24] OCA Admin
Contribution submitted via Github - Bug #103275: Fix startup hang when memcached enabled and DB has many tables 
(*) Contribution by Nicholas Othieno (Github lottaquestions, mysql-server/pull/500#issuecomment-1785241299): > Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment: "I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it." Thanks

This contribution is under the OCA signed by Amazon and covering submissions to the MySQL project.

Contribution: git_patch_1577283156.txt (text/plain), 11.05 KiB.

[6 Mar 14:56] Nicholas Othieno
Hi MySQL Open source team,

Checking on the status of the code contribution we sent. The bug is still appearing in MySQL 8.0.36. 

Would we be able to get feedback on the submission?