Bug #83079 | long semaphore wait often resulting in crash, DICT_SYS mutex held by statistics | | |
---|---|---|---|
Submitted: | 21 Sep 2016 13:16 | Modified: | 24 Apr 2017 4:45 |
Reporter: | Mikhail Izioumtchenko (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.7 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[21 Sep 2016 13:16]
Mikhail Izioumtchenko
[21 Sep 2016 13:23]
Mikhail Izioumtchenko
The DICT_SYS mutex is used for common operations such as opening a table, so if dict_stats_save() runs into a problem, such as having to wait for something in the buffer pool, everything stops. I see no reason for dict_stats_save() to hold much of anything. Something like this could be implemented: client threads queue statistics-update requests to a statistics thread, and the latter performs them in some standard MySQL way, as opposed to the current use of the internal InnoDB SQL interpreter.
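The proposal above can be sketched as a toy Python model (hypothetical names, not InnoDB code): client threads hand off statistics-update requests to a single background thread, so they never perform the slow save path themselves while holding the dictionary mutex.

```python
import queue
import threading

# Hypothetical sketch of the proposal: clients enqueue requests,
# a single background worker persists the statistics.
stats_queue = queue.Queue()
saved = []

def stats_thread():
    # The only place stats are "saved"; stands in for dict_stats_save().
    while True:
        table = stats_queue.get()
        if table is None:          # shutdown sentinel
            break
        saved.append(table)
        stats_queue.task_done()

def client_request_stats_update(table):
    # A client merely enqueues the request and returns immediately,
    # so it never blocks on the slow save path.
    stats_queue.put(table)

worker = threading.Thread(target=stats_thread)
worker.start()
for t in ("db1.t1", "db1.t2", "db2.t1"):
    client_request_stats_update(t)
stats_queue.join()                 # wait until all requests are processed
stats_queue.put(None)
worker.join()
print(saved)
```

This only illustrates the hand-off pattern; in a real fix the worker would batch and deduplicate requests per table.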
[5 Dec 2016 15:45]
MySQL Verification Team
Fully verified. This bug has been experienced on many occasions; more about the test case in another comment.
[3 Apr 2017 5:22]
Aakanksha Verma
Requested the reporter to provide a reproducible scenario.
[6 Apr 2017 8:59]
Aakanksha Verma
Thanks for the update from the customer. It would be really helpful if we could also get the core file; we request the reporter to kindly upload it to the bug page.
[6 Apr 2017 9:17]
Naga Satyanarayana Bodapati
Posted by developer: From the log sent:

Last time read locked in file row0purge.cc line 853
Last time write locked in file /export/home/pb2/build/sb_0-19387208-1466445627.43/rpm/BUILD/mysql-5.7.13-SR12884268831/mysql-5.7.13-SR12884268831/storage/innobase/dict/dict0stats.cc line 2366

Some observations: dict_stats_save() has taken dict_operation_lock and might also be waiting for the dict_sys mutex:

2366 rw_lock_x_lock(dict_operation_lock);
2367 mutex_enter(&dict_sys->mutex);

So the conclusion that statistics code is holding the dict_sys mutex could be wrong. Is it possible to generate the core with a debug build? We have debug structures in rw_lock that tell us who is holding it. That would greatly help us fix this bug quickly.
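The wait chain described in this comment can be illustrated with a toy Python model (plain locks borrowing the InnoDB names for illustration only, not server code): one thread holds dict_operation_lock while waiting for dict_sys->mutex, which a slow thread holds, so every other thread that needs dict_operation_lock queues up for the whole stall.

```python
import threading
import time

dict_operation_lock = threading.Lock()   # models InnoDB's dict_operation_lock
dict_sys_mutex = threading.Lock()        # models dict_sys->mutex
timeline = []

def slow_holder():
    # e.g. a thread stalled on buffer-pool I/O while holding dict_sys->mutex
    with dict_sys_mutex:
        time.sleep(0.2)
        timeline.append("slow holder done")

def stats_saver():
    # dict_stats_save(): takes dict_operation_lock, then waits on dict_sys->mutex
    with dict_operation_lock:
        with dict_sys_mutex:
            timeline.append("stats saver got both locks")

def table_opener():
    # an ordinary client opening a table needs dict_operation_lock,
    # so it queues behind the stalled stats saver
    with dict_operation_lock:
        timeline.append("table opener ran")

t1 = threading.Thread(target=slow_holder)
t1.start()
time.sleep(0.05)                  # let slow_holder take dict_sys_mutex first
t2 = threading.Thread(target=stats_saver)
t2.start()
time.sleep(0.05)                  # let stats_saver take dict_operation_lock
t3 = threading.Thread(target=table_opener)
t3.start()
for t in (t1, t2, t3):
    t.join()
print(timeline)
```

The table opener finishes last even though it never touches dict_sys->mutex, which matches the symptom that ordinary table opens stall behind the statistics save.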
[7 Apr 2017 14:58]
MySQL Verification Team
Mikhail, as you can see, our development team is unable to repeat the bug. There are simply too many different sequences of acquiring and releasing locks that can lead to this particular, very long semaphore wait. Thank you very much in advance!
[24 Apr 2017 4:45]
Aakanksha Verma
Closing the bug as it is no longer reproducible.
[26 May 2017 16:29]
Stephan Lueckl
We can reproduce this bug on our development system (we have about 40 databases with 500 tables each in our environment). It happens every time we import a database on a running system. We tried to reproduce it with a compiled debug version of mysql as you requested, but in that case the database seems to be too slow and the bug is no longer reproducible. Any suggestions? What else can we do?
[29 May 2017 8:33]
Stephan Lueckl
It is quite easy to reproduce: you only have to reimport a db structure dump (without data) over an existing database. Run it with 2 processes simultaneously and the database will crash. We have uploaded a coredump, our config, and the error log.
[29 May 2017 10:00]
MySQL Verification Team
@[29 May 8:33] Stephan Lueckl: Please attach here, using the Files tab (private if you wish), a zip file of the db structure dump and your my.cnf. I couldn't repeat this with my own db. Thanks in advance.
[29 May 2017 10:10]
Stephan Lueckl
@[29 May 10:00] Miguel Solorzano: We added the two files as requested.
[16 Oct 2017 19:16]
Paul Nylund
This happens constantly when I try to drop a lot of databases:

mysql -uroot -pXXXX -e "show databases" | grep -v sys | grep -v Database | grep -v mysql | grep -v information_schema | gawk '{print "drop database " $1 ";select sleep(0.5);"}' | mysql -uroot -pXXXX

It drops a random number of databases until mysql just hangs and reboots. From the mysql.err log:

2017-10-16T11:33:16.238249Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140574724179712 has waited at row0row.cc line 1073 for 241.00 seconds the semaphore:
S-lock on RW-latch at 0x7fdcd8e5eb20 created in file buf0buf.cc line 1460
a writer (thread id 140574707394304) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: ffffffffdfffffff
Last time read locked in file row0row.cc line 1073
Last time write locked in file /build/mysql-5.7-ESHnMz/mysql-5.7-5.7.19/storage/innobase/btr/btr0cur.cc line 1870
2017-10-16T11:33:16.238378Z 0 [Warning] InnoDB: A long semaphore wait:
[18 Oct 2017 8:44]
Stephan Lueckl
Same here. We can reproduce the bug too and got it verified in #86552, but since June, silence... :(
[13 Dec 2017 15:38]
Stephan Lueckl
Fix for this problem found! Change the variable innodb_purge_threads from its new default value of 4 back to the 5.6 default value of 1 to avoid purge thread contention: innodb_purge_threads=1. Source: https://blog.pythian.com/mysql-crashes-ddl-statement-lesson-purge-threads/
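As a my.cnf fragment, the workaround described above would look like this (a mitigation reported by users in this thread, not an official fix):

```ini
[mysqld]
# Revert to the 5.6 default to reduce purge thread contention
# (workaround reported in this bug; not an official fix)
innodb_purge_threads = 1
```

Note that innodb_purge_threads is not a dynamic variable in 5.7, so the server must be restarted for the change to take effect.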
[22 Aug 2018 16:07]
Ryan Brothers
I've been running into this issue in 5.7 too and can reproduce it. I'm trying innodb_purge_threads = 1 now, but since innodb_purge_threads = 4 is the default in 5.7 and 8.0, is there still an issue to be fixed? It's very easy to reproduce by running concurrent DDL.
[22 Aug 2018 16:20]
Stephan Lueckl
I have information that this bug was fixed in the 5.7.23 and 8.0.12 releases.
[18 Dec 2018 10:42]
Anton Ravich
We faced this bug on 5.7.23 (AWS RDS service) with innodb_purge_threads = 1. Can we reopen this bug?
[10 Jan 2019 14:59]
Przemyslaw Malkowski
Isn't this one actually related to https://bugs.mysql.com/bug.php?id=91977, where the AHI (adaptive hash index) contributes to the problem?