Bug #79185 | Innodb freeze running REPLACE statements | |
---|---|---|---
Submitted: | 9 Nov 2015 11:09 | Modified: | 15 Jan 2016 18:42
Reporter: | Will Bryant | Email Updates: |
Status: | Closed | Impact on me: |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical)
Version: | 5.5.46 | OS: | Ubuntu (14.04)
Assigned to: | Shaohua Wang | CPU Architecture: | Any
Tags: | REPLACE hang | |
[9 Nov 2015 11:09]
Will Bryant
[10 Nov 2015 20:11]
MySQL Verification Team
Please provide your my.cnf. Thanks.
[10 Nov 2015 21:16]
Will Bryant
Our config file
Attachment: my.cnf (application/octet-stream, text), 4.69 KiB.
[10 Nov 2015 21:40]
Will Bryant
More data points: with 2 clients it locked up after a few hours, whereas with 4-8 clients doing INSERT only, instead of REPLACE, it ran for hours with no lock-ups.
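[Editorial note] To make the workload above concrete, here is a rough sketch of a multi-client REPLACE stress driver using the MySQL C API. The table t (id INT PRIMARY KEY, val INT), the connection credentials and the client/iteration counts are made up for illustration; the only point is that several connections hammer REPLACE on the same small key range.

/* Hypothetical stress driver for the kind of workload described above.
 * Build roughly as: gcc replace_stress.c $(mysql_config --cflags --libs) -lpthread */
#include <mysql/mysql.h>   /* libmysqlclient-dev; header path may differ per distro */
#include <pthread.h>
#include <stdio.h>

#define N_CLIENTS 8        /* 2-8 clients, as in the report above */
#define N_ROUNDS  1000000

static void *worker(void *arg)
{
    long id = (long) arg;
    char query[128];
    MYSQL *conn;

    mysql_thread_init();                 /* per-thread client library init */
    conn = mysql_init(NULL);
    if (!mysql_real_connect(conn, "127.0.0.1", "test", "test",
                            "test", 3306, NULL, 0)) {
        fprintf(stderr, "connect: %s\n", mysql_error(conn));
        mysql_thread_end();
        return NULL;
    }
    for (long i = 0; i < N_ROUNDS; i++) {
        /* All clients REPLACE into the same small key range, so rows are
         * constantly being deleted and re-inserted.                       */
        snprintf(query, sizeof(query),
                 "REPLACE INTO t (id, val) VALUES (%ld, %ld)",
                 i % 1000, id);
        if (mysql_query(conn, query)) {
            fprintf(stderr, "client %ld: %s\n", id, mysql_error(conn));
            break;
        }
    }
    mysql_close(conn);
    mysql_thread_end();
    return NULL;
}

int main(void)
{
    pthread_t tid[N_CLIENTS];

    mysql_library_init(0, NULL, NULL);   /* thread-safe client init        */
    for (long i = 0; i < N_CLIENTS; i++)
        pthread_create(&tid[i], NULL, worker, (void *) i);
    for (long i = 0; i < N_CLIENTS; i++)
        pthread_join(tid[i], NULL);
    mysql_library_end();
    return 0;
}

Pointed at a scratch database, this approximates the multi-client REPLACE load described above; exact schema and concurrency will need adjusting to match the real setup.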
[18 Nov 2015 18:39]
Pawel Boguslawski
Hello,

We've noticed a similar problem while stress testing the Debian mysql update from 5.5.44-0+deb7u1 to 5.5.46-0+deb7u1 (about 200 simultaneous OTRS user sessions simulated with jmeter, two apache application servers hitting one backend mysql database server over a standard TCP connection). Only one in 10 test iterations caused this problem; we couldn't reproduce it again. No such problem occurred while testing previous mysql-server 5.5.x versions in the past.

Symptoms: the mysqld process is alive, with no errors in syslog or in the mysql error log; cpu & disk are idle; existing db otrs connections (selects, updates, etc.) are all frozen; new db connections freeze on otrs (innodb) database queries too. Some status commands freeze as well (e.g. "show engine innodb status"), others do not (e.g. "show processlist", "show status"). "service mysql stop" does not work - "kill -9" was necessary. The problem occurred in our development environment, so we had time to dump some logs and stats which may be helpful in debugging.

Attached please find logs, additional comments and our mysqld config. Other Debian users have also reported a similar problem after the 5.5.46-0+deb7u1 update: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804214

Thank you & regards,
Pawel
IB Development Team
https://dev.ib.pl/
[18 Nov 2015 18:41]
Pawel Boguslawski
Logs, dumps, comments and mysqld config.
Attachment: mysql-sever-5.5.46-0+deb7u1_crash.tar.gz (application/gzip, text), 17.58 KiB.
[27 Nov 2015 15:31]
MySQL Verification Team
All reporters, please get the thread stack traces at the next hang, using gdb:

gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof mysqld)
[1 Dec 2015 8:09]
Anton Stekanov
gdb backtrace (5.5.46-0+deb8u1)
Attachment: backtrace.log (text/x-log), 799.70 KiB.
[1 Dec 2015 8:10]
Anton Stekanov
> Shane Bester
> gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof mysqld)

Same issue on 5.5.46-0+deb8u1. Backtrace attached.
[3 Dec 2015 8:17]
Laurynas Biveinis
Having analysed the stack traces of Percona Server occurrences (a recent release that has merged the fix for bug 76135), and having tested a preliminary fix, we are pretty certain this is a regression from bug 76135: mutex lock word and waiters flag accesses are not ordered properly, as explained by Kristian Nielsen at https://lists.launchpad.net/maria-developers/msg07860.html. Note that the Percona Server 5.5 InnoDB mutex implementation is identical to the MySQL 5.5 one. If we patch the server so that, e.g. on x86_64, IB_STRONG_MEMORY_MODEL takes precedence over HAVE_IB_GCC_ATOMIC_TEST_AND_SET, the issue disappears. Obviously this is not a fix for other platforms.
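[Editorial note] For readers unfamiliar with the pattern Kristian Nielsen describes, here is a minimal C sketch of the ordering hazard. It is not the actual InnoDB source; the variable and function names are hypothetical, and only the two shared fields (lock word and waiters flag) and the GCC builtins mirror the discussion above.

/* Minimal model of the unlock/wakeup race described above - NOT the real
 * InnoDB code, just an illustration of the missing ordering.              */
#include <stdbool.h>

static volatile unsigned char lock_word; /* nonzero = mutex held           */
static volatile unsigned long waiters;   /* nonzero = a thread may sleep   */

/* Waiting thread, just before it blocks on the sync event. */
static bool announce_and_retry(void)
{
    waiters = 1;       /* "wake me up when you unlock"                     */
    /* Acquire-only TAS: at the memory-model level nothing orders the
     * store to waiters before this probe of the lock word.                */
    return !__atomic_test_and_set(&lock_word, __ATOMIC_ACQUIRE);
}

/* Releasing thread. */
static bool release_and_check_waiters(void)
{
    __atomic_clear(&lock_word, __ATOMIC_RELEASE);   /* lock_word = 0       */
    /* A release store does not order the load below: the CPU may read
     * waiters before other threads can see lock_word == 0.  Fully fenced
     * primitives (the IB_STRONG_MEMORY_MODEL path mentioned above) prevent
     * this; a __sync_synchronize() here would have the same effect.       */
    return waiters != 0;                            /* 0 => nobody to wake */
}

If the releasing thread reads waiters == 0 while the waiter still sees the old lock word, the waiter goes to sleep and is never signalled - matching the symptom reported here (idle CPU, all connections frozen).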
[3 Dec 2015 11:24]
Shaohua Wang
Hi Will, would you please provide detailed steps and the related scripts to reproduce this bug, so that we can verify our fix once we have a solution? Thank you in advance!
[3 Dec 2015 11:27]
Shaohua Wang
Will, could you also provide your CPU/OS/compiler versions? And is it possible to reproduce it on an HDD?
[16 Dec 2015 15:17]
Panayiotis Gotsis
The issue affects us as well. I am uploading the relevant information. The snapshot of the data that I will be sending was taken after we tried to kill the sql queries.
[16 Dec 2015 15:18]
Panayiotis Gotsis
my.cnf, show processes and trace for bug
Attachment: 5.5.46-bug.tar.gz (application/gzip, text), 15.23 KiB.
[16 Dec 2015 15:20]
Panayiotis Gotsis
I forgot to mention:

Distributor ID: Debian
Description: Debian GNU/Linux 7.9 (wheezy)
Release: 7.9
Codename: wheezy

Linux <hostname> 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u6 x86_64 GNU/Linux
[25 Dec 2015 16:44]
Brendon Colby
I believe we experienced this issue as well, multiple times, after upgrading to 5.5.46-0+deb7u1. mysqld would completely hang / freeze, as would all connection attempts to the server (our primary). I had to kill -9 the process. No errors were reported anywhere that I could find; all logging simply stopped. The entire server crashed at one point as well and I had to reboot it.

Distributor ID: Debian
Description: Debian GNU/Linux 7.9 (wheezy)
Release: 7.9
Codename: wheezy

Linux <primary DB server> 3.2.0-4-amd64 #1 SMP Debian 3.2.73-2+deb7u1 x86_64 GNU/Linux

I downgraded to 5.5.44-0+deb7u1 and have been running fine for two days.

The full process list showed no active queries, but that was seconds before the hang. I had the general log turned on too, but I see nothing that stands out to me. We ARE running REPLACE INTO queries, but the last query logged at the time of the hang was an INSERT statement. I didn't think to keep a session open and couldn't open one at the time of the hang, so I don't know what, if any, queries were stuck.

I will attach a backtrace, our config, and the last extended status from seconds before the hang.
[25 Dec 2015 16:45]
Brendon Colby
backtrace, my.cnf, extended status at time of hang
Attachment: 5.5.46_bug.tar.gz (application/x-gzip, text), 9.68 KiB.
[31 Dec 2015 7:04]
MySQL Verification Team
also: http://bugs.mysql.com/bug.php?id=79815
[15 Jan 2016 18:42]
Daniel Price
Fixed as of the upcoming 5.5.49, 5.5.30, 5.7.12, 5.8.0 releases, and here's the changelog entry:

Running REPLACE operations on multiple connections resulted in a hang.

Thank you for the bug report.
[15 Jan 2016 19:17]
Daniel Price
Correction to the previous comment. This bug is fixed as of the 5.5.49, 5.6.30, 5.7.12, and 5.8.0 releases.
[1 Feb 2016 10:45]
Moritz Winterberg
Hi, we are hit by this problem every other day on one of our production Jessie machines. Could you provide a patch or any other workaround as long as 5.5.49 is not released? Or, if not, could you give a rough hint at what date the fix will be released? Thank you!
[5 Feb 2016 22:02]
Brendon Colby
Moritz - we downgraded to 5.5.44 as a workaround and have been running solid ever since.
[17 Mar 2016 16:18]
Inaam Rana
Laurynas, thanks for the insight. We were badly hurt by this. We are going to fall back to IB_STRONG_MEMORY_MODEL until 5.6.30 is out. I'll post an update here on whether it solved the problem for us.
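[Editorial note] For anyone considering the same workaround, a rough sketch of the precedence change Laurynas describes earlier in this report is below. Apart from IB_STRONG_MEMORY_MODEL and HAVE_IB_GCC_ATOMIC_TEST_AND_SET, which come from this thread, the macro names are hypothetical stand-ins; this is not the actual InnoDB header.

/* Hypothetical sketch, not the real InnoDB source: let the x86/x86_64
 * define win the #if chain, so the build keeps the old fully fenced
 * primitives instead of the acquire/release pair added for bug 76135.     */
#if defined(IB_STRONG_MEMORY_MODEL)
/* Old x86 path: an xchg-style test-and-set is used both to take the lock
 * and to clear it, and xchg acts as a full memory barrier on x86.         */
# define IB_MUTEX_TEST_AND_SET(lw)     __sync_lock_test_and_set((lw), 1)
# define IB_MUTEX_RESET_LOCK_WORD(lw)  __sync_lock_test_and_set((lw), 0)
#elif defined(HAVE_IB_GCC_ATOMIC_TEST_AND_SET)
/* Bug-76135 path: acquire-only TAS plus release-only clear, the pair whose
 * missing mutual ordering with the waiters flag is discussed above.       */
# define IB_MUTEX_TEST_AND_SET(lw)     __atomic_test_and_set((lw), __ATOMIC_ACQUIRE)
# define IB_MUTEX_RESET_LOCK_WORD(lw)  __atomic_clear((lw), __ATOMIC_RELEASE)
#endif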
[17 Mar 2016 19:39]
Laurynas Biveinis
Inaam, thank you. I'd also highly recommend fixing bug 79477, even if only for 5.8, so that bugs like this are caught sooner instead of resulting in a server that stalls for sub-second intervals and hangs with some probability.
[31 Mar 2016 13:46]
Inaam Rana
Falling back to IB_STRONG_MEMORY_MODEL indeed solved the problem. We were ending up in lock-ups quite consistently on some of our heavily loaded clusters (almost a node every couple of hours). Now we have been running for a few days without hitting this issue.
[31 Mar 2016 13:57]
Christopher Lörken
Any news on when the fixed versions will be released? This is quite a bad regression bug which forced us to downgrade, and it has now been fixed but unreleased for nearly 3 months... It would be nice to know...