Bug #100058 | InnoDB hangs for 500s when renaming table | ||
---|---|---|---|
Submitted: | 1 Jul 2020 8:50 | Modified: | 3 Jul 2020 2:54 |
Reporter: | Xinjing Zhou-from TxSQL Team at Tencent | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.7.29, 5.7.30 | OS: | Linux |
Assigned to: | CPU Architecture: | x86 (24 cores) | |
Tags: | deadlock |
[1 Jul 2020 8:50]
Xinjing Zhou-from TxSQL Team at Tencent
[1 Jul 2020 8:52]
Xinjing Zhou-from TxSQL Team at Tencent
stack trace
Attachment: rename-hang.stack (application/octet-stream, text), 303.18 KiB.
[1 Jul 2020 8:53]
Xinjing Zhou-from TxSQL Team at Tencent
dump of the database "sbtest"
Attachment: sbtest.sql (application/octet-stream, text), 958.72 KiB.
[1 Jul 2020 8:57]
Xinjing Zhou-from TxSQL Team at Tencent
MySQL configuration file
Attachment: my.cnf (application/octet-stream, text), 2.36 KiB.
[1 Jul 2020 8:58]
Xinjing Zhou-from TxSQL Team at Tencent
go program for reproducing the hang
Attachment: rename-update.go (application/octet-stream, text), 7.26 KiB.
[1 Jul 2020 9:08]
Xinjing Zhou-from TxSQL Team at Tencent
Patch for fixing the issue
Attachment: rename-update-hang-fix.patch (application/octet-stream, text), 396 bytes.
[1 Jul 2020 12:38]
MySQL Verification Team
Hi Mr. Zhou, Thank you for your bug report. However, this does not seem to be bug at all. First of all, if you read our Reference Manual, you will find out that RENAME operation waits for ALL transactions that involve this table, in any possible manner, to finish, before RENAME can be performed in the internal cache and dictionary. This is exactly your case. Needless to mention, you are not using latest 5.7 release. Not a bug.
[2 Jul 2020 7:00]
Xinjing Zhou-from TxSQL Team at Tencent
Sorry, I am a bit confused. Why is hanging for 500s is not a bug? The io handler thread is basically doing nothing for 500s. This also means that the io thread won't handle any reads/writes assigned to it, blocking even transactions unrelated to the table being renamed for 500s. You can verify this by running "show processlist;" in MySQL client cmdline when the deadlock occurs. A sample result of the "show processlist;" is attached below. The table is being renamed from "sbtest6tmp" to "sbtest6". There are many blocked updates on unrelated tables. Just verified the same issue on the latest release(using source codes of 5.7.30 from Github).
[2 Jul 2020 7:01]
Xinjing Zhou-from TxSQL Team at Tencent
Sample result of the "show processlist " command when the deadlock occurs.
Attachment: show-processlist.txt (text/plain), 9.12 KiB.
[2 Jul 2020 7:04]
Xinjing Zhou-from TxSQL Team at Tencent
For production servers, hanging for 500s is pretty serious, if not a disaster.
[2 Jul 2020 12:17]
MySQL Verification Team
Hi, As our documentation clearly states, there are several different things that RENAME has to wait for in order to execute. Beside this problem, your installation might not be optimally configured. Not a bug.
[3 Jul 2020 2:54]
Xinjing Zhou-from TxSQL Team at Tencent
"you will find out that RENAME operation waits for ALL transactions that involve this table, in any possible manner, to finish, before RENAME can be performed in the internal cache and dictionary.". I get this as RENAME is a blocking operation that requires exclusive access to the table being renamed. However, in the show-processlist.txt shown above, it is actually blocking other transactions on other tables as well which is not an expected behavior according to the manual. Actually, these blocked transactions logically don't have anything to do with the table being renamed. Unfortunately, the io operations of these unrelated transactions happen to be submitted to the io handler which is being blocked by the deadlock, causing these transactions to hang as well. Maybe the manual could be improved to reflect this. "Beside this problem, your installation might not be optimally configured.". As for the sample installation described above, it is designed for quickly reproducing the bug. Our real workload configuration has the following traits that the sample installation mimics: 1. Change Buffering = All 2. Double Write = On 3. Secondary indexes updates. 4. The working set size > capacity the buffer pool so that change buffering and merging happens. 5. Renaming tables that have secondary indexes updates on them. So long as there are change buffering, secondary index updates, and renaming tables in the workload, it is possible to trigger the hang.
[3 Jul 2020 12:27]
MySQL Verification Team
Yes, RENAME is blocking other transactions. Simply, it has to access two different table caches, which both have been locked. This is partially resolved in 8.0.