Bug #77011 Hanging "System Lock" when executing "flush table ... for export"
Submitted: 12 May 2015 6:50 Modified: 18 May 2016 12:19
Reporter: Vincent Meng Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.6 verified, 5.6.25 OS:Any
Assigned to: CPU Architecture:Any
Tags: flush tables for export, hanging system lock

[12 May 2015 6:50] Vincent Meng
Description:
I was trying to run "flush table ... for export" on a table. There weren't any operations on that table, the query should go through. But it didn't and the query was hanging on "System Lock" for a long time. Sometimes it hangs for minutes and sometimes it hangs for hours or days.

The environment is a busy database with a lot of DML. The table is a new table without any data, and there was no operations on it. However, there were other DML running on other tables.

We found the requirements to reproduce this issue are:
1. change buffer is actively used (ibuf_size > 1)
2. the table to be flushed has no records in change buffer (we created a new table to demonstrate this)

How to repeat:
We can now reproduce this issue in a very simple environment. So attached a link to our Vagrant file that was used to reproduce this issue.

https://github.com/ftb-dataengineering/Virtual-Machine/tree/master/system-lock

Here are 4 simple steps to reproduce the hanging system lock:

1. [terminal] vagrant up & vagrant ssh
2. [terminal] sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=50000000 --mysql-db=test --mysql-user=vagrant --mysql-password=vagrant prepare
3. [sql] create table app (id int primary key);
4. [sql] flush tables app for export;

Suggested fix:
We found that in mysql-server/storage/innobase/ibuf/ibuf0ibuf.cc ibuf_contract_in_background function, it's stuck in the while loop. Basically the function still tries to flush the ibuf when there's no data for that specific table_id.

In this line: n_bytes = ibuf_merge(table_id, &n_pag2, FALSE);
When there's no records in change buffer for table A, it should return 0 for n_bytes, hence the loop can be broken later on. However it returns 1 instead of 0.

Eventually the issue is from ibuf_merge_space, there's a line:
++sum_sizes;

For unknown reason it plus 1, and it becomes the return value for n_bytes.

If there's purpose for ++sum_sizes, there should be a check in ibuf_contract_in_background function. Or the ++sum_sizes should be fixed.
[12 May 2015 6:55] Vincent Meng
There is a temporary solution to resolve the hanging system lock:
set global innodb_change_buffer_max_size = 0;
or
set global innodb_change_buffering = noen;

This will clear the change buffer and hence break the while loop.
[12 May 2015 23:00] Vincent Meng
This bug is also reported on Percona
https://bugs.launchpad.net/percona-server/+bug/1454441
[22 May 2015 8:01] Umesh Shastry
Hello Vincent Meng,

Thank you for the report.
Observed this with 5.6.25.

Thanks,
Umesh
[22 May 2015 8:09] Umesh Shastry
test results

Attachment: 77011.txt (text/plain), 18.02 KiB.

[22 May 2015 8:09] Umesh Shastry
Related - Bug #68639
[18 May 2016 12:19] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 5.6.31, 5.7.13 release, and here's the changelog entry:

A FLUSH TABLES ... FOR EXPORT operation appeared to stall. A loop in the
ibuf_contract_in_background function failed to exit. 

Thank you for the bug report.
[26 Oct 2017 17:29] Gene Torres
happens to me on 56.36