Bug #89065 sync_binlog=1 on a busy server and slow binary log filesystem stalls slaves
Submitted: 27 Dec 2017 14:13 Modified: 8 Jan 12:58
Reporter: Riccardo Pizzi Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.35 OS:Any
Assigned to: Bogdan Kecman

[27 Dec 2017 14:13] Riccardo Pizzi
Description:
If the binary logs of a busy server reside on a slow storage (avg seek time 40 to 60 ms) setting sync_binlog=1 gives an hard time to the binlog dump threads, which are unable to send the binlog files to the slaves in a timely manner, and therefore the slaves' I/O thread falls behind beyond hope.

I believe this is due to locks taken when accessing the binlog files, however, such lock shouldn't be taken on already closed files.

This situation can be easily spotted because the binlog dump threads on master stay most of the time in the "Finished reading one binlog; switching to next binlog" state, I have attached a threads dump at the time below.

eg:

>select * from information_schema.processlist where user = 'db05usr';
+---------+---------+---------------------+------+-------------+--------+-------------------------------------------------------+------+-----------+-----------+---------------+-------+
| ID      | USER    | HOST                | DB   | COMMAND     | TIME   | STATE                                                 | INFO | TIME_MS   | ROWS_SENT | ROWS_EXAMINED | TID   |
+---------+---------+---------------------+------+-------------+--------+-------------------------------------------------------+------+-----------+-----------+---------------+-------+
| 1830450 | db05usr | 10.120.25.192:33254 | NULL | Binlog Dump | 703035 | Finished reading one binlog; switching to next binlog | NULL | 703034878 |         0 |             0 | 53569 |
| 2155037 | db05usr | 10.10.31.246:58697  | NULL | Binlog Dump | 483652 | Finished reading one binlog; switching to next binlog | NULL | 483651827 |         0 |             0 | 27157 |
| 1832601 | db05usr | 10.120.25.97:57808  | NULL | Binlog Dump | 701740 | Finished reading one binlog; switching to next binlog | NULL | 701739918 |         0 |             0 |  9691 |
+---------+---------+---------------------+------+-------------+--------+-------------------------------------------------------+------+-----------+-----------+---------------+-------+
3 rows in set (0.01 sec)

Setting sync_binlog to zero (or even 2....) immediately fixes the problem.

I think there is a lot of competition on access to binlog mutexes, and there is little time left for the binlog dump threads to do their job when sync_binlog=1. I remember a similar bug in 5.5, where a very busy master was causing exacty the same issue.

How to repeat:
- DML heavy workload
- slowish binlog storage
- sync_binlog=1
[27 Dec 2017 14:14] Riccardo Pizzi
threads dump during the issue

Attachment: threads_dump.txt (text/plain), 7.20 KiB.

[8 Jan 12:52] Bogdan Kecman
Hi,

Yes, this behavior can easily be reproduced but I don't believe it is a bug. It is fairly well documented what's happening when you set sync_binlog to 1 and setting it to 1 on slow storage is just wrong config, not a bug.

all best
Bogdan
[8 Jan 12:57] Riccardo Pizzi
Well, it is documented that there is a performance penalty,
not that replication stops because the slave threads don't have a change to access the binlog file.

I really believe the locks that protect the binlog subsystem should be changed so that only the current file is protected, not all of them. From what I see, the locks protect *any* access to *any* binlog file, even old ones. There is where in my opinion the bug exists. If we only protect the current file, slave threads are free to access the previous one (for example) and the problem would be mitigated if not solved.

Thanks,
Rick
[8 Jan 12:58] Riccardo Pizzi
Sorry, I meant "don't have a chance" above.