Bug #85034 | Memory leak in GTID replication (crash bug) | ||
---|---|---|---|
Submitted: | 17 Feb 2017 1:02 | Modified: | 2 Mar 2017 13:13 |
Reporter: | Trey Raymond | Email Updates: | |
Status: | Duplicate | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.7.17 | OS: | Red Hat (6.8) |
Assigned to: | | CPU Architecture: | Any |
[17 Feb 2017 1:02]
Trey Raymond
[17 Feb 2017 1:17]
Trey Raymond
FYI, you can get replication running on the described setup with GTID and log_slave_updates both off, and memory will not be an issue, but that drops functionality and a level of redundancy. We could build more interesting test cases around log_slave_updates to control how many copies of a transaction go to each place, but that's blocked by https://bugs.mysql.com/bug.php?id=84973
[20 Feb 2017 21:01]
Trey Raymond
If we get rid of the slaves and MSR and look at the problem on the master only: setting relay_log_info_repository=FILE (with sync_relay_log_info=1) lets replication run without runaway memory use, though of course more slowly, and features such as MSR, as well as monitoring and many other things that depend on TABLE, are unusable. Hopefully that will help figure out where to look: something in how it populates the slave_relay_log_info table seems to be causing the issue.
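For anyone wanting to repeat that diagnostic, the switch to the FILE repository could be sketched roughly as below. This is an assumption about how to apply the settings at runtime, not the exact steps used in the report; in 5.7 the repository type can only be changed while the replication threads are stopped, and the values should also go into my.cnf if they need to survive a restart.

```sql
-- Diagnostic only: a sketch of the FILE-repository configuration described above.
-- The repository type can only be changed while replication threads are stopped.
STOP SLAVE;

-- Keep relay log info in a file instead of mysql.slave_relay_log_info.
SET GLOBAL relay_log_info_repository = 'FILE';

-- Sync the relay log info file after every transaction (the value used in the report).
SET GLOBAL sync_relay_log_info = 1;

START SLAVE;
```

As noted above, this is slower and disables anything that depends on the TABLE repository, so it is useful for narrowing down the problem rather than as a fix.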
[21 Feb 2017 19:40]
Trey Raymond
This has been determined to be two separate but related issues.
On the master (server A in the example), the problem occurs when sync_relay_log_info=1, even though relay_log_info_repository=TABLE. That variable should be ignored unless the repository is FILE; somewhere in the code, the paths get crossed.
On the MSR slave (server C), the memory usage comes from processing logs created with binlog_rows_query_log_events on. Disabling that on the masters stops the leak on the slaves. It's apparently loading the statement text into memory and not releasing it when the GTID is skipped.
Big thanks to Xiang Rao and Przemek Malkowski for their help testing this over the weekend.
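The two workarounds implied by these findings could be applied at runtime roughly as follows. This is a sketch under assumptions: the report only says that sync_relay_log_info=1 triggers the problem on server A, so the value 10000 (the 5.7 default) is an assumed safe choice, not one confirmed in this thread.

```sql
-- Check the current values of the settings involved (run on each server).
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('relay_log_info_repository',
                        'sync_relay_log_info',
                        'binlog_rows_query_log_events');

-- Server A: move sync_relay_log_info off 1.
-- Assumption: 10000 (the 5.7 default) avoids the problem; the report
-- only states that a value of 1 triggers it.
SET GLOBAL sync_relay_log_info = 10000;

-- Masters feeding the multi-source slave: stop writing the original
-- statement text into the binary log alongside row events.
SET GLOBAL binlog_rows_query_log_events = OFF;
```

SET GLOBAL is not persistent in 5.7 and the binlog_rows_query_log_events change only applies to sessions opened afterwards, so the same values would also need to go into my.cnf and existing connections would need to reconnect.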
[23 Feb 2017 18:51]
Trey Raymond
Downgrading this to S2. While it's serious, we found the workaround mentioned above, which has only a minor loss of functionality.
[2 Mar 2017 13:13]
MySQL Verification Team
Hello Trey Raymond, Thank you for the report. It looks like I'll have to mark this older bug as a duplicate of Przemyslaw's newer Bug #85251, which I confirmed today and which has the same issue as reported here. Thanks, Umesh