MySQL Bugs: #77237: Multi-threaded slave log spamming on failure

Bug #77237	Multi-threaded slave log spamming on failure
Submitted:	4 Jun 2015 0:52	Modified:	1 Apr 2016 1:20
Reporter:	Davi Arnaut (OCA)	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	5.6, 5.6.26	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	MTS, replication

Description:
When a multi-threaded slave stops with an error, the same error message is printed three times.

How to repeat:
1. Start a multi-threaded slave.
2. Insert a row on master.
create table t1 (a int primary key) engine=innodb;
insert into t1 values (1);
3. Lock row on slave
select * from t1 where a = 1 for update;
4. Update row on master;
update t1 set a = 2 where a = 1;
5. Wait for slave to stop.
6. Look at error log.

Error log sample:

2015-06-03 17:45:29 13646 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000001' at position 4520, relay log './mysqld-relay-bin.000006' position: 3838
2015-06-03 17:46:20 13646 [Warning] Slave SQL: Worker 49 failed executing transaction '157c7cfa-0a32-11e5-9002-e83935312055:22' at master log mysql-bin.000001, end_log_pos 4748; Could not execute Update_rows event on table es_foo_1.t1; Lock wait timeout exceeded; try restarting transaction, Error_code: 1205; handler error HA_ERR_LOCK_WAIT_TIMEOUT; the event's master log FIRST, end_log_pos 4748, Error_code: 1205
2015-06-03 17:46:20 13646 [ERROR] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2015-06-03 17:46:20 13646 [Note] Error reading relay log event: slave SQL thread was killed
2015-06-03 17:46:20 13646 [Warning] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2015-06-03 17:46:20 13646 [Warning] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756

Suggested fix:
Don't print the same message three times.

Hello Davi,

Thank you for report and test case.
Observed this with latest 5.6.26 build.

Thanks,
Umesh

// Test case used from Bug #77239

2015-06-04 09:16:00 7429 [Warning] Slave SQL: Worker 4 failed executing transaction '' at master log master-bin.000003, end_log_pos 1808; Error 'Lock wait timeout exceeded; try restarting transaction' on query. Default database: 'test'. Query: 'update t1 set a = 3 where a = 2', Error_code: 1205
2015-06-04 09:16:00 7429 [ERROR] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2015-06-04 09:16:00 7429 [Note] Error reading relay log event: slave SQL thread was killed
2015-06-04 09:16:00 7429 [Warning] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2015-06-04 09:16:00 7429 [Warning] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756

Posted by developer:
 
Fixed as of the upcoming MySQL Server 5.7.12 / 5.6.30 releases, and here's the changelog entry:

When a multi-threaded slave stopped with an error, the same error message
was printed three times. Now, the SQL thread's kill acceptance status is
saved, and only printed once.

Thank you for the bug report.