Bug #69102 Add counter for number of "statements" skipped due to SQL_SLAVE_SKIP_COUNTER
Submitted: 30 Apr 2013 10:52
Modified: 30 Apr 2013 17:45
Reporter: Simon Mudd (OCA)
Status: Verified
Category: MySQL Server: Replication
Severity: S4 (Feature request)
Version: 5.6.11
OS: Any
Assigned to:
CPU Architecture: Any

[30 Apr 2013 10:52] Simon Mudd
Description:
Sometimes it is necessary to skip statements because replication has broken.
This is usually done with something like:

SET GLOBAL SQL_SLAVE_SKIP_COUNTER = XX; START SLAVE;

However, there is no counter recording how often this happens, and if a large group of statements needs to be skipped this information would be useful.

How to repeat:
Look for counters. You won't find one.

Suggested fix:
Create a counter like slave_sql_skipped which records the number of times that this has happened.
This is similar in concept to http://bugs.mysql.com/bug.php?id=69101.

[30 Apr 2013 13:22] MySQL Verification Team
It appears in the slave's error log, but that is not queryable via SQL:

2013-04-30 15:20:15 3984 [Note] 'SQL_SLAVE_SKIP_COUNTER=2' executed at relay_log_file='.\test-relay-bin.000004', relay_log_pos='282', master_log_name='test-bin.000001', master_log_pos='120' and new position at relay_log_file='.\test-relay-bin.000005', relay_log_pos='532', master_log_name='test-bin.000001', master_log_pos='370'
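
Until a counter exists, one workaround is to count these notes in the error log from outside the server. The following is a minimal sketch only: it assumes the note keeps exactly the format shown above (5.6.11; other versions may word it differently), and the function name summarize_skips and the second sample line are made up for illustration.

```python
import re

# Matches the [Note] format quoted above. Assumption: the wording is stable
# across the log being scanned; other MySQL versions may differ.
SKIP_NOTE = re.compile(r"\[Note\] 'SQL_SLAVE_SKIP_COUNTER=(\d+)' executed at ")


def summarize_skips(log_lines):
    """Return (number of skip operations, sum of the skip counter values)."""
    operations = 0
    requested = 0
    for line in log_lines:
        match = SKIP_NOTE.search(line)
        if match:
            operations += 1
            requested += int(match.group(1))
    return operations, requested


sample = [
    # The line quoted in this report.
    r"2013-04-30 15:20:15 3984 [Note] 'SQL_SLAVE_SKIP_COUNTER=2' executed at relay_log_file='.\test-relay-bin.000004', relay_log_pos='282', master_log_name='test-bin.000001', master_log_pos='120' and new position at relay_log_file='.\test-relay-bin.000005', relay_log_pos='532', master_log_name='test-bin.000001', master_log_pos='370'",
    # A made-up unrelated line, to show that non-matching lines are ignored.
    "2013-04-30 15:20:16 3984 [Note] hypothetical unrelated message",
]

print(summarize_skips(sample))  # (1, 2)
```

This still means re-reading the log on every check, which is the inconvenience the requested counter would remove.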

[30 Apr 2013 17:45] Simon Mudd
The reasons for wanting these counters are:
- you might need to skip certain types of error
- you might have that configured and want to know whether you are still getting errors or whether things have recovered (parsing the log files each time is not practical)
- for graphing, this gives you a way of tracking issues over time (normally you would expect this counter not to change)
- if you have a large replication tree, it may be better to have some broken tables than a slave which is not up to date at all (which affects more parts of the database). Stopping to look at each error in turn and deal with it by hand may not be practical as the number of servers you manage grows: initially you just skip (some) errors, then monitor whether the problem is resolved, and finally resolve the issue in a more co-ordinated fashion. These counters make monitoring such issues much easier.
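
To illustrate the graphing and monitoring case, here is a rough sketch of how successive readings of such a counter might be turned into per-interval increments. It assumes a hypothetical counter such as the slave_sql_skipped suggested above, which does not exist today, and assumes it would reset to zero on server restart like other status counters. How the readings are collected is left out, and the sample values are invented.

```python
def counter_increments(samples):
    """Turn successive readings of an increasing counter into per-interval increments.

    A reading lower than the previous one is treated as a server restart
    (counter reset to zero), so the new reading itself is the increment.
    """
    increments = []
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            increments.append(current - previous)
        else:
            increments.append(current)
    return increments


# Invented readings of the hypothetical counter, one per polling interval.
readings = [0, 0, 3, 3, 1]
increments = counter_increments(readings)

print(increments)       # [0, 3, 0, 1]
print(any(increments))  # True: something was skipped during the period
```

A flat series of zeros is the expected normal state; any non-zero increment marks an interval worth investigating.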