MySQL Bugs: #1760: no simple way to tell if slave SQL thread has completed relay log

Bug #1760	no simple way to tell if slave SQL thread has completed relay log
Submitted:	5 Nov 2003 10:38	Modified:	5 Nov 2003 13:25
Reporter:	William Tam	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S4 (Feature request)
Version:		OS:
Assigned to:		CPU Architecture:	Any

Description:
It would be useful to have simple programatic way to find out if the slave SQL_THREAD has executed all events in the relay log.  This would be especially useful for automating failover in scripted high-availability systems.  If user wants to CHANGE MASTER TO on a slave that may have a large number of events still to "catch up", and the original master is unavailable, the user cannot execute the CHANGE MASTER TO with an UNTIL clause because he can't get the offset and logfile name from SHOW MASTER STATUS (since the master is unavailable).

How to repeat:
Only current way to find out if the slave SQL_THREAD has executed all events in th e relay log is to execute "show status" and look for text "Has read all relay log" in result.

Suggested fix:
Add "relay_log_status" to SHOW SLAVE STATUS; and/or add capability to MASTER_POS_WAIT() function to allow it capability to block until all commands in current relay log is completed.

Hi!

>Suggested fix:
>Add "relay_log_status" to SHOW SLAVE STATUS

In 4.1.1 (not released yet) we added a column Slave_IO_State which is the copy of what there is in SHOW PROCESSLIST for the slave I/O thread. This is not what you are looking for; you are looking for the same thing but for the slave SQL thread. I will put this in discussion internally.
As you said, a workaround is simply:
SHOW PROCESSLIST | grep "Has read all relay log"
(the slave SQL thread is the only thread which can have this state).

>and/or add capability to
>MASTER_POS_WAIT() function to allow it capability to block until all
>commands in current relay log is completed.

You could achieve this with START SLAVE UNTIL RELAY_LOG_FILE=, RELAY_LOG_POS=:
do a 'ls -l' in the slave's data directory, to find the last relay log and its size, then use this info into the above command. It will do what you suggested for MASTER_POS_WAIT() (which waits for-the-master, so is not meant for cases where there is no more master).

Thanks for your suggestions!

Regards,
Guilhem

Thanks very much for the quick reply, and thanks for looking at adding Slave_SQL_State to SHOW SLAVE STATUS!  

The idea that "...you could achieve this with START SLAVE UNTIL RELAY_LOG_FILE=,
RELAY_LOG_POS=:" is a somewhat more interesting approach than looping through every second or so to SHOW PROCESSLIST | grep "Has read all relay log"
Again, from a dev standpoint, I could do some not-to-terribly complicated regexp on the results of "ls -l" to get the args.  Still, I must then somehow poll the slave thread every second or so to see if it's done yet (ie still running) -- a somewhat ugly hack.    

I guess what I'm really looking for is a slave version of the MASTER_POS_WAIT() function that "waits-on-slave", defaults to the values in relay-log.info for its start point, and stops blocking the first time it runs out of relay log!  Not too much to ask, eh?  In the meantime, I'll do a workaround, but I think that this would be a very helpful thing for developing high-availability systems.  Thanks again!