Bug #72377 misleading statement about locking master while dumping data
Submitted: 18 Apr 2014 4:44 Modified: 13 Aug 2014 11:03
Reporter: Mark Callaghan
Status: Closed
Category: MySQL Server: Documentation Severity: S3 (Non-critical)
Version: 5.6 OS: Any
Assigned to: David Moss CPU Architecture: Any

[18 Apr 2014 4:44] Mark Callaghan
Description:
See http://dev.mysql.com/doc/refman/5.6/en/replication-howto-masterstatus.html

The passage quoted below makes the reader think the master must be read-only for the entire duration of the data dump when setting up a slave. Someone ranted about that, and the rant reached highscalability.com:
http://martin.kleppmann.com/2014/03/26/six-things-about-scaling.html
http://highscalability.com/blog/2014/4/16/six-lessons-learned-the-hard-way-about-scaling-a...

"If you have existing data on your master that you want to synchronize on your slaves before starting the replication process, you must stop processing statements on the master, and then obtain its current binary log coordinates and dump its data, before permitting the master to continue executing statements. If you do not stop the execution of statements, the data dump and the master status information that you use will not match and you will end up with inconsistent or corrupted databases on the slaves."

How to repeat:
read the docs

Suggested fix:
fix the docs
[18 Apr 2014 9:47] MySQL Verification Team
Hello Mark,

Thank you for the report.

Thanks,
Umesh
[24 Jun 2014 14:35] David Moss
Thank you for the feedback.
Having reviewed the blogs you posted and reread the documentation on this a few times, I'm afraid I don't see what needs to be changed here.

My understanding is that the master database has to be "locked" from processing any changes while the data dump is taken to ensure that the data dump is complete. This is what the section of docs you quoted seems to say.

Could you provide a bit more detail on which part of the documentation you feel should be changed?
[18 Jul 2014 15:50] David Moss
Thanks for your additional input. I have added the following to the 5.7 documentation:
"If you have existing data on your master that you want to synchronize on your slaves before starting the replication process, consider how you will capture and transfer this data. When you are using InnoDB, you do not need a read-lock and a transaction that is long enough to transfer the data snapshot is sufficient and the following procedure is not recommended. However, if you are not using InnoDB, you must stop processing statements on the master to obtain a read-lock, then get its current binary log coordinates and dump the data, before permitting the master to continue executing statements. If you do not stop the execution of statements, the data dump and the master status information will not match, resulting in inconsistent or corrupted databases on the slaves."
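
To illustrate the two cases in that paragraph (this sketch is not part of the documentation text), the choice could look roughly like the following. The helper name `build_dump_command` and the output file name are hypothetical; the mysqldump options are the standard ones. With `--single-transaction` and `--master-data`, mysqldump should hold the global read lock only briefly at the start of the dump to read the binary log coordinates, whereas `--lock-all-tables` keeps the master locked for the whole dump.

```python
import shlex
from typing import List


def build_dump_command(all_innodb: bool, output_file: str = "dbdump.sql") -> List[str]:
    """Build a mysqldump argv for seeding a slave (hypothetical helper)."""
    # --master-data=2 writes the binary log coordinates into the dump
    # as a commented-out CHANGE MASTER TO statement.
    cmd = ["mysqldump", "--all-databases", "--master-data=2"]
    if all_innodb:
        # Consistent snapshot inside one transaction; writes on the
        # master continue while the data is being dumped.
        cmd.append("--single-transaction")
    else:
        # Non-transactional tables: hold a global read lock for the
        # full duration of the dump.
        cmd.append("--lock-all-tables")
    cmd.append("--result-file=" + output_file)
    return cmd


if __name__ == "__main__":
    for all_innodb in (True, False):
        print(shlex.join(build_dump_command(all_innodb)))
```

This only assembles the command line; it assumes every table in the dump uses InnoDB when `all_innodb` is true, which the caller has to verify separately.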

Note that there will be some more changes required to completely cover this. Therefore I'm not closing the bug yet.
[13 Aug 2014 11:03] David Moss
Final fixes have been made this week.