Bug #67840 MySQL replication start inefficient at large RTTs
Submitted: 7 Dec 2012 19:43 Modified: 10 Jan 2013 19:19
Reporter: Domas Mituzas Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1, 5.5, 5.5.30, 5.7.1 OS:Any
Assigned to: CPU Architecture:Any
Triage: Needs Triage: D3 (Medium)

[7 Dec 2012 19:43] Domas Mituzas
Description:
Every time MySQL I/O thread connects to a master, it runs multiple statements such as:

SELECT UNIX_TIMESTAMP()
SHOW VARIABLES LIKE 'SERVER_ID'
SELECT @@GLOBAL.COLLATION_SERVER
SELECT @@GLOBAL.TIME_ZONE (4.x-master-only)

All of them can be merged into single SELECT, thus having only one roundtrip instead of four and avoid lengthy negotiation to start getting replication stream. 

 SELECT UNIX_TIMESTAMP(), @@global.server_id, @@global.collation_server, @@global.time_zone
    -> ;
+------------------+--------------------+---------------------------+--------------------+
| UNIX_TIMESTAMP() | @@global.server_id | @@global.collation_server | @@global.time_zone |
+------------------+--------------------+---------------------------+--------------------+
|       1354909287 |          128076063 | latin1_swedish_ci         | SYSTEM             |
+------------------+--------------------+---------------------------+--------------------+
1 row in set (0.07 sec)

How to repeat:
START SLAVE

Suggested fix:
merge statements into one, if they are really needed
[13 Dec 2012 20:46] Sveta Smirnova
Thank you for the report.

Verified as described, although slave sends these queries in different places of code.
[8 Jan 2013 1:55] Domas Mituzas
Sveta, that depends on how one understands the word "different". It is essentially same function that sends them all, and there's not much conditional logic around that, just subsequent code portions doing same. 

What did you mean by "different places" ?
[10 Jan 2013 18:54] Sveta Smirnova
Domas, I meant these are not 3 queries in row, like:

SELECT UNIX_TIMESTAMP()
SHOW VARIABLES LIKE 'SERVER_ID'
SELECT @@GLOBAL.COLLATION_SERVER

but:

//do some work

SELECT UNIX_TIMESTAMP()

//do more work

SHOW VARIABLES LIKE 'SERVER_ID'

//work more

SELECT @@GLOBAL.COLLATION_SERVER

So I did not analyzed if it is possible to combine them in single query. I believe it is, but left Replication Team to decide.
[10 Jan 2013 19:19] Domas Mituzas
Well, C code is always relatively verbose. In high level language that "do some work" logic would be oneliners and you wouldn't raise this issue, that there's code between statements. Of course there is, but all that code can happen after the single select.
[12 Jan 2013 13:18] Sveta Smirnova
half-fix for this: fixes 5.0+ issue only

Attachment: bug67840.diff (application/octet-stream, text), 3.07 KiB.

[12 Jan 2013 13:21] Sveta Smirnova
Thank you for the feedback.

I finally analyzed it properly and even created half-fix for this bug against 5.5, so for 5.0+ master slave will send single SELECT query instead of two. Although I still think proper fix require more work for developers from the replication team.