Bug #83158 Phantom-read problem in semi-sync loss-less replication
Submitted: 27 Sep 2016 4:08 Modified: 16 Nov 2016 11:25
Reporter: team phx (OCA)
Status: Verified
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:MySQL 5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: cluster, phantom-readm, replication, semi-sync

[27 Sep 2016 4:08] team phx
Description:
The bug can be reproduced in a cluster with rpl_semi_sync_master_wait_for_slave_count > 0 if the master restarts before the "Engine Commit" phase. During crash recovery, the last binlog events are committed without the slave's ACK. A phantom read can then occur:

Any client can read these events successfully on the master. But clients will not find them on the slave if the master crashes again shortly afterwards and the slave is promoted to master, manually or automatically.

This problem is also described in this article on the principle of semi-sync replication:

   http://my-replication-life.blogspot.co.uk/2013/09/loss-less-semi-synchronous-replication.h...

How to repeat:
Preparation:

1. Install semi-sync on both master and slave.

Execute 'create database bug_db; use bug_db; create table bug_table (c1 int)' on the master and check that the slave's semi-sync is ON.

2. Execute 'insert into bug_table values(1);' on the master.

3. Disconnect the slave from the master.
 A client connects to the master and sends 'insert into bug_table values(2);'.
 The master will hang while executing this statement (after the binlog is written).

4. Restart the master.

5. The client reconnects to the master and executes 'select * from bug_table;'.
Here we can see that value 2 has been inserted in the storage engine.

6. Kill the master mysqld again and keep it down.

7. Now the client has to read from the slave: execute 'select * from bug_table' on the slave.

The result shows that the client hits a phantom read: value 2, which it read on the master in step 5, is missing on the slave.
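
The steps above can be replayed in a small toy model. This is only a sketch, not MySQL code: the Server class and the insert() helper are hypothetical, and it assumes that semi-sync waits for the slave ACK after the binlog write and before the engine commit, and that crash recovery commits every event found in the binlog.

```python
# Toy model of the reproduction steps; not MySQL code.
# Assumption: the ACK wait sits between the binlog write and the engine
# commit, and crash recovery commits whatever is present in the binlog.

class Server:
    def __init__(self):
        self.binlog = []   # events written to the binary log
        self.engine = []   # rows visible to clients (storage engine)

    def recover(self):
        # Crash recovery: every event found in the binlog gets committed.
        for value in self.binlog:
            if value not in self.engine:
                self.engine.append(value)


def insert(master, slave, value, slave_connected):
    """Return True if the commit finished, False if the master hangs."""
    master.binlog.append(value)      # binlog write
    if not slave_connected:
        return False                 # waiting for an ACK that never arrives
    slave.binlog.append(value)       # slave receives the event and ACKs
    slave.engine.append(value)
    master.engine.append(value)      # engine commit
    return True


master, slave = Server(), Server()
insert(master, slave, 1, slave_connected=True)               # step 2
hung = not insert(master, slave, 2, slave_connected=False)   # step 3
master.recover()                                             # step 4
read_on_master = list(master.engine)                         # step 5
read_on_slave = list(slave.engine)                           # step 7

print(read_on_master)  # [1, 2]
print(read_on_slave)   # [1]
```

In this model the row 2 becomes visible on the master only because recovery replays the binlog, which is the point where no ACK is checked.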

Suggested fix:
Offer a new hook point to solve this problem.

A) A hook point after the binlog is opened.
   Semi-sync could wait for the slave ACK during recovery.

B) A hook point before binlog initialization.
   Roll back the binlog events if necessary.
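
A minimal sketch of how the two options differ, written as a toy recovery function. It is not the submitted patch: the recover() function, its parameters, and the idea that the master knows how many events were acknowledged are all assumptions made for illustration.

```python
# Toy model of the two suggested fixes; not the submitted patch.

def recover(binlog, acked_count, slave_reachable, option):
    """Return the rows the master exposes after crash recovery.

    binlog          -- events found in the master's binlog at startup
    acked_count     -- number of leading events the slave acknowledged
                       (assumed to be known; hypothetical)
    slave_reachable -- whether the slave can be contacted during recovery
    option          -- "A" (wait for the ACK) or "B" (roll back)
    """
    unacked = binlog[acked_count:]
    if not unacked:
        return list(binlog)          # nothing pending, normal recovery
    if option == "A":
        if slave_reachable:
            return list(binlog)      # slave receives and ACKs the tail
        raise TimeoutError("recovery blocked: waiting for slave ACK")
    if option == "B":
        return binlog[:acked_count]  # drop the unacknowledged tail
    raise ValueError(f"unknown option: {option}")


rows_a = recover([1, 2], acked_count=1, slave_reachable=True, option="A")
rows_b = recover([1, 2], acked_count=1, slave_reachable=False, option="B")
print(rows_a)  # [1, 2]
print(rows_b)  # [1]
```

Under option A the master stays unavailable until the slave has the events; under option B it comes up immediately but without the unacknowledged rows. In both cases a client never reads a row on the master that the slave lacks.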
[27 Sep 2016 4:09] team phx
bug description

Attachment: Phantom-readprobleminsemi-syncreplication.pdf (application/pdf, text), 271.98 KiB.

[27 Sep 2016 11:49] Umesh Shastry
Hello!

Thank you for the report and contribution.
Please be informed that in order to submit contributions you must first sign the Oracle Contribution Agreement (OCA). For additional information please check http://www.oracle.com/technetwork/community/oca-486395.html.
If you have any questions, please contact the MySQL community team - http://www.mysql.com/about/contact/?topic=community

Thanks,
Umesh
[29 Sep 2016 9:19] Umesh Shastry
Hello PhxSQL team!

This is just a follow up message.
Please note that in order to use your contribution, you have to sign the OCA. I explained the OCA in my earlier note.

Kindly ignore above message if you have already signed OCA.

Thanks,
Umesh
[15 Nov 2016 11:59] team phx
Added the before_binlog_init hook point as we mentioned; the diff is based on 5.7.16. We hope this solution will be accepted officially.

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: before_binlog_init.diff (application/octet-stream, text), 3.84 KiB.

[16 Nov 2016 11:24] team phx
More complete modifications we have made (also a diff against 5.7.16).

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: hook_points_for_sync.diff (application/octet-stream, text), 12.78 KiB.

[16 Nov 2016 11:25] team phx
The patch above contains the more complete modifications we have made to MySQL:
1. Add a before_binlog_init hook point for binlog rollback.
2. Add extra parameters to the existing after_flush hook point so that plugins can fetch binlog events.
3. Call init_server_auto_options before init_server_components so that before_binlog_init can get server_uuid.
4. Export the function read_gtids_from_binlog in binlog.cc.
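
Why the call order in item 3 matters can be shown with a toy startup sequence. This is a sketch under assumptions, not MySQL source: the method names are borrowed from the patch description, but their bodies, the hook registry, and the placeholder UUID are hypothetical stand-ins.

```python
# Toy startup sequence; the method bodies are stand-ins, not MySQL source.

class Server:
    def __init__(self):
        self.server_uuid = None
        self.hooks = {"before_binlog_init": []}
        self.seen_by_hook = []

    def init_server_auto_options(self):
        # Stand-in: in this sketch the call only sets a placeholder UUID.
        self.server_uuid = "placeholder-uuid"

    def init_server_components(self):
        # Stand-in: run the hook just before the binlog would be set up.
        for hook in self.hooks["before_binlog_init"]:
            hook(self)


def rollback_hook(server):
    # A plugin would decide here which binlog events to roll back;
    # this sketch only records the UUID it was able to see.
    server.seen_by_hook.append(server.server_uuid)


def start(auto_options_first):
    server = Server()
    server.hooks["before_binlog_init"].append(rollback_hook)
    if auto_options_first:
        server.init_server_auto_options()
        server.init_server_components()
    else:
        server.init_server_components()
        server.init_server_auto_options()
    return server.seen_by_hook


print(start(auto_options_first=False))  # [None]
print(start(auto_options_first=True))   # ['placeholder-uuid']
```

With the original order the hook runs before the UUID exists; with the reordered calls the hook can identify the server it is running on.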
[17 Aug 4:40] Prasad N
I am hitting a similar situation:
The master crashes under heavy load.
There is a failover and the slave is promoted to be the new master.
After crash recovery, the old master comes back and takes the role of a slave.

I can see that the new master does not have all the transactions that seem to have been committed on the old master.

So this seems to defeat the purpose of loss-less failover.
[17 Aug 4:41] Prasad N
So is there any further update on the fix or resolution of this issue?