Bug #69196 Master updates Slave but not Master itself
Submitted: 10 May 2013 12:42 Modified: 21 Aug 2014 19:50
Reporter: Mannoj S Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.0.77 OS:Linux (CentOS 5.4 Final)
Assigned to: CPU Architecture:Any
Tags: replication

[10 May 2013 12:42] Mannoj S
Description:
I have a setup master/slave in which applications are pointed only to master. Yesterday Master had got crashed due to "multi bit error on dimm detected" in Front indication panel in orange color.

After reboot I see the master went for recovery then it came up and application started to use the Master normally but still the err msg exist in Front indication panel. But now the slave had stuck out due to primary key constraint for a table.

The problem I face is master has a table data until what exist in binlog of master. But slave has the table data what is not in master binlog. Below is the table details when compared to Master table and Slave table.

Last info in binlog timestamp is 2013-04-29 02:13:11 
System shut down timestamp 2013-04-29 02:54         

     On Master
     ==========
            mysql> select * from audit where id=11298907;
            ------------------------------------------------------------
            | id     | ipaddress      | dated        | msisdn  |
            +----------+----------------+-------------------------------
            |11298907 | 82.25.226.183  | 2013-04-29 02:13:11 | 998282821|
            -----------------------------------------------------------
            mysql> select * from audit where id > 11298907 limit 1;

            -----------------------------------------------------------
            | id     | ipaddress      | dated        |msisdn  |
            +----------+----------------+------------------------------
            |11298908 | 82.25.226.183 | 2013-04-29 04:31:13 | 992828111|
            ------------------------------------------------------------

            On Slave
            ==========

            mysql> select * from audit where id=11298907;
            ------------------------------------------------------------
            | id     | ipaddress      | dated        | msisdn  |
            +----------+----------------+-------------------------------
            |11298907 | 82.25.226.183  | 2013-04-29 02:13:11 | 998282821|
            -----------------------------------------------------------

            mysql> select * from audit where id > 11298907 limit 1;

            -----------------------------------------------------------
            | id     | ipaddress      | dated        |msisdn  |
            +--------+----------------+---------------------------------
            |11298908 | 82.25.226.183 | 2013-04-29 02:13:12 | 762616173|
            ------------------------------------------------------------
Both the servers are ntp synched with GST. And they remain same after Master reboot too. It took almost 2 hrs to bring up Master Server. How come only slave can have the data that Master is not aware of?  

Slave takes information from Master's binlog once it is written to disk, and that is how Slave DB server gets updated. ( I can get convinced when before flushing to disk the DB server got crashed and that is why I get previous data on Master). But what about the lines that found missing for me in Master binlog? 
I don't find any trimmed information from Master err.log . As I see this is a vital flaw when it comes to Slave continuing from point of failure, it throws primary key constraint for the records for new entries from Master to Slave. I'm reporting this as a bug. 

How to repeat:
Hit with a regular TPS of 100 to a Master 5.0.77 Mysql server under SBL that has slave of the same type hearing in the same network. Only normal DMLS with Seconds_Behind_Master=0. Crash the Master mysql server by doing a force reboot. After Master recovery, cross check the auto increments on Master and Slave tables. It says Slave having greater values than Master. Thus proving Master lags behind slave and Slave couldn't hear from Master again.

Suggested fix:
After recovery and rollback of Master DB data provide information about Master binlog position as rolled back checkpoint into binary file to rollback on Slave, whilst relay log info persist in Slave and auto rollback on Slave to the point where Master binlog checkpoint points.
[10 May 2013 16:23] Mannoj S
As this breaks disaster recovery site am moving this to  Critical
[21 May 2013 5:21] Mannoj S
Any update about this bug?
[21 Jul 2014 19:50] Sveta Smirnova
Thank you for the report.

Version 5.0.77 is not supported anymore. Please upgrade to one of current GA versions 5.5 or 5.6 and if same issue happens again provide output of SHOW CREATE TABLE audit, configuration files from master and slave and last binary log before crash.

Actually such bad crashes can lead to data loss from the master's binary log, but I see a change here. I suspect that binary log was simply corrupted, but in very bad way which allowed it to be executed on slave.
[22 Aug 2014 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".