Bug #71007 master data is different from slave after master was killed several times
Submitted: 26 Nov 2013 1:44 Modified: 25 Oct 2018 6:48
Reporter: raolh rao Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Replication Severity:S4 (Feature request)
Version:mysql-5.5.30 OS:Any
Assigned to: CPU Architecture:Any

[26 Nov 2013 1:44] raolh rao
Description:
In a replication a master have only one slave, I use sysbench to update data in the master,and the data-updates would be transmited to the slave.
before sysbench's data-updates finish, I killed master server so sysbench data-updates would stop for losing connections. I started master server again.afert master server was ok, I waited slave to recept all master binlogs and reply all of them. after that I used pt-table-checksum to compare master' data with slave data.as we known them must be same. but after I did this procedure several times continuely the master'data was different from slave's ,the reason is there was one row's updation wasn't loged in binary logs, so it couldn't transmit to slave.

How to repeat:
This is my test script:
---------------------------------
config1="$HOME/work/my1.cnf "
config2="$HOME/work/my2.cnf "
function myexc1()
{
  r=`mysql --defaults-file=$config1 -e "$1"`
  echo "$r"
}
function myexc2()
{
  r=`mysql --defaults-file=$config2 -e "$1"`
  echo "$r"
}

function slaveok()
{
  stat=`myexc1 "show master status"`
  pos=`echo $stat | awk -F " " '{print $6}'`
  stat=`myexc2 "show slave status"`
  t=`echo $stat | sed 's/ /\n/g' | grep "$pos" | wc -l`
  if [ "$t" = "2" ]
  then
    echo "ok"
  else
    echo "no"
  fi
}
set -x

function test()
{
  pid=`ps -ef | grep mysqld | grep hzraolh | grep my1.cnf | awk -F " " '{print $2}'`
  if [ -z "$pid" ]
  then
    $HOME/work/vsr_ha/bin/mysqld --defaults-file=$config1 &
    sleep 200
  fi
  while [ `slaveok` != "ok" ]
  do
    sleep 20
  done
  pt-table-checksum --host=127.0.0.1 --user=root --port=4565 --databases=test --tables=sbtest1
  sleep 5
  myexc2 "select this_crc, master_crc from percona.checksums" | grep -v "crc" > 1.res
  r=`cat 1.res | awk '{if($1 != $2){print "fail"; exit}}'`
  echo $r
  if [ "$r" = "fail" ]
  then
   echo "quit!!!!!"
   exit
  fi

  sysbench --test=$HOME/tmp/update_index.lua --oltp-table-size=1000000 --mysql-socket=/tmp/vsrha1.sock --mysql-db=test --oltp-dist-type=uniform --num-threads=20 --max-requests=0 --max-time=1800 --oltp-read-only=off --report-interval=10 run &
  sleep 300
  myexc1 "show status like \"%semi%\"" > 1.status
  stat=`myexc1 "show master status"`
  blog=`echo $stat | awk -F " " '{print $5}'`
  pid=`ps -ef | grep mysqld | grep hzraolh | grep my1.cnf | awk -F " " '{print $2}'`
  kill -9 $pid
  pushd /ssd/raolh/vsr_ha/node1
  sleep 5
  cp $blog "$blog"_bak
  popd
}

function main()
{
  i=1
  while [ $i -le 100 ]
  do 
    test
    i=$(($i+1))
  done
}
main
----------------------

This is my configurations:

master
----------------------

[client]
port		= 4565
socket		= /tmp/vsrha1.sock
user		=root

[mysqld]
port		= 4565
socket		= /tmp/vsrha1.sock
server-id	= 34
basedir = /styx/home/hzraolh/work/vsr_ha
datadir = /ssd/raolh/vsr_ha/node1
skip-external-locking
skip-slave-start
key_buffer_size = 16K
max_allowed_packet = 1M
table_open_cache = 4
sort_buffer_size = 64K
read_buffer_size = 256K
read_rnd_buffer_size = 256K
net_buffer_length = 2K
thread_stack = 128K
general_log=OFF
slow_query_log=OFF
log_error = my-error
log-bin = mysql-bin
relay_log = mysql-relay
binlog_format = ROW

#InnoDB
table_cache              = 512
long_query_time          = 5
max_connections         = 800
query_cache_type         = 0
character-set-server       = utf8
expire_logs_days         = 10
max_binlog_size          = 1000M

innodb_io_capacity        = 800
innodb_buffer_pool_size = 4G
innodb_autoextend_increment   = 128
innodb_log_files_in_group       = 2
innodb_log_file_size            = 1G
innodb_fast_shutdown = 0
innodb_lock_wait_timeout        = 5
innodb_flush_method           = O_DIRECT
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table           = 1
max_allowed_packet           =16M
net_write_timeout              = 12000
sync_binlog                   =1
---------------------------------------

slave
---------------------------------------
[client]
port		= 4566
socket		= /tmp/vsrha2.sock
user		=root

[mysqld]
port		= 4566
socket		= /tmp/vsrha2.sock
server-id	= 35
relay_log_purge = OFF
basedir = /styx/home/hzraolh/work/vsr_ha
datadir = /ssd/raolh/vsr_ha/node2
loose-rpl_semi_sync_slave_enabled = ON
skip-external-locking
skip-slave-start
key_buffer_size = 16K
max_allowed_packet = 1M
table_open_cache = 4
sort_buffer_size = 64K
read_buffer_size = 256K
read_rnd_buffer_size = 256K
net_buffer_length = 2K
thread_stack = 128K
general_log=OFF
slow_query_log=OFF
log_error = my-error
log-bin = mysql-bin
relay_log = mysql-relay

#InnoDB
table_cache              = 512
long_query_time          = 5
max_connections         = 800
query_cache_type         = 0
character-set-server       = utf8
expire_logs_days         = 10
max_binlog_size          = 1000M

innodb_io_capacity        = 800
innodb_buffer_pool_size = 4G
innodb_autoextend_increment   = 128
innodb_log_files_in_group       = 2
innodb_log_file_size            = 1G
innodb_fast_shutdown = 0
innodb_lock_wait_timeout        = 5
innodb_flush_method           = O_DIRECT
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table           = 1
max_allowed_packet           =16M
net_write_timeout              = 12000
sync_binlog                   =1
[28 Nov 2013 2:13] raolh rao
I later made a same test on mysql-5.5.20, and there was no problem on it.so I guess there was same changes made between 5.5.20 and 5.5.30 that cause this problem.
[28 Nov 2013 18:41] Sveta Smirnova
Thank you for the report.

Do you use MyISAM tables? What is the output of SHOW SLAVE STATUS\G on slave?
[29 Nov 2013 2:02] raolh rao
no,I used a InnoDB table named sbtest1 created by sysbench.

I am very sure that slave had replayed all binary logs requested from master, and master had send all its binary logs to slave.

on the master
-----------------------
mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000131 |     7167 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.01 sec)

mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8 MAX_ROWS=1000000
1 row in set (0.00 sec)

on the slave
-----------------------------------------
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 4565
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000131
          Read_Master_Log_Pos: 7167
               Relay_Log_File: mysql-relay.000122
                Relay_Log_Pos: 7314
        Relay_Master_Log_File: mysql-bin.000131
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 7167
              Relay_Log_Space: 8171
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 34
1 row in set (0.00 sec)

mysql> show create table sbtest1\G
*************************** 1. row ***************************
       Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8 MAX_ROWS=1000000
1 row in set (0.00 sec)
[29 Nov 2013 10:49] Sveta Smirnova
Thank you for the feedback.

> I killed master server so sysbench data-updates would stop for losing connections.

How did you kill master server? Did you use `kill -9`?
[1 Dec 2013 10:41] raolh rao
yes I did
[2 Dec 2013 17:11] Sveta Smirnova
Thank you for the feedback.

In this case this is not a bug: `kill -9` crashes server, so it can not perform necessary operations, such as writing last record into binary log. You should use `mysqladmin shutdown` instead.
[3 Dec 2013 1:40] raolh rao
Yes,it is a bug.
we all know when a transction commit,it firstly writes binlog then writes redo log and flush dirty pages in storge engine(InnoDB).so, if a transaction doesn't write binlog, it shouldn't change any data. 
But in this case, `kill -9` crashes server, there is a transaction doesn't write binlog but change data.This problem can cause different data between master and slave.
[16 Jan 2014 17:28] Sveta Smirnova
Thank you for the feedback.

We internally discussed this case and decided that although our user manual clearly says such behavior is possible: "A value of 1 is the safest choice because in the event of a crash you lose at most one commit group from the binary log." (http://dev.mysql.com/doc/refman/5.5/en/replication-options-binary-log.html#sysvar_sync_bin...) we can verify this as feature request: "Add support for transactional replication".
[17 Jan 2014 8:37] raolh rao
Thanks, but I have found  the reason lead to this problem, it's the bug #58785.
[25 Oct 2018 6:48] Erlend Dahl
We will not implement this request.