Bug #46793 Slave Fail with "Errno: 1594,Relay log read failure"
Submitted: 18 Aug 2009 19:14 Modified: 6 Oct 2009 7:57
Reporter: James Chen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:v 5.1.37 OS:Linux (CentOS 5.0)
Assigned to: CPU Architecture:Any
Tags: Relay log read failure, replication, salve

[18 Aug 2009 19:14] James Chen
Description:
My Servers are in 1 master(mas1)-2 slaves(slv1 & slv2) mode, i've setup this mode for 2 months ago. The mas1-slv2 mode are runing well, but mas1-slv1 always have a error below nor more than runing almost 20 sql statements, therefor the slave will not sync all the time:
===[ slave's show slave status \G; ]===========================================
mysql> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: mas1
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 5400
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 4723
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1594
                   Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 4578
              Relay_Log_Space: 5701
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1594
               Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
1 row in set (0.00 sec)
===[ (slv1)slave's mysql error log ]===========================================
090819  2:25:11 [ERROR] Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
090819  2:25:11 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
090819  2:25:11 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
090819  2:25:11 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000006' position 4578
===[ (mas1)master's mysqlbinlog ]==============================================
[root@mas1 mysql]#mysqlbinlog -f mysql-bin.000006 --position=4578
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 4
#090819  1:53:49 server id 200  end_log_pos 106         Start: binlog v 4, server v 5.1.37-log created 090819  1:53:49 at startup
# Warning: this binlog is either in use or was not closed properly.
ROLLBACK/*!*/;
BINLOG '
reqKSg/IAAAAZgAAAGoAAAABAAQANS4xLjM3LWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAACt6opKEzgNAAgAEgAEBAQEEgAAUwAEGggAAAAICAgC
'/*!*/;
# at 4578
#090819  2:25:11 server id 200  end_log_pos 4606        Intvar
SET INSERT_ID=66/*!*/;
# at 4606
#090819  2:25:11 server id 200  end_log_pos 4829        Query   thread_id=9     exec_time=0     error_code=0
use Concert/*!*/;
SET TIMESTAMP=1250619911/*!*/;
SET @@session.pseudo_thread_id=9/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
INSERT INTO PrintTicket ( CONCERT_SN, AREA_SN, AREACUBE_SN, AREACUBESEAT_SN, STORE_SN, MDYTIME, ADMINSN ) VALUES ( 1, 2, 1, 73, 1, '2009-08-19 02:25:11', 1 )
/*!*/;
# at 4829
#090819  2:25:11 server id 200  end_log_pos 4900        Query   thread_id=9     exec_time=0     error_code=0
SET TIMESTAMP=1250619911/*!*/;
SET @@session.autocommit=0/*!*/;
BEGIN
/*!*/;
# at 4900
#090819  2:25:11 server id 200  end_log_pos 5122        Query   thread_id=9     exec_time=0     error_code=0
SET TIMESTAMP=1250619911/*!*/;
UPDATE AreaCubeSeat SET STATUS=3, BOOKINGDT='2009-08-19 02:25:11', PAYDT='2009-08-19 02:25:11', BOOKINGADMINSN=1, PAYADMINSN=1, PAYAMOUNT=4500 WHERE SN = 72
/*!*/;
# at 5122
#090819  2:25:11 server id 200  end_log_pos 5149        Xid = 164
COMMIT/*!*/;
# at 5149
#090819  2:25:11 server id 200  end_log_pos 5177        Intvar
SET INSERT_ID=67/*!*/;
# at 5177
#090819  2:25:11 server id 200  end_log_pos 5400        Query   thread_id=9     exec_time=0     error_code=0
SET TIMESTAMP=1250619911/*!*/;
SET @@session.autocommit=1/*!*/;
INSERT INTO PrintTicket ( CONCERT_SN, AREA_SN, AREACUBE_SN, AREACUBESEAT_SN, STORE_SN, MDYTIME, ADMINSN ) VALUES ( 1, 2, 1, 72, 1, '2009-08-19 02:25:11', 1 )
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

How to repeat:
when you use InnoDB and run with transation statements, it happends every time and no more than run than few(maybe 20~30) sql transaction statements.
Even I run the "slave start", the slv1 still can not sync with master.
I should use mysqldump to dump sql from master, and restore to slv1. and login to the slv1 with typeing " change master to Master_Log_File='mysql-bin.000006',Master_Log_Pos=xxxx". The slv1 will run again.

Suggested fix:
when I use "change master to Master_Log_File='mysql-bin.000006',Master_Log_Pos=xxxx", the slave will work without sync again. I can run for few more sql transaction statements, it will go wrong again.
[18 Aug 2009 19:16] James Chen
I think this may a bug ,so can you explain it ,thank you!
[19 Aug 2009 5:25] Sveta Smirnova
Thank you for the report.

Are all servers of same version 5.1.37 or version of master/slave is different? Could you also please provide configuration files from all servers?
[19 Aug 2009 14:10] James Chen
All servers are the same version of 5.1.37.
====[ Hareware information ]=============================================
[mas1]
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            3060  @ 2.40GHz
stepping        : 6
cpu MHz         : 1596.000
cache size      : 4096 KB

MemTotal:      8182632 kB
MemFree:       6639432 kB
Buffers:        188288 kB
Cached:        1138488 kB
SwapCached:          0 kB
Active:        1068516 kB
Inactive:       386196 kB

[slv1 & slv2 are same hardware]
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            3060  @ 2.40GHz
stepping        : 6
cpu MHz         : 1596.000
cache size      : 4096 KB

MemTotal:      4051832 kB
MemFree:       1763760 kB
Buffers:        171096 kB
Cached:        1947696 kB
SwapCached:          0 kB
Active:        1025528 kB
Inactive:      1139848 kB

====[ version information ]=============================================
[root@slv2 ~]#mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 158
Server version: 5.1.37-log MySQL Community Server (GPL) by Remi
mysql> select version();
+------------+
| version()  |
+------------+
| 5.1.37-log |
+------------+
1 row in set (0.00 sec)

====[ (mas1)Master Configure file ]=============================================
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
log-bin=mysql-bin
server-id=200
innodb_flush_log_at_trx_commit=1
sync_binlog=1
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1
max_connections=20480
wait_timeout=450000
max_allowed_packet=100M
default-character-set=utf8
init_connect="SET NAMES UTF8;"

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
====[ (slv1)Slave Configure file ]=============================================
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
log-bin=mysql-bin
server-id=201
report-host=slv1.wis.lemon.cx
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
max_connections=10240
#query_buffer=8M
#myisam_sort_buffer_size=4M
wait_timeout=50
#key_buffer=64M
max_allowed_packet=100M
default-character-set=utf8
init_connect="SET NAMES UTF8;"
innodb_data_file_path=ibdata1:10M:autoextend

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

====[ (slv2)Slave Configure file ]=============================================
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
log-bin=mysql-bin
server-id=202
report-host=slv2.wis.lemon.cx
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
max_connections=10240
#myisam_sort_buffer_size=4M
wait_timeout=50
#key_buffer=64M
max_allowed_packet=100M
default-character-set=utf8
init_connect="SET NAMES UTF8;"
innodb_data_file_path=ibdata1:10M:autoextend

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
[20 Aug 2009 12:24] Susanne Ebrecht
Is the binlog format ROW, STATEMENT or MIXED?
[20 Aug 2009 17:36] James Chen
3 of server's binlog set to statement by default.
Until the problem happen all the time, I changed All of the 3 Sql server set to Mixed right now. But the problem is still there.
[21 Aug 2009 4:15] Susanne Ebrecht
How is the setup?

Is it:

a) Master replicates to Slave 1 and Slave 1 replicates to Slave 2
b) Master replicates to Slave 1 and Master replicates to Slave 2

Which IP and which server ID have all Master and Slaves.

Is there by accident same IP or same server ID ?

Could it be that wrong my.cnf is used because

a) slaves run on same machine

b) another software installed a my.cnf and you didn't recognise it
[21 Aug 2009 17:36] James Chen
=======[ Replication model ]=====================
Master --> Slave(slv1)
Master --> Slave(slv2)
=======[ IP Setting ]=====================
Master (mas1) 10.88.188.200  (server-id=200)
slave  (slv1) 10.88.188.201  (server-id=201)
slave  (slv2) 10.88.188.202  (server-id=202)
=======[ Server runing status ]=====================
All 3 of mysql run on different servers, and also got the right my.cnf.
[8 Sep 2009 8:52] Sveta Smirnova
Thank you for the feedback.

Everything looks same. Have you checked network errors on problem slave? You can find this information in OS error files. Also, please, run mysqlbinlog on problem relay log.
[6 Oct 2009 7:57] James Chen
Finally,

I re-install again, and problems just gone.

However, thank you very much.