Bug #55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
Submitted: 2 Aug 2010 7:31 Modified: 9 Feb 2011 20:58
Reporter: Zhenxing He Email Updates:
Status: Closed Impact on me:
None 
Category:Tests: Replication Severity:S3 (Non-critical)
Version:5.1 OS:Any
Assigned to: Alfranio Correia CPU Architecture:Any
Tags: sporadic, test
Triage: Triaged: D3 (Medium)

[2 Aug 2010 7:31] Zhenxing He
Description:
Test case rpl_log_pos.test fails sporadically on PB2 with the following symptom:

rpl.rpl_log_pos                          [ fail ]
        Test ended at 2010-07-31 05:44:01

CURRENT_TEST: rpl.rpl_log_pos
--- /export/home/pb2/test/sb_0-2097694-1280526154.48/mysql-5.1.50-solaris10-sparc-test/mysql-test/suite/rpl/r/rpl_log_pos.result	2010-07-31 00:04:33.000000000 +0300
+++ /export/home/pb2/test/sb_0-2097694-1280526154.48/mysql-5.1.50-solaris10-sparc-test/mysql-test/suite/rpl/r/rpl_log_pos.reject	2010-07-31 06:44:01.068768000 +0300
@@ -12,7 +12,7 @@
 change master to master_log_pos=MASTER_LOG_POS;
 Read_Master_Log_Pos	75
 start slave;
-Last_IO_Error = Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master'
+Last_IO_Error = Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event'
 include/stop_slave.inc
 show master status;
 File	Position	Binlog_Do_DB	Binlog_Ignore_DB

mysqltest: Result length mismatch

How to repeat:
Don't know how to reproduce it manually, please check the PB2 for similar failures.

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_output_id=612629aed...

The latest failure is:

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&search=yes&push_id=14527...
[10 Sep 2010 10:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/117994

3503 Alfranio Correia	2010-09-10
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      Disabled the test case.
[27 Sep 2010 9:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119150

3504 Alfranio Correia	2010-09-27
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      There are two calls to read_log_event() on master in mysql_binlog_send(),
      each will read 19 bytes, and the error of the second read_log_event() will
      be reported to slave.
      
      The second read_log_event() will start from position 94 (75 + 19) to 113
      (75 + 19 + 19), the end position 113 is larger than the size 106 of binary
      log that only have a FD, but because there is a call to mtr.suppression:
      
      call mtr.add_suppression ("Slave I/O: Got fatal error 1236 from master when reading data from binary");
      
      which will write some binary events to the file, and so normally the
      size of the binlog on master before slave connection will be large enough
      for the second read_log_event(). 
      
      However, mysql_binlog_send() does not use the same IO_CACHE that is used to write 
      into binary log (i.e. mysql_bin_log.log_file) for the hot binary log. It opens
      the binary log file directly by calling open_binlog() and create a separated
      IO_CACHE. So there is a possibily that after a master has flushed the binary log
      file, the content has been cached by the filesystem, and has not yet updated the
      on disk file yet. If this happens, then a slave will only see part of the file,
      and thus the second read_log_event() will report event truncated error.
[28 Sep 2010 8:50] Bugs System
Pushed into mysql-5.1 5.1.52 (revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (version source revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (merge vers: 5.1.52) (pib:21)
[29 Sep 2010 8:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119353

3504 Alfranio Correia	2010-09-29 [merge]
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      There are two calls to read_log_event() on master in mysql_binlog_send(),
      each will read 19 bytes, and the error of the second read_log_event() will
      be reported to slave.
      
      The second read_log_event() will start from position 94 (75 + 19) to 113
      (75 + 19 + 19), the end position 113 is larger than the size 106 of binary
      log that only have a FD, but because there is a call to mtr.suppression:
      
      call mtr.add_suppression ("Slave I/O: Got fatal error 1236 from master when reading data from binary");
      
      which will write some binary events to the file, and so normally the
      size of the binlog on master before slave connection will be large enough
      for the second read_log_event(). 
      
      However, mysql_binlog_send() does not use the same IO_CACHE that is used to write 
      into binary log (i.e. mysql_bin_log.log_file) for the hot binary log. It opens
      the binary log file directly by calling open_binlog() and create a separated
      IO_CACHE. So there is a possibily that after a master has flushed the binary log
      file, the content has been cached by the filesystem, and has not yet updated the
      on disk file yet. If this happens, then a slave will only see part of the file,
      and thus the second read_log_event() will report event truncated error.
[1 Oct 2010 15:45] Andrei Elkin
bug#54988 is a duplicate for this one.
[14 Oct 2010 8:39] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (merge vers: 5.1.51-ndb-7.0.20) (pib:21)
[14 Oct 2010 8:54] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (merge vers: 5.1.51-ndb-6.3.39) (pib:21)
[14 Oct 2010 9:12] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (merge vers: 5.1.51-ndb-6.2.19) (pib:21)
[9 Nov 2010 19:49] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (merge vers: 5.1.50) (pib:21)
[13 Nov 2010 16:16] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (merge vers: 5.1.50) (pib:21)
[13 Nov 2010 16:42] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:marko.makela@oracle.com-20100824081003-v4ecy0tga99cpxw2) (pib:21)
[3 Jan 2011 15:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127802

3532 Alfranio Correia	2011-01-03
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      There are two calls to read_log_event() on master in mysql_binlog_send().
      Each call reads 19 bytes in this test case and the error of the second
      read_log_event() is reported to the slave.
      
      The second read_log_event() starts from position 94 (75 + 19) to 113
      (75 + 19 + 19). Usually, there are two events in the binary log:
      
          . 0   - 3   - Header
          . 4   - 105 - Format Descriptor Event
          . 106 - 304 - Query Event
      
      and both reads fail because operations are reading from invalid positions
      as expected.
      
      However, mysql_binlog_send() does not use the same IO_CACHE that is used to
      write into binary log (i.e. mysql_bin_log.log_file) for the hot binary log.
      It opens the binary log file directly by calling open_binlog() and creates a
      separated IO_CACHE. So there is a possibly that after a master has flushed
      the binary log file, the content has been cached by the filesystem, and has
      not updated the disk file. If this happens, then a slave will only see part
      of the file, and thus the second read_log_event() will report event truncated
      error.
      
      To fix the problem, if the first read_log_event() has failed, we ensure that
      the second one will try to read from the same position.
[27 Jan 2011 18:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129804

3532 Alfranio Correia	2011-01-27
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      There are two calls to read_log_event() on master in mysql_binlog_send().
      Each call reads 19 bytes in this test case and the error of the second
      read_log_event() is reported to the slave.
      
      The second read_log_event() starts from position 94 (75 + 19) to 113
      (75 + 19 + 19). Usually, there are two events in the binary log:
      
          . 0   - 3   - Header
          . 4   - 105 - Format Descriptor Event
          . 106 - 304 - Query Event
      
      and both reads fail because operations are reading from invalid positions
      as expected.
      
      However, mysql_binlog_send() does not use the same IO_CACHE that is used to
      write into binary log (i.e. mysql_bin_log.log_file) for the hot binary log.
      It opens the binary log file directly by calling open_binlog() and creates a
      separated IO_CACHE. So there is a possibly that after a master has flushed
      the binary log file, the content has been cached by the filesystem, and has
      not updated the disk file. If this happens, then a slave will only see part
      of the file, and thus the second read_log_event() will report event truncated
      error.
      
      To fix the problem, if the first read_log_event() has failed, we ensure that
      the second one will try to read from the same position.
[28 Jan 2011 1:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/129824

3555 Alfranio Correia	2011-01-28
      BUG#55675 rpl.rpl_log_pos fails sporadically with error binlog truncated in the middle
      
      There are two calls to read_log_event() on master in mysql_binlog_send().
      Each call reads 19 bytes in this test case and the error of the second
      read_log_event() is reported to the slave.
      
      The second read_log_event() starts from position 94 (75 + 19) to 113
      (75 + 19 + 19). Usually, there are two events in the binary log:
      
          . 0   - 3   - Header
          . 4   - 105 - Format Descriptor Event
          . 106 - 304 - Query Event
      
      and both reads fail because operations are reading from invalid positions
      as expected.
      
      However, mysql_binlog_send() does not use the same IO_CACHE that is used to
      write into binary log (i.e. mysql_bin_log.log_file) for the hot binary log.
      It opens the binary log file directly by calling open_binlog() and creates a
      separated IO_CACHE. So there is a possibly that after a master has flushed
      the binary log file, the content has been cached by the filesystem, and has
      not updated the disk file. If this happens, then a slave will only see part
      of the file, and thus the second read_log_event() will report event truncated
      error.
      
      To fix the problem, if the first read_log_event() has failed, we ensure that
      the second one will try to read from the same position.
[28 Jan 2011 9:00] Andrei Elkin
Approved since the latest patch addresses discussed on irc an optimization issue.
[31 Jan 2011 14:39] Bugs System
Pushed into mysql-5.1 5.1.56 (revid:alfranio.correia@oracle.com-20110131143133-tbrk92p7b691sy7v) (version source revid:alfranio.correia@oracle.com-20110131101101-q2o5o72io1tqvvv2) (merge vers: 5.1.56) (pib:24)
[31 Jan 2011 14:39] Bugs System
Pushed into mysql-5.5 5.5.10 (revid:alfranio.correia@oracle.com-20110131143404-lhpp8lyvdm48z54o) (version source revid:alfranio.correia@oracle.com-20110131131105-7xbz1zb3m2atl0qr) (merge vers: 5.5.10) (pib:24)
[31 Jan 2011 14:41] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:alfranio.correia@oracle.com-20110131143645-adjsz57nh921np4f) (version source revid:alfranio.correia@oracle.com-20110131134438-ravmm40h965x7d5f) (merge vers: 5.6.2) (pib:24)
[31 Jan 2011 14:42] Alfranio Correia
Pushed to 5.1, 5.5 and trunk.
[1 Feb 2011 16:10] Jon Stephens
Changes do not appear to impact end users; closed without further action.
[9 Feb 2011 9:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/130808

3622 Alfranio Correia	2011-02-09
      Enabling test case rpl_row_event_max_size after fixing BUG#55675.
[9 Feb 2011 9:31] Bugs System
Pushed into mysql-trunk 5.6.2 (revid:alfranio.correia@oracle.com-20110209092832-5cs1rbisq14r685c) (version source revid:alfranio.correia@oracle.com-20110209092832-5cs1rbisq14r685c) (merge vers: 5.6.2) (pib:24)
[9 Feb 2011 20:58] Jon Stephens
Re-closed: see my previous comment.