Bug #42914 No LAST_IO_ERROR for max_allowed_packet errors
Submitted: 17 Feb 2009 8:01 Modified: 19 Dec 2009 8:49
Reporter: Matthew Montgomery Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S2 (Serious)
Version:5.1.31, Cluster 6.3.18 OS:Any
Assigned to: Libing Song CPU Architecture:Any

[17 Feb 2009 8:01] Matthew Montgomery
Description:
ROW based binlog format include non-updated columns in before and after image.

The NDB engine provides a mechanism (ndb_log_updated_only=ON|OFF, default ON) to specify whether only updated columns are logged to the binlog or the full row before & after image is logged.  No such mechanism exists for any other storage engines.  This can result in massive binlog entries for tables containing large BLOB or TEXT columns.  These large entries are also responsible for causing replication slave IO thread to break.  

With either...

ERROR: Error in Log_event::read_log_event(): 'Event too big', data_len: NNNNNNNNNN, event_type: 77
or 
[ERROR] Got fatal error 1236: 'error reading log entry' from master when reading data from binary log

You can get around this error by raising the max_allowed_packet on the slave to 2 or more times that of the master.  However this still does not accommodate the largest blob columns as it is limited to only 1GB and would allow for replication of rows with 1 blob column 500M or less.

Used ndb-6.3.18, I'll try to repeat this again on a later version.

How to repeat:
Configure master-slave servers with 

binlog_format=ROW

Master mysql> 
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a INT UNSIGNED PRIMARY KEY, b LONGTEXT, c VARCHAR(10)) engine=myisam;
INSERT INTO t1 VALUES(1, REPEAT(b',@@max_allowed_packet), 'foo');

Slave mysql> 

SELECT a,LENGTH(b),c FROM t1;
+---+-----------+------+
| a | length(b) | c    |
+---+-----------+------+
| 1 |   1048576 | foo  | 
+---+-----------+------+
1 row in set (0.00 sec)

SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
...
                   Last_Errno: 0
                   Last_Error: 
...

Master mysql> 
UPDATE t1 SET c='bar' WHERE a=1; 
Query OK, 1 row affected (0.12 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Slave mysql>
SELECT a,LENGTH(b),c FROM t1;
+---+-----------+------+
| a | LENGTH(b) | c    |
+---+-----------+------+
| 1 |   1048576 | foo  | 
+---+-----------+------+
1 row in set (0.00 sec)

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: 
...
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
...
                   Last_Errno: 0
                   Last_Error: 
...
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 

No error in SHOW SLAVE STATUS output but slave IO thread is stopped and server error log shows:

1:18:18 [ERROR] Got fatal error 1236: 'error reading log entry' from master when reading data from binary log

Suggested fix:
1st, display IO thread errors in SHOW SLAVE STATUS's Last_IO_Errno and Last_IO_Error fields.

2nd, since RBR already has the existing rule that "All tables replicated using row-based replication must have explicit primary keys."  -- http://dev.mysql.com/doc/refman/5.1/en/replication-sbr-rbr.html  We should write to the binlog only those fields which are updated plus the primary keys.  

I understand for NDB there is reasoning for binlogging all columns of an updated row for use potentially by conflict resolution triggers.  However NDB does not log non-updated columns by default.  Even in the case where ndb_log_updated_only=OFF would be required, it is unseemly that a BLOB or TEXT column would be used by a conflict resolution trigger.  There seems to me no reason that non-updated (BLOB or TEXT) columns would be recorded to the binlog for tables of any type.

3rd, (not critical) for NDB tables do not binlog non-updated BLOB/TEXT columns in ndb_log_updated_only=OFF mode.
[17 Feb 2009 8:08] Valeriy Kravchuk
Thank you for the problem report.
[25 Aug 2009 14:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81516

3069 Li-Bing.Song@sun.com	2009-08-25
      Bug #42914 Log event that larger than max_allowed_packet results in stop of slave I/O thread, But there is no Last_IO_Error reported.
      
      On master, if replicated event larger than max_allowed_packet,
      master sends an error message ER_MASTER_FATAL_ERROR_READING_BINLOG to slave.
      This message results in stop of slave I/O thread. 
      On slave, slave I/O thread will stop when receiveing a packet larger than max_allowed_packet.
      In both the cases, there is no Last_IO_Error reported.
      
      This patch adds code to report Last_IO_Error befor slave I/O thread stopping.
      There is another case which results in stop of slave I/O thread, but dose not report a Last_IO_Error.
      This patch fixs it too.
[26 Aug 2009 10:31] Libing Song
We discussed this bug and we will do these:
 1. If a binlog event is larger than max_allowed_packet + MAX_LOG_EVENT_HEADER_LEN on master, then slave must report an error and stop I/O thread.                            |  
 2. Before this bug is closed, the manual should be updated with information about the impact of each related variable on every step on the replication path (all the way from client on master to storage on slave, via dump, io and sql threads).
 3. The problem of not replicating non-changed data should be handled in BUG#14068, not in this report.
[7 Sep 2009 8:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82557

3114 Li-Bing.Song@sun.com	2009-09-07
      Bug #42914 Log event that larger than max_allowed_packet results in stop of slave I/O thread, 
                 But there is no Last_IO_Error reported.
      
      On the master, if a binary log event is larger than max_allowed_packet,
      the error message ER_MASTER_FATAL_ERROR_READING_BINLOG  is
      sent to a slave when it requests a dump from the master, thus leading
      the I/O thread to stop.
      
      On a slave, the I/O thread stops when receiving a packet larger than max_allowed_packet.
      
      In both cases, however, there was no Last_IO_Error reported.
      
      This patch adds code to report the Last_IO_Error before stopping the
      I/O thread and also reports the case the out memory pops up while
      handling packets from the master.
     @ sql/sql_repl.cc
        The master send the Specific reasons instead of "error reading log entry" to the slave which is requesting a dump.
        if an fatal error is returned by read_log_event function.
[14 Sep 2009 14:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83194
[18 Sep 2009 3:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83662
[18 Sep 2009 8:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83698
[18 Sep 2009 13:03] Libing Song
Pushed to mysql5.1-bugteam and mysql-pe
[30 Sep 2009 8:17] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20090929093622-1mooerbh12e97zux) (version source revid:joro@sun.com-20090921090422-jz8bdfmapgq48seg) (merge vers: 6.0.14-alpha) (pib:11)
[1 Oct 2009 8:25] Jon Stephens
Documented bugfix in the 6.0.14 changelog as follows:

        On the master, if a binary log event is larger than
        max_allowed_packet, the error message
        ER_MASTER_FATAL_ERROR_READING_BINLOG is sent to a slave when it
        requests a dump from the master, thus leading the I/O thread to
        stop. On a slave, the I/O thread stops when receiving a packet
        larger than max_allowed_packet.

        In both cases, however, there was no Last_IO_Error reported,
        which made it difficult to determine why the slave had stopped
        in such cases. Now, Last_IO_Error is reported when 
        max_allowed_packet is exceeded, and provides the reason for 
        which the slave I/O thread stopped.

Also added a note about this issue to the description of Last_IO_Error in "SHOW SLAVE STATUS", and a new subsection "Replication and max_allowed_packet" to "Replication Features and Issues". Also added a note about using --ndb-log-update-only to get round this problem in Cluster Replication section.

Set status = NDI, waiting for pushes to 5.1/Cluster/5.4.

Also updated Synopsis to reflect actual issue addressed in this bug report/fix.
[6 Oct 2009 8:59] Bugs System
Pushed into 5.1.40 (revid:joro@sun.com-20091006073316-lea2cpijh9r6on7c) (version source revid:li-bing.song@sun.com-20090918082029-vfhervp1bjgamesh) (merge vers: 5.1.39) (pib:11)
[6 Oct 2009 10:00] Jon Stephens
Also documented in the 5.1.40 changelog. 

-> NDI: Waiting for push to 5.4.
[22 Oct 2009 7:06] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091013094238-g67x6tgdm9a7uik0) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 16:15] Jon Stephens
Also documented in the 5.5.0 changelog.

Closed.
[18 Dec 2009 10:29] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:45] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:00] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:14] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[19 Dec 2009 8:49] Jon Stephens
No new changelog entries required. Closing.