Bug #42914 No LAST_IO_ERROR for max_allowed_packet errors
Submitted: 17 Feb 2009 9:01 Modified: 19 Dec 2009 9:49
Reporter: Matthew Montgomery
Status: Closed
Category:Server: RBR Severity:S2 (Serious)
Version:5.1.31, Cluster 6.3.18 OS:Any
Assigned to: Libing Song Target Version:5.1+
Triage: Triaged: D3 (Medium)

[17 Feb 2009 9:01] Matthew Montgomery
Description:
ROW based binlog format include non-updated columns in before and after image.

The NDB engine provides a mechanism (ndb_log_updated_only=ON|OFF, default ON) to specify
whether only updated columns are logged to the binlog or the full row before & after
image is logged.  No such mechanism exists for any other storage engines.  This can
result in massive binlog entries for tables containing large BLOB or TEXT columns.  These
large entries are also responsible for causing replication slave IO thread to break.  

With either...

ERROR: Error in Log_event::read_log_event(): 'Event too big', data_len: NNNNNNNNNN,
event_type: 77
or 
[ERROR] Got fatal error 1236: 'error reading log entry' from master when reading data
from binary log

You can get around this error by raising the max_allowed_packet on the slave to 2 or more
times that of the master.  However this still does not accommodate the largest blob
columns as it is limited to only 1GB and would allow for replication of rows with 1 blob
column 500M or less.

Used ndb-6.3.18, I'll try to repeat this again on a later version.

How to repeat:
Configure master-slave servers with 

binlog_format=ROW

Master mysql> 
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a INT UNSIGNED PRIMARY KEY, b LONGTEXT, c VARCHAR(10)) engine=myisam;
INSERT INTO t1 VALUES(1, REPEAT(b',@@max_allowed_packet), 'foo');

Slave mysql> 

SELECT a,LENGTH(b),c FROM t1;
+---+-----------+------+
| a | length(b) | c    |
+---+-----------+------+
| 1 |   1048576 | foo  | 
+---+-----------+------+
1 row in set (0.00 sec)

SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
...
                   Last_Errno: 0
                   Last_Error: 
...

Master mysql> 
UPDATE t1 SET c='bar' WHERE a=1; 
Query OK, 1 row affected (0.12 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Slave mysql>
SELECT a,LENGTH(b),c FROM t1;
+---+-----------+------+
| a | LENGTH(b) | c    |
+---+-----------+------+
| 1 |   1048576 | foo  | 
+---+-----------+------+
1 row in set (0.00 sec)

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: 
...
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
...
                   Last_Errno: 0
                   Last_Error: 
...
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 

No error in SHOW SLAVE STATUS output but slave IO thread is stopped and server error log
shows:

1:18:18 [ERROR] Got fatal error 1236: 'error reading log entry' from master when reading
data from binary log

Suggested fix:
1st, display IO thread errors in SHOW SLAVE STATUS's Last_IO_Errno and Last_IO_Error
fields.

2nd, since RBR already has the existing rule that "All tables replicated using row-based
replication must have explicit primary keys."  --
http://dev.mysql.com/doc/refman/5.1/en/replication-sbr-rbr.html  We should write to the
binlog only those fields which are updated plus the primary keys.  

I understand for NDB there is reasoning for binlogging all columns of an updated row for
use potentially by conflict resolution triggers.  However NDB does not log non-updated
columns by default.  Even in the case where ndb_log_updated_only=OFF would be required,
it is unseemly that a BLOB or TEXT column would be used by a conflict resolution trigger.
 There seems to me no reason that non-updated (BLOB or TEXT) columns would be recorded to
the binlog for tables of any type.

3rd, (not critical) for NDB tables do not binlog non-updated BLOB/TEXT columns in
ndb_log_updated_only=OFF mode.
[17 Feb 2009 9:08] Valeriy Kravchuk
Thank you for the problem report.
[25 Aug 2009 16:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81516

3069 Li-Bing.Song@sun.com	2009-08-25
      Bug #42914 Log event that larger than max_allowed_packet results in stop of slave
I/O thread, But there is no Last_IO_Error reported.
      
      On master, if replicated event larger than max_allowed_packet,
      master sends an error message ER_MASTER_FATAL_ERROR_READING_BINLOG to slave.
      This message results in stop of slave I/O thread. 
      On slave, slave I/O thread will stop when receiveing a packet larger than
max_allowed_packet.
      In both the cases, there is no Last_IO_Error reported.
      
      This patch adds code to report Last_IO_Error befor slave I/O thread stopping.
      There is another case which results in stop of slave I/O thread, but dose not
report a Last_IO_Error.
      This patch fixs it too.
[26 Aug 2009 12:31] Libing Song
We discussed this bug and we will do these:
 1. If a binlog event is larger than max_allowed_packet + MAX_LOG_EVENT_HEADER_LEN on
master, then slave must report an error and stop I/O thread.                            |
 
 2. Before this bug is closed, the manual should be updated with information about the
impact of each related variable on every step on the replication path (all the way from
client on master to storage on slave, via dump, io and sql threads).
 3. The problem of not replicating non-changed data should be handled in BUG#14068, not
in this report.
[7 Sep 2009 10:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/82557

3114 Li-Bing.Song@sun.com	2009-09-07
      Bug #42914 Log event that larger than max_allowed_packet results in stop of slave
I/O thread, 
                 But there is no Last_IO_Error reported.
      
      On the master, if a binary log event is larger than max_allowed_packet,
      the error message ER_MASTER_FATAL_ERROR_READING_BINLOG  is
      sent to a slave when it requests a dump from the master, thus leading
      the I/O thread to stop.
      
      On a slave, the I/O thread stops when receiving a packet larger than
max_allowed_packet.
      
      In both cases, however, there was no Last_IO_Error reported.
      
      This patch adds code to report the Last_IO_Error before stopping the
      I/O thread and also reports the case the out memory pops up while
      handling packets from the master.
     @ sql/sql_repl.cc
        The master send the Specific reasons instead of "error reading log entry" to the
slave which is requesting a dump.
        if an fatal error is returned by read_log_event function.
[14 Sep 2009 16:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83194
[18 Sep 2009 5:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83662
[18 Sep 2009 10:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83698
[18 Sep 2009 15:03] Libing Song
Pushed to mysql5.1-bugteam and mysql-pe
[30 Sep 2009 10:17] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20090929093622-1mooerbh12e97zux) (version
source revid:joro@sun.com-20090921090422-jz8bdfmapgq48seg) (merge vers: 6.0.14-alpha)
(pib:11)
[1 Oct 2009 10:25] Jon Stephens
Documented bugfix in the 6.0.14 changelog as follows:

        On the master, if a binary log event is larger than
        max_allowed_packet, the error message
        ER_MASTER_FATAL_ERROR_READING_BINLOG is sent to a slave when it
        requests a dump from the master, thus leading the I/O thread to
        stop. On a slave, the I/O thread stops when receiving a packet
        larger than max_allowed_packet.

        In both cases, however, there was no Last_IO_Error reported,
        which made it difficult to determine why the slave had stopped
        in such cases. Now, Last_IO_Error is reported when 
        max_allowed_packet is exceeded, and provides the reason for 
        which the slave I/O thread stopped.

Also added a note about this issue to the description of Last_IO_Error in "SHOW SLAVE
STATUS", and a new subsection "Replication and max_allowed_packet" to "Replication
Features and Issues". Also added a note about using --ndb-log-update-only to get round
this problem in Cluster Replication section.

Set status = NDI, waiting for pushes to 5.1/Cluster/5.4.

Also updated Synopsis to reflect actual issue addressed in this bug report/fix.
[6 Oct 2009 10:59] Bugs System
Pushed into 5.1.40 (revid:joro@sun.com-20091006073316-lea2cpijh9r6on7c) (version source
revid:li-bing.song@sun.com-20090918082029-vfhervp1bjgamesh) (merge vers: 5.1.39) (pib:11)
[6 Oct 2009 12:00] Jon Stephens
Also documented in the 5.1.40 changelog. 

-> NDI: Waiting for push to 5.4.
[22 Oct 2009 9:06] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version
source revid:alik@sun.com-20091013094238-g67x6tgdm9a7uik0) (merge vers: 5.5.0-beta)
(pib:13)
[22 Oct 2009 18:15] Jon Stephens
Also documented in the 5.5.0 changelog.

Closed.
[18 Dec 2009 11:29] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6)
(version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers:
5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 11:45] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt)
(version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers:
5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 12:00] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0)
(version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers:
5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 12:14] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606)
(version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers:
5.1.41-ndb-7.0.11) (pib:15)
[19 Dec 2009 9:49] Jon Stephens
No new changelog entries required. Closing.