MySQL Bugs: #68490: slave_max_allowed_packet Not Honored on Slave IO Connect

Bug #68490	slave_max_allowed_packet Not Honored on Slave IO Connect
Submitted:	26 Feb 2013 3:24	Modified:	6 May 2013 15:26
Reporter:	Jervin R	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S3 (Non-critical)
Version:	5.5.30	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
It seems that in the event that a slave was disconnected from the master, under certain conditions, upon reconnect, it will report that it received a packet larger the slave_max_allowed_packet.

Using my test below, max_allowed_packet is set to 4M, slave is 1M and slave_max_allowed_packet is 1G and I am trying to update a row with a 1.5M file import to a blob column. This is easily reproducible on 5.5.30. The max_allowed_packet difference is to demonstrate that the problem could be on this value not on slave_max_allowed_packet, if I tried to set both to 4M and find a file about 4M in size as well, I think it would still trigger.

If I do stop slave; start slave; the replication would go through fine.

How to repeat:
This is using MySQL Sandbox and the test file can be anything around 1.5M, have not tested up to what size limits would work.

<script>
#!/bin/bash

VER=$1
PRT=$2
PFX=$3
PTH=/ssd/msb/rsandbox_$PFX
ERR=1

#while [ $ERR ]; do

sbtool -o delete --source_dir=$PTH 2>&1 /dev/null
make_replication_sandbox --sandbox_base_port=$PRT $VER  2>&1 /dev/null

for n in master node1 node2; do
cat >> $PTH/$n/my.sandbox.cnf <<EOF
innodb_file_per_table
binlog_format=ROW
secure-file-priv=/tmp
EOF
done

$PTH/restart_all  2>&1 /dev/null
$PTH/master/restart
$PTH/m -e 'set global max_allowed_packet = 1048576*4'

t=$(cat <<EOF
show global variables like '%packet%';
CREATE TABLE tblblobs (
  id int(11) NOT NULL AUTO_INCREMENT,
  activity_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  blob_name char(16) NOT NULL COMMENT 'unique blob name',
  status enum('INCOMPLETE','UNSUBMITTED','SUBMITTED','REJECTED','APPROVED') DEFAULT 'UNSUBMITTED',
  payload mediumblob NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY blob_name (blob_name)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1;
INSERT INTO tblblobs (blob_name, payload) VALUES('test', LOAD_FILE('/tmp/percona-toolkit-2.1.7.tar.gz')) ON DUPLICATE KEY UPDATE payload = LOAD_FILE('/tmp/percona-toolkit-2.1.7.tar.gz');
INSERT INTO tblblobs (blob_name, payload) VALUES('test', LOAD_FILE('/tmp/percona-toolkit-2.1.7.tar.gz')) ON DUPLICATE KEY UPDATE payload = LOAD_FILE('/tmp/percona-toolkit-2.1.7.tar.gz')
EOF
)

echo $t | $PTH/m test
sleep 60
$PTH/s1 -e "SHOW SLAVE STATUS \G"|grep slave_max_allowed_packet
ERR=$?

#done
</script>

I have seen this bug several times, but not on the initial slave I/O connection.  The slave will be connected and replicating just fine, then at some point when a row-based log packet larger than max_packet_size comes through, the slave stops like this:

130221 22:54:18 [ERROR] Error reading packet from server: Got packet bigger than 'max_allowed_packet' bytes ( server_errno=2020)
130221 22:54:18 [ERROR] Log entry on master is longer than slave_max_allowed_packet (1073741824) on slave. If the entry is correct, restart the server with a higher value of slave_max_allowed_packet
130221 22:54:18 [ERROR] Slave I/O: Got a packet bigger than 'slave_max_allowed_packet' bytes, Error_code: 1153
130221 22:54:18 [Note] Slave I/O thread exiting, read up to log 'log-bin.000054', position 507448306

START SLAVE (at the same I/O position where it just stopped) gets it running again.  The slave_max_allowed_packet feature works to handle the log packet larger than max_allowed_packet on the second attempt.

Hello Jervin,

Thank you for the report.
Verified as described.

Regards,
Umesh

Bug #69104 was marked as duplicate of this one.

Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Fixed in 5.5+. Documented in the 5.5.32, 5.6.12, and 5.7.2 changelogs as follows:

        Following disconnection from the master, the slave could under
        certain conditions report erroneously on reconnection that it
        had received a packet that was larger than slave_max_allowed_packet,
        causing replication to fail.

Closed.

Fixed in 5.6+. Documented in the MySQL 5.6.12 and 5.7.2 changelogs as follows:

      Restarting the server after the slave_relay_log_info table had been
      emptied caused mysqld to fail while trying to return an error.

Closed.

Sadly, mentioning public bug number in the commits is not the norm.

5.6$ bzr log -r 4947.1.3 -n0
------------------------------------------------------------
revno: 4947.1.3 [merge]
committer: Manish Kumar<manish.4.kumar@oracle.com>
branch nick: mysql-5.6
timestamp: Mon 2013-03-25 11:41:00 +0530
message:
  BUG#16438800 - SLAVE_MAX_ALLOWED_PACKET NOT HONORED ON SLAVE IO CONNECT
  
  Merge from mysql-5.5 -> mysql-5.6
    ------------------------------------------------------------
    revno: 2875.437.32
    committer: Manish Kumar<manish.4.kumar@oracle.com>
    branch nick: mysql-5.5
    timestamp: Mon 2013-03-25 11:27:12 +0530
    message:
      BUG#16438800 - SLAVE_MAX_ALLOWED_PACKET NOT HONORED ON SLAVE IO CONNECT
            
      Problem - When the slave was disconnected from the master, under certain 
                conditions, upon reconnect, it will report that it received a 
                packet larger the slave_max_allowed_packet which causes the
                replication to stop.
       
      Analysis -The reason of this failure is that on reconnect
                the slave sets the max_allowed_packet from the master's mi->mysql
                object which keeps the max_allowed_packet as 1MB. This causes the 
                slave to report such error on recieving packet bigger than 1MB. 
                START SLAVE on the slave fixes the problem since it restarts
                slave threads which initializes the max_allowed_packet to
                slave_max_allowed_packet.
            
      Fix - The problem is fixed by some code refactoring and introduction of a new
            function which updates the max_allowed_packet for the THD object of the
            slave thread and the mysql->options max_allowed_packet.