MySQL Bugs: #3078: issue with replicating large sql statements

Bug #3078	issue with replicating large sql statements
Submitted:	5 Mar 2004 14:39	Modified:	9 Mar 2004 11:39
Reporter:	Chris Stauffer	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S2 (Serious)
Version:	4.0.18	OS:	Linux (Redhat 9.0)
Assigned to:	Dean Ellis	CPU Architecture:	Any

Description:
I noticed that sometimes when you have a large insert statement (such as one that is created from a large dump file with --opt option), that it can cause a slave to error out.  What appeared to be the problem was that the large statement would be cut somewhere in the middle, which would create invalid sql, which would stop the slave server(s).

How to repeat:
This error happened about 3 or 4 times, but then worked.

The error occorred when I executed an insert statement that occurred from a dump file of a table that had around 750,000 records in it generated with the --opt option.

Suggested fix:
maybe only allow different binary relay log files to be generated AFTER reaching the end of a SQL statement

Could you verify whether or not your max_allowed_packet was large enough to accept this query?

From the looks of it, my max packet size is 1048576 
(I just left it as the default).
The size of the big table's sql inserts was around 1031159.  So that just have been the issue.  The one thing that I don't understand, is that I ran the same thing about 4 times.  If that was the issue, it would not have ran successuffly the 4th time.  I just set the binary log name, and then set the position to 0, and let the slave catch up that way.  So it would have had to go over the same place all 4 times, and therefore, should have not worked the 4th time.

I am not having any success repeating the problem; I've tried various lengths of SQL statements and server settings, and cannot reproduce it with 4.0.18 master and slave.

Is the query written completely in the master's binlog and the slave's relay log (ie: where do you see the truncation)?

The error seemed to be that the slave was cutting the sql statement into two seperate binary log files.  I saw the complete statement, but the error seemed to have occurred due to the fact the the sql wasn't considered to be valid.  I'm not quite sure why it did this, and I have since deleted the log files (next time I'll save them).  I have not had this problem since the initial setup of the replication though.

Alright, that gives me something else to test at least.  Thanks for the update.

one other piece of info that may help, when I said it was a "long" insert statement, I should have been more specific, with the --opt option, it produces the inserts using "Extended Inserts".  So basically, the insert statement that it broke on was inserting around 4,000 rows.  Here is the schema of the table it busted on:

CREATE TABLE `activity` (
  `activity_id` int(11) NOT NULL auto_increment,
  `activity_timestamp` datetime NOT NULL default '0000-00-00 00:00:00',
  `activity_ip_address` varchar(15) NOT NULL default '',
  `keyword_id` int(11) NOT NULL default '0',
  `customer_id` int(11) NOT NULL default '0',
  `partner_id` int(11) NOT NULL default '0',
  `partner_refer_id` int(11) NOT NULL default '0',
  `activity_type` int(11) NOT NULL default '18',
  `bad_keyword` varchar(255) default NULL,
  PRIMARY KEY  (`activity_id`)
) TYPE=MyISAM