Bug #1818 | Replication failed between 4.0.16-4.0.16 on Linux. Reproducible. | ||
---|---|---|---|
Submitted: | 12 Nov 2003 10:05 | Modified: | 21 Jun 2004 11:11 |
Reporter: | Renato Weiner | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 4.0.19 | OS: | Linux (RedHat 7.2 or 7.3) |
Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
[12 Nov 2003 10:05]
Renato Weiner
[13 Nov 2003 2:35]
Renato Weiner
It looks like my 'solution' of splitting up the binlog didn't work either. Today I had another failure. It lasted a bit longer, but replication still doesn't work reliably. Message:

    031113 0:35:26 Error reading packet from server: log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master (server_errno=1236)
    031113 0:35:26 Got fatal error 1236: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master' from master when reading data from binary log
    031113 0:35:26 Slave I/O thread exiting, read up to log 'ib_logbin1.004', position 15978017
[25 Nov 2003 10:12]
Guilhem Bichot
Hi! I'm looking forward to knowing whether using our official binaries solved the problem. Regards, Guilhem
[26 Nov 2003 12:59]
Renato Weiner
I tested with the binaries provided on the website and it still didn't work.
[24 Apr 2004 15:58]
Renato Weiner
I tried replication with versions 4.0.18 and 4.1.1-alpha and got exactly the same error. I have a version with debug on and I'm wondering which functions I should put in the stack trace. Maybe something like: -#d:f,mysql_binlog_send:F:L:t,20 Please advise me, so I can provide more feedback.
[26 Apr 2004 14:00]
Guilhem Bichot
Doing some more tests with Mr. Renato Weiner.
[15 May 2004 19:13]
Guilhem Bichot
Continuing tests with Mr. Weiner
[7 Jun 2004 11:17]
Guilhem Bichot
User is testing on different hardware/OS.
[18 Jun 2004 21:30]
Renato Weiner
Hi Guilhem, As you recommended, I completely switched my OS and now everything is working. In case anybody has this problem: I was using RedHat AS 3.0 with the aacraid module. It randomly truncates the master binary logs, causing the error described in this bug. With the aic7xxx module, replication is now working OK. I recommend checking your OS if you get this error. Thanks, Guilhem, for all the patience and help!
[21 Jun 2004 11:11]
Guilhem Bichot
Glad that your system is now working fine, and that it was not a MySQL problem!
[7 Nov 2005 18:58]
Shengyong Hu
Hi Guilhem, Could you tell us what suggestions you gave to Renato? And what did you modify for the testing? Thanks
[8 Nov 2005 13:48]
Guilhem Bichot
Hello Shengyong,

With Renato, I don't think we ever fully understood what was wrong: the problem appeared on Redhat 7.3 while there were no problems with Redhat AS 3.0, so it may have been a kernel/glibc issue. We ruled out a MySQL bug by demonstrating that the binlog was shrinking, which MySQL cannot be responsible for as it never calls ftruncate() on such files: some statements disappeared from the binlog although they had been there a second earlier.

For this, Renato set up a script which writes the size of the last binlog to a file every second. Something like:

    while true
    do
      ls -l your_binlogs | tail -n1 >> list.txt
      sleep 1
    done

Then, when the error occurred on the slave, he inspected list.txt and found that at some moment the binlog's size had decreased. So we supposed that it was an issue with some hard drive OS driver, glibc... Good luck!
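
For anyone repeating this check, the sampling loop described above could be paired with a small scan that finds the shrink automatically rather than by reading list.txt by hand. The following is only a sketch under stated assumptions: the function names `sample_binlog` and `find_shrinks`, the three-column sample format, and the example directory and prefix are all hypothetical and not what Renato actually ran. It also assumes binlog paths contain no spaces and that `date +%s` is available, as it is on common Linux systems.

```shell
#!/bin/sh
# Hypothetical helpers; neither function comes from the original thread.

# Append one "epoch-seconds size-in-bytes path" line for the newest binlog.
# $1: directory holding the binlogs
# $2: binlog base name (files are assumed to be named <base>.<number>)
# $3: output file for the samples
sample_binlog() {
    newest=$(ls "$1"/"$2".[0-9]* 2>/dev/null | tail -n1)
    [ -n "$newest" ] || return 1
    size=$(wc -c < "$newest" | tr -d ' ')
    echo "$(date +%s) $size $newest" >> "$3"
}

# Print every sample in which a file is smaller than in its previous sample.
# Sizes are tracked per path, so rotation to a new binlog is not reported.
# $1: file written by sample_binlog
find_shrinks() {
    awk '($3 in last) && ($2 + 0 < last[$3] + 0) {
             print $1, $3, last[$3], "->", $2
         }
         { last[$3] = $2 }' "$1"
}

# Example sampling loop (left commented out so that sourcing this file
# does not block); directory and base name are placeholders:
#   while true; do
#       sample_binlog /path/to/datadir your_binlog_base list.txt
#       sleep 1
#   done
```

Running `find_shrinks list.txt` after the slave reports the error prints one line per detected decrease, giving the timestamp, the file, and the old and new sizes. A one-second sampling interval can still miss a truncation that is followed by enough new writes within the same second, so an empty result does not prove the file never shrank.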