Bug #6461 Running an 'analyze table' on master causes replicating slaves to crash
Submitted: 5 Nov 2004 12:37 Modified: 10 Jan 2005 18:00
Reporter: Pete French
Status: Duplicate
Category: MySQL Server: Replication Severity: S3 (Non-critical)
Version: 4.1.7 OS: FreeBSD (FreeBSD 4.10 / Windows 2000)
Assigned to: CPU Architecture: Any

[5 Nov 2004 12:37] Pete French
Description:
We have a master database with a number of slaves replicating
from it - mainly other FreeBSD systems, but also a Windows system.
I have a script which runs analyze table against every table of every
database on the master. This script used to run the command on every
machine - but as 4.1 now inserts these commands into the replication
log we now only run it against the master.

The problem is that when this happens all the slaves - both Unix and
Windows - then crash. I get the following error in the .err log
on the Unix machines:

   mysqld got signal 11;
   This could be because you hit a bug. It is also possible that this binary
   or one of the libraries it was linked against is corrupt, improperly built,
   or misconfigured. This error can also be caused by malfunctioning hardware.
   We will try our best to scrape up some info that will hopefully help diagnose
   the problem, but since we have already crashed, something is definitely wrong
   and this may fail.

On the Unix machines (including the master) MySQL was compiled from
source, configured like this:
./configure --without-debug --without-innodb
The Windows install, however, is the standard binary download
from the MySQL site.

If I run the analyze commands directly on the slave machines themselves they
complete fine. It is only when the analyze command arrives through
replication that the crash occurs.

Note that the same effect occurs with optimize table as well!

How to repeat:
Set up master/slave replication. Run an analyze table or an
optimize table on the master and observe that the slave crashes.
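
A minimal sketch of how the per-table statements described in this report
could be generated, for anyone trying to reproduce it. This is not the
reporter's actual script: the function name, the (database, table) input
list and the example names are hypothetical. The optional
NO_WRITE_TO_BINLOG keyword (available for ANALYZE TABLE and OPTIMIZE TABLE
since MySQL 4.1.1) keeps a statement out of the binary log; it is shown
only as a possible way to run the statements locally on each machine and
was not suggested in this thread.

```python
def build_maintenance_sql(tables, command="ANALYZE", local=False):
    """Return one maintenance statement per (database, table) pair.

    tables  -- iterable of (database, table) name pairs (hypothetical input)
    command -- "ANALYZE" or "OPTIMIZE", the two statements named in this report
    local   -- if True, add NO_WRITE_TO_BINLOG so the statement is not
               written to the binary log and therefore not replicated
    """
    if command not in ("ANALYZE", "OPTIMIZE"):
        raise ValueError("unsupported command: %r" % (command,))
    keyword = " NO_WRITE_TO_BINLOG" if local else ""
    statements = []
    for database, table in tables:
        # Backtick-quote identifiers; double any embedded backtick.
        db = database.replace("`", "``")
        tb = table.replace("`", "``")
        statements.append("%s%s TABLE `%s`.`%s`;" % (command, keyword, db, tb))
    return statements


if __name__ == "__main__":
    # Hypothetical names; the printed statements can be piped into the
    # mysql client on the master.
    for statement in build_maintenance_sql([("shop", "customers"),
                                            ("shop", "orders")]):
        print(statement)
```

With local=False the statements are replicated to the slaves, which is the
situation described above; with local=True they would have to be run on
each server separately.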
[14 Dec 2004 0:34] MySQL Verification Team
I tested 4.1.8 on Windows and Linux Slackware and was unable to
reproduce the reported behavior.
[14 Dec 2004 9:34] Guilhem Bichot
Hi Pete.
Just an addition to Miguel's fine answer. If you ever get a chance to run a MySQL server binary compiled with debug support on a crashing slave, it would be helpful for us to have the slave's debug trace. Run the slave with --debug; that should create a mysqld.trace in /tmp or some other directory. It would be nice if you could compress it and upload it here:
ftp://ftp.mysql.com/pub/mysql/upload/
Thanks!
Guilhem
[14 Dec 2004 12:05] Pete French
I too have been testing - and it is a very hard
bug to reproduce. It only occurs on our live database
(about 2.5 million customer records in 200 tables)
and will not happen on any test databases I set up!

The database is a set of 4.0 tables that 4.1 was
installed to use - I did not recreate the tables inside 4.1
so maybe there is some inconsistency causing the problem?

Sadly I can't do any more testing as doing so involves crashing
all the slaves which are used to run our business. If I can replicate
it then I will give you some more info though. Thanks for looking
at this.
[22 Dec 2004 23:44] Laurent Meyer
Same problem here (4.1.8 / 2 slaves - Win 2003). I will provide the debug log.
[10 Jan 2005 12:52] Guilhem Bichot
Looks like the same bug as http://bugs.mysql.com/bug.php?id=7658
[10 Jan 2005 13:23] Pete French
I am relieved that it isn't just me! Interesting that the other
reported bug is also FreeBSD. Does it help to know that
I am running 4.10-RELEASE and that I compile the code
myself with the following options to configure:

export CC='cc -mcpu=pentiumpro -march=pentiumpro' 
export CXX='c++ -mcpu=pentiumpro -march=pentiumpro'
./configure --without-debug --without-innodb --enable-thread-safe-client

The default compiler is gcc version 2.95.4 20020320
[10 Jan 2005 13:38] Guilhem Bichot
It's OK, we know the cause of the bug (a null pointer dereference); I'm fixing it now.
[10 Jan 2005 18:00] Guilhem Bichot
See BUG#7658, this is now fixed.