Bug #7658 optimize crashes slave thread (1 in 1000)
Submitted: 4 Jan 2005 13:00 Modified: 10 Jan 2005 18:00
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:4.1.7 and 4.0.22 OS:FreeBSD (freebsd 4.8 and freebsd 4.10)
Assigned to: Guilhem Bichot

[4 Jan 2005 13:00] Martin Friebe
I am experiencing the following bug: a MySQL replication slave crashes while executing an OPTIMIZE TABLE statement from the replication log. This happens roughly once in every 1000 OPTIMIZE statements.

The following setup applies:
- both slaves have the same master, a MySQL 4.1.7 server
- 1st slave: MySQL 4.0.22 running on FreeBSD 4.8
 this slave was running fault-free against a MySQL 4.0.20 server (as OPTIMIZE was not replicated)
- 2nd slave: MySQL 4.1.7 on FreeBSD 4.10
- both slaves are built from FreeBSD ports, with PTHREADS and optimization, non-static
- both show no problems when running an OPTIMIZE stress test outside the slave thread
- both slaves run on a dual Xeon CPU server with SMP (not sure if that has anything to do with the problem)
- see attached my.cnf and debug log (debug log is for
 mysql  Ver 12.22 Distrib 4.0.18, for portbld-freebsd4.8 (i386)
 built from ports)

- the table to be optimized can already be optimized, and does not have to contain much data or have any specific structure

There is a slight chance that the problem is related to FreeBSD, but the replication slaves are running stably, except for OPTIMIZE inside the slave thread.

How to repeat:
Set up replication as above.

Create a random table with some random data.

Start a loop sending 1000+ OPTIMIZE commands.
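
A minimal C++ sketch of the loop described above. It only builds the batch of statements as text, which could then be fed to the mysql command-line client on the master. The function name build_optimize_batch and any table name passed to it are placeholders, not part of the original report.

```cpp
#include <cassert>
#include <string>

// Build a batch of `count` OPTIMIZE TABLE statements for one table.
// `table` is a placeholder name; per the report, any table should do.
std::string build_optimize_batch(const std::string& table, int count) {
    assert(count >= 0);
    std::string sql;
    for (int i = 0; i < count; ++i) {
        // One statement per line, terminated for the mysql client.
        sql += "OPTIMIZE TABLE " + table + ";\n";
    }
    return sql;
}
```

Printing the result of build_optimize_batch("test.t1", 2000) to stdout and piping it into the mysql client on the master should be enough to push more than 1000 OPTIMIZE statements through replication (test.t1 is a hypothetical table name).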

Suggested fix:
[4 Jan 2005 13:01] Martin Friebe
my.cnf for 4.0.22 slave

[5 Jan 2005 17:17] Martin Friebe
Just an update: the crash does not seem to happen if the slave thread is restarted with
  slave stop; slave start
on a regular basis.

This could indicate a resource leak.
[5 Jan 2005 17:39] Martin Friebe
Replication of ANALYZE TABLE also triggers the crash.

OK, I am going to take a guess here:

From the error dump, it dies after a
# sql_base.cc:   251:    5: send_fields: packet_header: Memory: be1ff430  Bytes: (4)
while previous executions log further entries after this one. This entry is written in
# my_net_write

OPTIMIZE and ANALYZE are the only statements in replication that return rows to the client (what is the client in a slave thread? nil?)

Does that help?
[7 Jan 2005 9:24] Guilhem Bichot
Dear Martin,
Thank you very much. I was indeed able to repeat a slave crash. Let's hope I can repeat it again to troubleshoot it. I will keep you posted.
[7 Jan 2005 9:25] Guilhem Bichot
050107 10:18:07 Slave I/O thread: connected to master 'root@localhost:3306',  replication started in log 'FIRST' at position 4
==7753== Thread 13:
==7753== Invalid read of size 4
==7753==    at 0x810D7F0: net_real_write (net_serv.cc:390)
==7753==    by 0x810D734: net_write_buff(st_net*, char const*, unsigned long) (net_serv.cc:343)
==7753==    by 0x810D4E3: my_net_write (net_serv.cc:252)
==7753==    by 0x81A2755: mysql_admin_table(THD*, st_table_list*, st_ha_check_opt*, char const*, thr_lock_type, bool, unsigned, int (*)(THD*, st_table_list*, st_ha_check_opt*), int (handler::*)(THD*, st_ha_check_opt*)) (sql_string.h:64)
==7753==  Address 0x68 is not stack'd, malloc'd or (recently) free'd
mysqld got signal 11;
[10 Jan 2005 18:00] Guilhem Bichot
Additional info:

Fixed in 4.0.24 and 4.1.9 in
ChangeSet@1.2024, 2005-01-10 13:52:32+01:00, guilhem@mysql.com
  Fix for BUG#7658 "optimize crashes slave thread (1 in 1000)":
  mysql_admin_table() attempted to write to a vio which was 0. I could have fixed mysql_admin_table()
  but fixing my_net_write() looked more future-proof.
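
For illustration only, the kind of guard the ChangeSet comment describes can be sketched as follows. This is not the actual patch: FakeVio, FakeNet and guarded_net_write are hypothetical stand-ins for the real structures and for my_net_write(), and the return convention (false on success) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins, NOT the real MySQL Vio / st_net structures.
struct FakeVio {
    std::string sent;  // bytes "written" to the peer
};

struct FakeNet {
    FakeVio* vio;  // null when no client is attached (e.g. a slave SQL thread)
};

// Returns false on success, true on error (convention assumed for this sketch).
// With no vio attached, the write is skipped and reported as success,
// so callers that send result rows need no special case of their own.
bool guarded_net_write(FakeNet* net, const char* packet, std::size_t len) {
    assert(net != nullptr);
    if (net->vio == nullptr) {
        return false;  // nothing to write to; never dereference the null vio
    }
    net->vio->sent.append(packet, len);
    return false;
}
```

Placing the check in the write routine rather than in each caller matches the reasoning in the ChangeSet comment: any other code path that tries to send rows from a thread without a client is covered by the same guard.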