MySQL Bugs: #20132: Slave SQL completely crashes during replication from master SQL server

Bug #20132	Slave SQL completely crashes during replication from master SQL server
Submitted:	30 May 2006 1:07	Modified:	23 Jul 2006 9:03
Reporter:	Pavol Luptak	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Server	Severity:	S2 (Serious)
Version:	5.1.7-beta-log	OS:	Linux (Gentoo Linux)
Assigned to:		CPU Architecture:	Any

Description:
I use the latest MySQL 5.1.7-beta from the Gentoo Linux distribution.

I use InnoDB and MyISAM tables and the "replicate-do-db" directive (see the [mysqld] section at the end).

Description of bug:

I make a snapshot of my two databases on the master server, copy them to /var/lib/mysql/db1 and /var/lib/mysql/db2, and set the replication parameters:

CHANGE MASTER TO MASTER_HOST='host', MASTER_USER='user', MASTER_PASSWORD='password', MASTER_LOG_FILE='mysql-bin.XXXXX', MASTER_LOG_POS=position;

SLAVE START;  

Now replication starts. After a few minutes the slave MySQL server crashes completely with this error:

060530  2:51:49 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.001506' at position 56255150, relay log './server1-relay-bin.000006' position: 56631737
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=1
max_connections=100
threads_connected=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8b72df8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
frame pointer (ebp) is NULL, did you compile with
-fomit-frame-pointer? Aborting backtrace!
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x8b7ba49 = insert into tblOrgEvent( 
                                org_event_from, org_event_to,
                                org_event_title, org_event_body, 
                                org_event_security, org_event_type,
                                org_event_up_id,
                                org_event_repeat_period,
                                org_event_repeat_count
                        ) values(
                                '2006-05-24 13:00', '2006-05-24 17:00', 
                                'Dovolenka Moussaova (1/2 den)','',
                                'all','event',
                                 NULL,
                                 '0',
                                 '0'
                        )
thd->thread_id=5
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
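
For reference, the memory estimate printed in the crash log can be reproduced from the formula it quotes. This is a minimal Python sketch, not anything MySQL ships. key_buffer_size, read_buffer_size and max_connections come from the log; sort_buffer_size is not printed there, so the nominal 512K from the [mysqld] section is an assumption, and the function name is made up for illustration.

```python
# Reproduce the estimate quoted in the crash log:
#   key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
# sort_buffer_size is NOT printed in the log; 512K is assumed from the config.

def max_memory_kib(key_buffer_size, read_buffer_size, sort_buffer_size, max_connections):
    """Return the worst-case estimate in whole KiB."""
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections
    return total_bytes // 1024

estimate = max_memory_kib(
    key_buffer_size=16777216,     # from the log
    read_buffer_size=258048,      # from the log
    sort_buffer_size=512 * 1024,  # assumed nominal value
    max_connections=100,          # from the log
)
print(estimate, "K")
```

With the nominal 512K this gives 92784 K, one more than the 92783 K in the log. Presumably the server's effective sort_buffer_size is slightly below 512K, in the same way that read_buffer_size is reported as 258048 rather than 262144.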

This worked (with this version of MySQL) without problems before.
I also ran "mysqlcheck -r" on both databases (on master & slave server), but it didn't help.

My [mysqld] section looks like:

[mysqld]
skip-slave-start
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
log-error       = /var/log/mysql/mysqld.err
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
language        = /usr/share/mysql/english
default-character-set   = utf8
character-set-server            = utf8
default-collation = utf8_slovak_ci
init-connect="SET NAMES cp1250"
skip-locking
key_buffer=16M
max_allowed_packet=1M
table_cache                             = 64
sort_buffer_size                        = 512K
net_buffer_length                       = 8K
read_buffer_size                        = 256K
read_rnd_buffer_size            = 512K
myisam_sort_buffer_size         = 8M

query_cache_limit       = 1M
query_cache_size        = 64M
query_cache_type        = 1

ft_min_word_len         = 3
low_priority_updates    = 1
long_query_time         = 5
log-slow-queries        = /var/log/mysql/mysql-slow.log
server-id               = 3
master-host             = 127.0.0.1
master-port             = 3307
master-user             = repl
master-password         = 4xuhl60KudvgKeFi
log-slave-updates
warnings
log-bin                 = /var/log/mysql/mysql-bin.log
relay-log               = server1-relay-bin
max_binlog_size         = 104857600
replicate-rewrite-db    = "db1->db1_replication"
replicate-rewrite-db    = "db2->db2_replication"
replicate-do-db         = db1_replication
replicate-do-db         = db2_replication
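
As a rough illustration of how the last four directives interact: according to the MySQL manual, the slave translates the default database of a statement (the one selected by USE) with replicate-rewrite-db first, and only then tests the result against the replicate-do-db rules. The Python sketch below models that order only. It is not the server's code, and the helper name is hypothetical.

```python
# Model of the filter order: rewrite the default database first,
# then test the rewritten name against replicate-do-db.
# Values are taken from the [mysqld] section above.

REWRITE_DB = {"db1": "db1_replication", "db2": "db2_replication"}
DO_DB = {"db1_replication", "db2_replication"}

def slave_executes(default_db):  # hypothetical helper, for illustration only
    """Return (is the statement executed?, database name used on the slave)."""
    rewritten = REWRITE_DB.get(default_db, default_db)
    return rewritten in DO_DB, rewritten

for db in ("db1", "db2", "mysql"):
    print(db, "->", slave_executes(db))
```

Under this model, statements issued with db1 or db2 as the default database are applied to db1_replication and db2_replication, and statements with any other default database are skipped.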

How to repeat:
I am able to repeat this bug only with my specific database.

The table tblOrgEvent, which (maybe) causes the above-mentioned crash, is an InnoDB table with the following structure:

+-------------------------+----------------------+------+-----+---------------------+----------------+
| Field                   | Type                 | Null | Key | Default             | Extra          |
+-------------------------+----------------------+------+-----+---------------------+----------------+
| org_event_id            | int(11)              |      | PRI | NULL                | auto_increment |
| org_event_from          | datetime             |      |     | 0000-00-00 00:00:00 |                |
| org_event_to            | datetime             |      |     | 0000-00-00 00:00:00 |                |
| org_event_title         | varchar(100)         |      |     |                     |                |
| org_event_body          | mediumtext           |      |     |                     |                |
| org_event_security      | enum('all','owner')  |      |     | all                 |                |
| org_event_type          | enum('event','note') |      |     | event               |                |
| org_event_repeat_period | int(11)              |      |     | 0                   |                |
| org_event_repeat_count  | int(11)              |      |     | 0                   |                |
| org_event_up_id         | int(11)              | YES  |     | NULL                |                |
+-------------------------+----------------------+------+-----+---------------------+----------------+
10 rows in set (0.02 sec)

Thank you for the problem report. Do you have any triggers on the tblOrgEvent table? Any foreign keys that reference it? Can you try to repeat this with a newer version of MySQL server, 5.1.9?

Dear Valeriy,
I have no triggers on the tblOrgEvent table and no foreign keys that reference it. The master SQL server is running on Debian/Stable (4.1.11-Debian_4sarge3-log). Because it is a highly critical production environment, I do not dare to switch it to version 5.x.

On the slave SQL server I have tried:

MySQL 5.1.7-beta from Gentoo Linux distribution 
MySQL 5.1.9-beta (static binary) from http://dev.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.9-beta-linux-i686.tar.gz/from/http:...
MySQL 5.0.21 from Gentoo Linux distribution

All these versions crashed with the error:

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=258048
max_used_connections=2
max_connections=100
threads_connected=2
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 92783 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x8c245f8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x46f4d724, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81d31a8
0xffffe420
0x837dbfd
0x837dbfd
0x8311ec0
0x83123f8
0x8316521
0x8314533
0x82b9e85
0x82b9682
0x82c2084
0x8224138
0x822570f
0x82213c1
0x81e93ec
Stack trace seems successful - bottom reached
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow 
instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do 
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x8c2bc30 = SELECT *,
                    `TABLE_SCHEMA`       AS `Db`,
                    `TABLE_NAME`         AS `Name`,
                    `ENGINE`             AS `Engine`,
                    `ENGINE`             AS `Type`,
                    `VERSION`            AS `Version`,
                    `ROW_FORMAT`         AS `Row_format`,
                    `TABLE_ROWS`         AS `Rows`,
                    `AVG_ROW_LENGTH`     AS `Avg_row_length`,
                    `DATA_LENGTH`        AS `Data_length`,
                    `MAX_DATA_LENGTH`    AS `Max_data_length`,
                    `INDEX_LENGTH`       AS `Index_length`,
                    `DATA_FREE`          AS `Data_free`,
                    `AUTO_INCREMENT`     AS `Auto_increment`,
                    `CREATE_TIME`        AS `Create_time`,
                    `UPDATE_TIME`        AS `Update_time`,
                    `CHECK_TIME`         AS `Check_time`,
                    `TABLE_COLLATION`    AS `Collation`,
                    `CHECKSUM`           AS `
thd->thread_id=2
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Number of processes running now: 0
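
The log above asks for the stack trace to be resolved. Before that can be done, the numeric addresses have to be separated from the rest of the error log. A minimal Python sketch of that step follows; the function name and the shortened sample text are illustrative only. The extracted addresses would then be saved to a file and resolved with resolve_stack_dump against a symbol file for the exact mysqld binary that crashed, as the manual page describes.

```python
import re

def extract_stack_addresses(log_text):
    """Return the hex addresses listed between the 'backtrace follows' and
    'Stack trace seems successful' lines of a mysqld crash log."""
    addresses = []
    in_trace = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if "backtrace follows" in stripped:
            in_trace = True
            continue
        if stripped.startswith("Stack trace seems successful"):
            break
        # Keep only lines that consist of a single hex address.
        if in_trace and re.fullmatch(r"0x[0-9a-fA-F]+", stripped):
            addresses.append(stripped)
    return addresses

# Shortened excerpt of the log above, for demonstration.
sample = """Stack range sanity check OK, backtrace follows:
0x81d31a8
0xffffe420
0x837dbfd
Stack trace seems successful - bottom reached
"""
print("\n".join(extract_stack_addresses(sample)))
```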

The only difference between MySQL 5.0.21, MySQL 5.1.7-beta and MySQL 5.1.9-beta is that MySQL 5.1.9-beta is restarted after the crash occurs.

I have checked all replicated tables on the master SQL server (using mysqlcheck) and they are 100% OK.

The communication between the master and the slave is encapsulated in an SSL tunnel (because it passes through the Internet), but I have used it for years without problems.

Sorry, but the statement that crashed in your last comment is a SELECT (from an INFORMATION_SCHEMA table, obviously), while initially it was an INSERT. These can be two different bugs. Please try to repeat this with a newer version on the slave, 5.1.11, and try to identify the crashing statement (and whether it is always the same).

Please send the SHOW TABLE STATUS results for the tblOrgEvent table used in the initial description. Was this table simply copied from your master (not dumped and restored)?

Hi Valeriy, this is the SHOW TABLE STATUS output for my InnoDB tables:

| tblOrgContact                                | InnoDB |       9 | Dynamic    |        8 |           2048 |       16384 |            NULL |            0 |         0 |              9 | 2005-12-02 10:31:38 | NULL                | NULL                | utf8_slovak_ci    |     NULL |                | InnoDB free: 6144 kB   |
| tblOrgEvent                                  | InnoDB |       9 | Dynamic    |     1594 |            164 |      262144 |            NULL |            0 |         0 |           2210 | 2006-05-31 10:02:07 | NULL                | NULL                | utf8_slovak_ci    |     NULL |                | InnoDB free: 6144 kB   |
| tblOrgEventMember                            | InnoDB |       9 | Fixed      |     3901 |             71 |      278528 |            NULL |       180224 |         0 |           NULL | 2005-12-13 14:38:18 | NULL                | NULL                | utf8_slovak_ci    |     NULL |                | InnoDB free: 

Yes, these tables were simply copied from my master server (according to http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html). I created a snapshot of the master DB while all tables were read-locked using "FLUSH TABLES WITH READ LOCK;".

Is there any difference in the binary structure of InnoDB tables between MySQL 4.1 and 5.0? Should I use mysqldump and restore instead of creating the above-mentioned snapshot?

> Is there any difference in the binary structure of InnoDB tables between MySQL 4.1 and 5.0?

Yes.

> Should I use mysqldump and restore instead of creating the above-mentioned snapshot?

Yes, in general. Please answer my question in the related bug #20275.
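
As a rough sketch of the dump-and-restore route, the Python snippet below only assembles a mysqldump command line for the two databases; it does not run anything. The function name and the user name are hypothetical, and the choice of options is an assumption rather than something prescribed in this report: --master-data=2 asks mysqldump to write the binary log coordinates into the dump as a comment, and --password without a value makes the client prompt for it.

```python
import shlex

def build_dump_command(databases, user="dump_user"):  # names are illustrative
    """Build a mysqldump argument list for a logical snapshot of the master."""
    return [
        "mysqldump",
        "--user=" + user,
        "--password",        # no value: mysqldump prompts for the password
        "--master-data=2",   # record binary log file and position as a comment
        "--databases",
        *databases,
    ]

cmd = build_dump_command(["db1", "db2"])
# Print a shell-safe version of the command for review before running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting SQL file would be loaded on the slave with the mysql client, and the recorded log file and position would be used in CHANGE MASTER TO instead of values noted by hand.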

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".