Description:
I have been fighting with this issue for quite some time now.
Every so often, at least once per day MySQL will crash. It seems to come back up
ok, but the database tables have to be repaired each time as some minor damage happens each time the DB crashes.
This is a stock CentOS 5 mysql package, which is based on MySQL 5.0.22. I have not made any changes to the default RPM shipped with CentOS. FYI, I have also tried running with a similar configuration using 5.1.x with the same crashes.
Both database servers are Sun Fire V40z servers with a pair of AMD Opteron 885 dual core processors and 8GB of RAM. My databases and logs are all on SAN-attached storage, in the case of these databases that is a Hitachi 9570V box with about 4TB of storage. Both systems are using ext3 filesystems on LVM volumes on the SAN LUNs.
I had previously been running in a master-master cross-replication setup with each master being a master of the other and a slave of the other. This doesn't seem to be the problem. I have disabled the replication and turned one system off and the problem persists.
Last night I completely rebuilt all of the databases, doing mysqldumps to file, wiping out all of the databases and rebuilding from the dump file. This doesn't seem to have helped.
Nothing seems to be causing it. I have turned on query logging to see if I can track down which query might be killing mysqld.
Here is what shows up in the error log:
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=536870912
read_buffer_size=8384512
max_used_connections=38
max_connections=200
threads_connected=38
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 3800286 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Number of processes running now: 0
070620 13:18:10 mysqld restarted
070620 13:18:10 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
070620 13:18:10 InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 0 174234649.
InnoDB: Doing recovery: scanned up to log sequence number 0 174234649
InnoDB: Last MySQL binlog file position 0 21600151, file name /san/lun02/log/mysql-bin.000015
070620 13:18:11 InnoDB: Started; log sequence number 0 174234649
070620 13:18:11 [Note] Recovering after a crash using /san/lun02/log/mysql-bin
070620 13:18:11 [Note] Starting crash recovery...
070620 13:18:11 [Note] Crash recovery finished.
070620 13:18:13 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.0.22-log' socket: '/san/lun01/mysql/mysql.sock' port: 3306 Source distribution
Here is my my.cnf:
[mysqld]
port = 3306
datadir = /san/lun01/mysql
socket = /san/lun01/mysql/mysql.sock
pid-file = /san/lun02/log/mysql.pid
key_buffer = 512M
max_allowed_packet = 16M
table_cache = 2048
sort_buffer_size = 8M
join_buffer_size = 8M
read_buffer_size = 8M
read_rnd_buffer_size = 32M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 64M
query_cache_limit = 2M
binlog_cache_size = 1M
max_heap_table_size = 64M
ft_min_word_len = 2
default_table_type = MyISAM
thread_stack = 192K
tmp_table_size = 64M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8
back_log = 50
max_connections = 200
memlock
#InnoDB
innodb
innodb_data_home_dir = /san/lun01/ibdata
innodb_log_group_home_dir = /san/lun01/ibdata
innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 2G
innodb_data_file_path = ibdata1:10M:autoextend
innodb_file_io_threads = 4
innodb_thread_concurrency = 16
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 120
innodb_fast_shutdown=2
# Clustering
#ndbcluster
server-id = 2
log-bin = /san/lun02/log/mysql-bin
log-bin-index = /san/lun02/log/mysql-bin.index
#log-output = TABLE,FILE
#log = /san/lun02/log/hermes-query.log
log-error = /san/lun02/log/hermes-error.log
log-slow-queries = /san/lun02/log/hermes-slow.log
long_query_time = 2
log_long_format
log_warnings
relay-log = /san/lun02/log/relay-log
relay-log-info-file = /san/lun02/log/relay-log.info
relay-log-index = /san/lun02/log/relay-log.index
master-info-file = /san/lun02/log/master.info
auto_increment_increment = 5
auto_increment_offset = 2
report-host = hermes
master-host = 192.168.10.20
master-user = replicate
master-password = <replication password>
replicate-same-server-id = 0
log-slave-updates
############################################################################
[mysqld_safe]
pid-file = /san/lun02/log/mysql.pid
err-log = /san/lun02/log/hermes-error.log
############################################################################
[mysql.server]
user=mysql
basedir=/san/lun02
############################################################################
[mysqldump]
quick
max_allowed_packet = 16M
############################################################################
[mysql]
no-auto-rehash
############################################################################
# The following options will be passed to all MySQL clients
[client]
port = 3306
socket = /san/lun01/mysql/mysql.sock
############################################################################
[isamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
############################################################################
[myisamchk]
key_buffer = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M
############################################################################
[mysqlhotcopy]
interactive-timeout
############################################################################
How to repeat:
Wait a few hours and watch it crash again.