Description:
Over the past 2 days, MySQL on my slave server has died and restarted three times. Below you will find the error log entries for each crash, followed by my my.cnf. This box has been running without any problems for the past month. It is a dual-processor 3.06 GHz Xeon with hyperthreading and 2 GB of memory. I know we aren't supposed to turn on swap, but we did anyway; it's set to 4 GB and has never been used.
Besides being a slave server, this box also acts as a secondary database server for databases that are not needed on the master server.
After the third crash I reinstalled the 4.1.1 RPMs with rpm -F, just to make sure the installation isn't the problem. I haven't restarted MySQL since then, because all of the tables are still being repaired.
Any suggestions would be appreciated. I do have a MySQL support contract.
Thanks.
Donny
********* First time *************
040227 17:50:25 /usr/sbin/mysqld: Normal shutdown
040227 17:50:25 Aborted connection 49 to db: 'mysql' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
040227 17:50:25 Slave I/O thread killed while reading event
040227 17:50:25 Slave I/O thread exiting, read up to log 'parking-master-bin.000096', position 16014498
040227 17:50:25 Aborted connection 112 to db: 'mysql' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
040227 17:50:25 Aborted connection 99 to db: 'mysql' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
040227 17:50:25 mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=536870912
read_buffer_size=2093056
Aborted connection 110 to db: 'mysql' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
max_used_connections=47
max_connections=500
threads_connected=20
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1153068 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=0x86376d0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xbfe1f638, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8089167
0x82da818
040227 17:50:25 Aborted connection 29 to db: 'test' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
0x83014e9
0x8301383
0x82b7046
0x82b7639
0x8094940
0x82d7fcc
0x830b8fa
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) 040227 17:50:25 Aborted connection 73 to db: 'mysql' user: 'phpmyadmin' host: `parking-slave-backend.directnic.com' (Got timeout reading communication packets)
is invalid pointer
thd->thread_id=11
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
040227 17:50:26 Slave SQL thread exiting, replication stopped in log 'parking-master-bin.000094' at position 660174650
Number of processes running now: 0
040227 17:52:28 mysqld restarted
040227 17:52:29 Warning: Asked for 196608 thread stack, but got 126976
/usr/sbin/mysqld: ready for connections.
Version: '4.1.1-alpha-standard-log' socket: '/var/lib/mysql/mysql.sock' port: 3306
********* Second time *************
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=536870912
read_buffer_size=2093056
max_used_connections=136
max_connections=500
threads_connected=113
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1153068 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=0x60c90088
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xbf49f638, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x8089167
0x82da818
0x83014e9
0x8301383
0x82b7046
0x82b7639
0x8094940
0x82d7fcc
0x830b8fa
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) is invalid pointer
thd->thread_id=156
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
Number of processes running now: 0
040227 23:53:15 mysqld restarted
040227 23:53:15 Warning: Asked for 196608 thread stack, but got 126976
/usr/sbin/mysqld: ready for connections.
Version: '4.1.1-alpha-standard-log' socket: '/var/lib/mysql/mysql.sock' port: 3306
********* Third time (this morning) *************
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Number of processes running now: 1
mysqld process hanging, pid 1926 - killed
040228 09:37:21 mysqld restarted
040228 9:37:21 Warning: Asked for 196608 thread stack, but got 126976
/usr/sbin/mysqld: ready for connections.
Version: '4.1.1-alpha-standard-log' socket: '/var/lib/mysql/mysql.sock' port: 3306
********* Resolve Stack Dump *************
I ran resolve_stack_dump on the backtraces from crashes #1 and #2 (the addresses are identical in both) and this is what I got:
0x8089167 handle_segfault + 423
0x82da818 pthread_sighandler + 184
0x83014e9 chunk_free + 297
0x8301383 free + 147
0x82b7046 my_no_flags_free + 22
0x82b7639 free_root + 153
0x8094940 handle_one_connection + 608
0x82d7fcc pthread_start_thread + 220
0x830b8fa thread_start + 4
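
For anyone repeating the resolve step, below is a minimal sketch of how the numeric frame addresses could be pulled out of the error log before handing them to resolve_stack_dump. It is an illustration only: the function name extract_backtrace is hypothetical, the sample text is an abbreviated excerpt of the first crash log, and it assumes each frame sits alone on a line as 0x... between the "backtrace follows" and "terminating stack trace" markers. The filtering matters for crash #1, where an "Aborted connection" line landed in the middle of the trace.

```python
import re

# A frame line is assumed to be a bare hexadecimal address, e.g. 0x8089167.
FRAME = re.compile(r"^0x[0-9a-fA-F]+$")


def extract_backtrace(log_text):
    """Return one list of frame addresses per backtrace found in log_text."""
    traces = []
    current = None  # None means we are not inside a backtrace
    for line in log_text.splitlines():
        line = line.strip()
        if "backtrace follows" in line:
            current = []  # start collecting a new trace
        elif "terminating stack trace" in line and current is not None:
            traces.append(current)
            current = None
        elif current is not None and FRAME.match(line):
            current.append(line)  # interleaved non-address lines are skipped
    return traces


# Abbreviated excerpt of the first crash log (not the full trace).
sample = """\
Stack range sanity check OK, backtrace follows:
0x8089167
0x82da818
040227 17:50:25 Aborted connection 29 to db: 'test' user: 'phpmyadmin'
0x83014e9
New value of fp=(nil) failed sanity check, terminating stack trace!
"""

for trace in extract_backtrace(sample):
    print("\n".join(trace))
```

The printed addresses, one per line, are in the form resolve_stack_dump expects as its numeric dump input.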
********* my.cnf *************
##
## Intercosmos MySQL configuration
## For MySQL 4.x versions
##
##
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
skip-locking
datadir = /www/mysql
log-warnings
# We dont need innodb.
skip-innodb
open-files-limit=36864
set-variable = max_connect_errors=500
set-variable = max_connections=400
ft_min_word_len=3
# Who knows.
set-variable = key_buffer_size=256M
# Used for ORDER BY or GROUP BY operations
#set-variable = sort_buffer=8M
set-variable = sort_buffer_size=16M
set-variable = read_rnd_buffer_size=4M
# table_cache should be max_connections * number of joins.
set-variable = table_cache=1024
#No actual reason.
set-variable = thread_cache_size=8
# Set as big as the largest blob.
set-variable = max_allowed_packet=8M
set-variable = read_buffer_size=2M
# Try number of CPU's*2 for thread_concurrency
set-variable = thread_concurrency=8
set-variable = myisam_sort_buffer_size=64M
[mysqld]
server-id=3
#In theory will kick all connections every hour.
set-variable = interactive_timeout=3600
set-variable = wait_timeout=3600
set-variable = query_cache_size=8M
tmpdir = /tmp
myisam-recover = BACKUP,FORCE
log-long-format
log-slow-queries
set-variable = long_query_time=3
[mysqldump]
quick
set-variable = max_allowed_packet=16M
[mysql]
no-auto-rehash
[myisamchk]
set-variable = key_buffer=256M
set-variable = sort_buffer=256M
set-variable = read_buffer=2M
set-variable = write_buffer=2M
[mysqlhotcopy]
interactive-timeout
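
As a rough check of the memory hint printed in the crash output, the sketch below applies the same formula, key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections, to the values in the my.cnf above. The helper names to_bytes and worst_case_bytes are hypothetical, and the result is only a worst-case upper bound, not a measurement. Note that the crash output reports key_buffer_size=536870912 and max_connections=500, which do not match the 256M and 400 in this file, so the running server may have been using different settings.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(value):
    """Convert a my.cnf size such as '256M' to a number of bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # plain byte count with no suffix


def worst_case_bytes(key_buffer, read_buffer, sort_buffer, max_connections):
    """Upper bound from the formula shown in the mysqld crash output."""
    per_connection = to_bytes(read_buffer) + to_bytes(sort_buffer)
    return to_bytes(key_buffer) + per_connection * max_connections


# Values taken from the [mysqld] section of the my.cnf above.
estimate = worst_case_bytes("256M", "2M", "16M", 400)
print(f"{estimate / 1024**2:.0f} MB")  # prints "7456 MB"
```

With these values the bound comes to 7456 MB, which is more than the 2 GB of RAM plus 4 GB of swap on this box, although it is only reached if every connection allocates its full buffers at the same time.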
How to repeat:
This is the third time it has happened in the past 2 days. I'm repairing the tables for the third time, and with over 50 GB of data each repair takes about 3 hours to complete, so this is getting annoying really fast.