MySQL Bugs: #43543: OPTIMIZE command on ~500meg table crashes MySql

Bug #43543	OPTIMIZE command on ~500meg table crashes MySql
Submitted:	10 Mar 2009 19:32	Modified:	24 Oct 2009 16:58
Reporter:	Ben Wallach	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Server	Severity:	S1 (Critical)
Version:	5.0.77-log	OS:	Linux
Assigned to:		CPU Architecture:	Any
Tags:	Optimize, Signal 11

Description:
We just upgraded to Linux fedora 9 with MySql 5.0.51a-log. When attempting to OPTIMIZE a MyIsam table that is around ~500 megs (~300,000 rows), MySql crashes with signal 11.

090309  5:37:06 - mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=402653184
read_buffer_size=2093056
max_used_connections=265
max_connections=2048
threads_connected=21
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 385,008K bytes
of memory
Hope that's ok; if not, decrease some variables in the equation.
thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x6645ef8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x818b124
0x847452a
0x84745cc
0x8474658
0x8475a40
0x845e76c
0x67e32f
0x5b920e
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow
instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.

Did the following:
-------------------
nm -D -n /usr/libexec/mysqld > /tmp/mysqld.sym
resolve_stack_dump -s /tmp/mysqld.sym -n mysqld.stack | c++filt
output:
-------
0x818b124 handle_segfault + 836
0x847452a _my_b_get + 506
0x84745cc _my_b_get + 668
0x8474658 my_b_flush_io_cache + 120
0x8475a40 remove_io_thread + 224
0x845e76c thr_find_all_keys + 156
0x67e32f (?)
0x5b920e (?)

Any clues?

As a workaround to OPTIMIZE, we are now doing this:
CREATE TABLE ABC_TEMP LIKE ABC
INSERT INTO ABC_TEMP SELECT * FROM ABC
DROP TABLE ABC
RENAME TABLE ABC_TEMP TO ABC

Maybe this is a better solution any how?

Thanks in advance!
-Ben

How to repeat:
OPTIMIZE command on large table

additional info: 4gigs of ram, nothing else running on server

Thank you for the report.

But version 5.0.51 is old. Please try with current version 5.0.77 and if problem still exists provide your configuration file and output of SHOW CREATE TABLE and SHOW TABLE STATUS for the problem table.

Thanks for your comment Sveta. We just got this box and running fedora-9. Yum update only brings back Mysql-server 5.0.51a? Are you saying this version is no good? 

Thanks

Thank you for the feedback. What Sveta meant is that since the version you have reported several bugs were reported and fixed, i.e: http://bugs.mysql.com/bug.php?id=32103 was fixed on 5.0.52. You are able to upgrade to the latest version downloading the suitable package from our page. Please read the Manual how to do an upgrade. Please test the new version and comment here your results. Thanks in advance.

Thanks for your feedback. I was hoping that upgrading to a new version would be a last resort. This is not trivial for us as it requires special RPM install (outside of YUM) and downtime.

I was hoping that maybe a setting change could address the issue. For example, current max_connections is set to 2048. Maybe this should be dropped down to ~400?

Thank you

Thank you for the feedback.

But bugs database is for bugs in MySQL software, not for help with resolving user problems. So we need to identify if bug exists in latest released version. Additionally we don't backport bugfixes, so if it is MySQL bug you will have to upgrade anyway.

To make test easier for you and if table is MyISAM you could:

1. Install version 5.0.77 into local directory (use tar.gz package)
2. Create separate data directory, fill it with mysql_install_db
3. Copy problem table (MYI, MYD and frm files) to the new datadir (test database)
4. Start 5.0.77 server with newly created datadir, unique port and socket.
5. Connect to new server and run OPTIMIZE

Thanks..

If I am understanding you correctly, you are saying that since the problem is not in the latest release, there is nothing to do but upgrade. If that is correct, you may want to provide version options in the bug database as there is no point in filing a bug for any release that is not supported..

Anyway, we are going to stick with our workaround for now and upgrade later. I am a little surprised that a fairly recent release behaves badly as such (this is the default latest version in Fedora Core 9). We were on much older version for many years.

Thanks for your support & feedback

Upgraded to 5.0.77 (via YUM) on Fedora Core 9
DB still crashes on "OPTIMIZE" command.

recap:  
- myisam table with ~900,000 rows
- deleted a bunch of rows (~50,000 for example)
- run 'OPTIMZE TABLE TBL_NAME'
- DB crashes
from mysqld.log:
----------------
090917 12:43:00 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=402653184
read_buffer_size=2097152
max_used_connections=327
max_connections=2048
threads_connected=42
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 393216
 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=(nil)
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x3c39ef8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x818f189
0x847f6ea
0x847f78c
0x847f818
0x8480c00
0x846989c
0x67e32f
0x5b920e
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/using-stack-trace.html and follow
instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Thank you for the feedback.

Please provide your configuration file, information about hardware such as how many physical RAM do you have and output of SHOW CREATE TABLE problem_table

Linux 2.6.25.11-97.fc9.i686.PAE (fedora core 9)

4-gigs of ram

Filesystem             Size   Used  Avail Use% Mounted on
/dev/sda3              153G    20G   125G  14% /
/dev/sda1              2.2G    37M   2.0G   2% /tmp
tmpfs                  2.2G      0   2.2G   0% /dev/shm

show_create_table.txt & my.cnf files were just added as well.

Thanks

One thing to keep in mind is that OPTIMIZE crashes various tables (not just the one I included) so I don't think this is related to the specific structure of this table. If there is any significant data that has been removed from a table then running OPTIMIZE afterwards will crash the DB.

Thanks

Thank you for the feedback.

I can not repeat described problem using information provided. Please use debug version of server, turn option --core and try to create core file. Also this can be simple out of resources problem: check your OS log files and/or monitor RAM usage during OPTIMIZE run.

Which OS log files? Is there another log file (other than mysqld.log) that will have more information?

As mentioned earlier, there are 4-megs of ram. MySql is the only thing running on the box. Optimize crashes within 1-2 seconds of invoking it so it is not easy to monitor anything in real time.

What about the stack provided earlier (from mysld.log)? Does that provide any clues?

What about my.cnf? Anything wrong in it?

Also, does OPTIMIZE need a lot of ram? More than 4-gig of ram?

How do we run this in "debug" mode (can you provide step-by-step instructions)? Right now we just do '/etc/init.d/mysqld restart' to cycle the server. This is a production server, so it is not simple for us to bring it down.

One other thing I tried yesterday, is point the tmpdir (in my.cnf) to another partition (not /tmp) with much more storage. Same behavior though.

Thank you for the feedback.

> Which OS log files? Is there another log file (other than mysqld.log) that will have more
information?

I mean system logs which are in /var/log/messages or where your OS keeps them.

> As mentioned earlier, there are 4-megs of ram. MySql is the only thing running on the box.
> Optimize crashes within 1-2 seconds of invoking it so it is not easy to monitor anything
> in real time.
...
> What about the stack provided earlier (from mysld.log)? Does that provide any clues?

Log shows "mysqld got signal 11;" which means "Out of memory" Regarding to stack there is no strong clue yet.

> Also, does OPTIMIZE need a lot of ram? More than 4-gig of ram?

Not and surely it should not crash. But error 11 can mean bad memory also. Or some other program used RAM, then affects mysqld. This is why I asked you to check system log files. Please check your memory also.

> How do we run this in "debug" mode (can you provide step-by-step instructions)? Right now
we just do '/etc/init.d/mysqld restart' to cycle the server. This is a production server,
so it is not simple for us to bring it down.

There is mysqld-debug file in the same directory where mysqld is located. Just rename mysqld to mysqld-rel and mysqld-debug to mysqld, then restart the server. Don't forget to place option core into configuration file and allow mysqld to create core files on OS level.

* There is nothing in /var/log/messages

* There is no mysqld-debug on our system:
[root@GGfd1 /]# find . -name "mysqld*"
./etc/rc.d/init.d/mysqld
./tmp/mysqld.sym
./tmp/mysqld.stack
./usr/bin/mysqldump
./usr/bin/mysqld_multi
./usr/bin/mysqldumpslow
./usr/bin/mysqld_safe
./usr/libexec/mysqld
./usr/include/mysql/mysqld_ername.h
./usr/include/mysql/mysqld_error.h
./usr/share/man/man1/mysqldump.1.gz
./usr/share/man/man1/mysqld_multi.1.gz
./usr/share/man/man1/mysqld_safe.1.gz
./usr/share/man/man1/mysqldumpslow.1.gz
./usr/share/man/man8/mysqld.8.gz
./var/lock/subsys/mysqld
./var/run/mysqld
./var/run/mysqld/mysqld.pid
./var/log/mysqld.log

Thank you for the feedback.

Which exact 5.0.77 distribution do you run? (Filename of the package you downloaded)

This is Fedora Core 9 which uses YUM utility to install/update packages.

$rpm -qa | grep mysql
mysql-server-5.0.77-1.fc9.i386
php-mysql-5.2.6-2.fc9.i386
mysql-libs-5.0.77-1.fc9.i386
mysql-devel-5.0.77-1.fc9.i386
mysql-5.0.77-1.fc9.i386

Thank you for the feedback.

Please try one of our packages available at http://dev.mysql.com/downloads: this can be bug in Fedora package.

Bug in this version as well?
But we just upgraded from 5.0.51 (where the same behavior was present).
Does not seem likely that both versions are buggy. 
This is a major recent distribution widely used.

Thank you for the feedback.

We don't support Fedora packages. Also we don't backport bugfixes. We could not repeat this bug with our packages of version 5.0.77.

So we should be sure bug exists in our current packages of version 5.0.86. This is why I ask you to check.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".