Bug #22440 MySQLd crashes with "out of memory" error
Submitted: 18 Sep 2006 11:46
Modified: 28 Feb 2007 8:55
Reporter: Chris Taylor
Status: Closed
Category: MySQL Server
Severity: S1 (Critical)
Version: 5.0.18
OS: Linux (Slackware Linux 10.2)
Assigned to:
CPU Architecture: Any

[18 Sep 2006 11:46] Chris Taylor
Description:
I'm using the binaries from MySQL.com, version "standard-5.0.18-linux-i686-glibc23".

The problem is that mysqld crashes intermittently. I admit that this is the first time I have been bothered enough to file a bug report so I've spent some time looking into this when I hadn't before. I suspect it was the same issue in the past, as our software generates emails when the DB fails and I've seen that the first error email was the result of a query similar to the one below (see the end of the trace).

The server is a P4 3.0GHz (with Hyperthreading) and 2gigs of RAM. Other than MySQL, it runs Lighttpd (1.4.11) and PHP (4.4.2). Primary usage is vBulletin (version 3.5.4 at present) but there are a few other scripts on there too. Nothing has changed in the software but we've had these intermittent crashes for a while. By intermittent, I mean every couple of weeks I guess.

We are planning a full set of upgrades in the very near future, but I figured that I should report this in case it's not a bug that has been fixed. This is my first bug report, so please go easy on me :) I have searched the database and found some similar results, but I was unable to find anything exactly like this. The closest matches seemed to be closed or unresolved (like this one: http://bugs.mysql.com/bug.php?id=19494).

Anyway, here's the stack trace (I followed instructions here: http://dev.mysql.com/doc/refman/5.0/en/using-stack-trace.html).

======================================
START TRACE
======================================

mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=100663296
read_buffer_size=1044480
max_used_connections=72
max_connections=500
threads_connected=72
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1632300 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x9115690
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0xa5ffc5a8, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
0x81536e8 handle_segfault + 356
0xb7f87c85 _end + -1348170571
0x81dd464 _ZN7SEL_ARGC1EP5FieldhPcS2_hhh + 12
0x81e9520 _ZN7SEL_ARG9clone_andEPS_ + 168
0x81e8082 _Z7key_andP7SEL_ARGS0_j + 526
0x81e5f28 _Z8tree_andP13st_qsel_paramP8SEL_TREES2_ + 232
0x81e5aaa _Z16get_func_mm_treeP13st_qsel_paramP9Item_funcP5FieldP4Item11Item_resultb + 418
0x81e3f9c _Z11get_mm_treeP13st_qsel_paramP4Item + 892
0x81e42a4 _Z11get_mm_treeP13st_qsel_paramP4Item + 1668
0x81ddd97 _ZN10SQL_SELECT17test_quick_selectEP3THD6BitmapILj64EEyyb + 1283
0x81a3a23 _Z22get_quick_record_countP3THDP10SQL_SELECTP8st_tablePK6BitmapILj64EEy + 59
0x819aee2 _Z20make_join_statisticsP4JOINP13st_table_listP4ItemP16st_dynamic_array + 2550
0x8191fb6 _ZN4JOIN8optimizeEv + 750
0x8195226 _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_mP13select_resultP18st_select_lex_unitP13st_sel + 130
0x819135e _Z13handle_selectP3THDP6st_lexP13select_resultm + 234
0x8167301 _Z21mysql_execute_commandP3THD + 585
0x816e176 _Z11mysql_parseP3THDPcj + 306
0x8165c3e _Z16dispatch_command19enum_server_commandP3THDPcj + 1178
0x8165769 _Z10do_commandP3THD + 129
0x8164c71 handle_one_connection + 609
0xb7f8254e _end + -1348192898
0xb7eadb8a _end + -1349063750
New value of fp=(nil) failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x9550180 = SELECT
                        thread.threadid
                        FROM thread AS thread
                        INNER JOIN post AS post ON(thread.threadid = post.threadid )
                        WHERE post.postid 
IN(15,70,72,73,74,75,80,81,82,85,97,141,142,143,145,159,160,161,166,169,185,186,190,195,196,197,199,200,202,203,215,217,220,231,233,238,241,242,259,260,281,286,293,299,302,342,343,360,374,375,376,378,379,399,401,407,412,421,423,430,433,438,443,444,449,456,458,472,474,478,479,498,499,506,540,596,597,598,629,634,635,636,639,643,644,646,647,653,656,662,673,674,677,679,680,681,683,695,697,699,711,732,734,736,739,743,747,748,751,752,754,755,770,774,776,777,779,781,782,790,817,827,828,832,838,840,843,844,845,861,864,872,874,882,892,899,900,905,917,922,952,954,955,958,961,984,988,989,990,994,996,1004,1007,1009,1016,1020,1028,1042,1046,1053,1068,1071,1072,1085,1090,1098,1099,1103,1104,1105,1138,1148,1154,1156,1157,1160,1169,1210,1211,1212,1216,1217,1267,1275,1302,1305,1306,1310,1315,1317,1319,1339,1340,1341,1344,1345,1347,1350,1359,1360,1374,1389,1396,1403,1424,1439,1440,1444,1445,1
thd->thread_id=23748

======================================
END TRACE
======================================

If it's relevant, the "post" table is around 690MB and the "thread" table is around 30M. MySQL normally occupies most of the RAM on the box (although much of the usage, typically 1.2GB, will be "cached" in top), but I've never seen more than 100MB or so of swap used (it's set up with 1.5GB available). It generally hovers at around 100-200MB of free RAM, and performance is always satisfactory. CPU usage is light and the load is typically no more than 0.5 even during busy periods.

As I said, I haven't reported a bug before, so if I have missed anything please let me know.

Thanks!

How to repeat:
I'm not yet sure. I have reduced "max_connections" in my.cnf from 500 to 300: if this is a genuine out-of-memory error (rather than a bug), that may solve it (I hope).
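
As a rough check on what that change does to the estimate in the trace, here is a minimal sketch of the formula mysqld prints (key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections). The trace does not show sort_buffer_size, so the value below is an assumption, chosen because it reproduces the 1632300 K total reported for max_connections=500. The function name is made up for illustration.

```python
# Upper-bound memory estimate, using the formula printed in the crash report:
#   key_buffer_size + (read_buffer_size + sort_buffer_size) * max_connections

KEY_BUFFER_SIZE = 100663296   # bytes, taken from the trace
READ_BUFFER_SIZE = 1044480    # bytes, taken from the trace
SORT_BUFFER_SIZE = 2097144    # bytes; ASSUMPTION, not shown in the trace


def worst_case_kbytes(max_connections):
    """Return the estimate in K bytes (1 K = 1024 bytes), rounded down."""
    per_connection = READ_BUFFER_SIZE + SORT_BUFFER_SIZE
    total_bytes = KEY_BUFFER_SIZE + per_connection * max_connections
    return total_bytes // 1024


for connections in (500, 300):
    print(connections, "connections:", worst_case_kbytes(connections), "K")
```

Under that assumption, the estimate drops from 1632300 K at 500 connections to 1018701 K at 300. This is only the rough bound mysqld itself reports, not a measurement of actual usage.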

Otherwise, I may have to wait some time for it to crash again. Queries similar to the one that failed are run by vBulletin all the time though, so I'm at a loss to say what caused this.
[18 Sep 2006 12:52] Valeriy Kravchuk
Thank you for the problem report. Please send the contents of your my.cnf file, and try to repeat the problem with a newer version, 5.0.24a. Your 5.0.18 is quite old, and you may be affected by bugs that have already been fixed.
[18 Sep 2006 15:54] Chris Taylor
I've attached it using the Files section: http://bugs.mysql.com/file.php?id=4363

Much of it is the same as the defaults - it's based on my-medium if memory serves. The altered values are the result of recommendations from other vB users and some tuning over time.

I'll try to upgrade MySQL (and PHP, I presume?) within the next few days or so.
[11 Oct 2006 11:40] Chris Taylor
Just a quick update on this...

I left the config alone since the original bug report (although, as mentioned, I did reduce the memory usage before I wrote the report) and the error has not occurred again.

I've just upgraded the server to 5.0.26 and I'll report back if we get something like this again.

Thanks,
Chris
[11 Oct 2006 13:36] Valeriy Kravchuk
Please let us know about any results with 5.0.26.
[12 Nov 2006 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[28 Feb 2007 8:55] Chris Taylor
I received an email asking for more feedback on this bug.

All I can report is that MySQL seems more stable than it's ever been on our server. We get *very* few problems, perhaps one "lost connection to database" email a week from vBulletin now - where at times we used to see 10+ a day - and the load on the server only continues to increase.

I can only assume that whatever caused the initial bug report was fixed between 5.0.18 and 5.0.27 (which we're currently running), so I'm going to mark this bug as closed.

Thanks for all of your help, and keep up the good work - you've got a really first-class product here :)