Bug #33820 mysqld debian amd 64bit random crashes - reproducible
Submitted: 11 Jan 2008 15:59 Modified: 15 Jan 2008 15:50
Reporter: Vassilis Virvilis Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:5.0.45 OS:Linux
Assigned to:

[11 Jan 2008 15:59] Vassilis Virvilis
Description:
Halo,

I am running debian - stable in AMDx4 64bit.
The mysql server is version 5.0.45 from backports but I can reproduce it
with the debian stable original mysql 5.0.26

I have random crashes with our custom software at 2 or 3 distinct places.
I am able to reproduce it always in 30m but it is also reproducible
if I do a CHECK all tables from mysqlcc

Please note that I first run myismachk in our tables without any problem

The problem looks like the block structure is poisoned/garbage in all cases
I have seen so far (2).

Please if you need more info don't hesitate to ask. As I said It is always
reproducible under 30min

Crash Info:
-----------

I am also attaching the my.cnf file

greek:/usr/src/mysql-dfsg-5.0-5.0.45# uname -a
Linux greek 2.6.18-5-amd64 #1 SMP Sat Dec 22 20:43:59 UTC 2007 x86_64 GNU/Linux
greek:/usr/src/mysql-dfsg-5.0-5.0.45# free
             total       used       free     shared    buffers     cached
Mem:      28667304   28598604      68700          0     104004   23867184
-/+ buffers/cache:    4627416   24039888
Swap:      9679152       1856    9677296

GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Using host libthread_db library "/lib/libthread_db.so.1".
Source directories searched: /usr/src/mysql-dfsg-5.0-5.0.45/mysys:/usr/src/mysql-dfsg-5.0-5.0.45/sql:$cdir:$cwd
[Thread debugging using libthread_db enabled]
[New Thread 46985537111664 (LWP 10521)]
[New Thread 1082132832 (LWP 10524)]
[Thread 1082132832 (zombie) exited]
[New Thread 1082399072 (LWP 10525)]
080111 17:00:25 [Note] /usr/src/mysql-dfsg-5.0-5.0.45/sql/mysqld: ready for connections.
Version: '5.0.45-5'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)
[New Thread 1082665312 (LWP 10526)]
[New Thread 1082931552 (LWP 10527)]
[New Thread 1083197792 (LWP 10660)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1082931552 (LWP 10527)]
0x000000000084f2cf in unlink_block (keycache=0xc98900, block=0x2aab55802010) at mf_keycache.c:1013
1013        block->next_used->prev_used= block->prev_used;
Current language:  auto; currently c
(gdb)
(gdb) print *block
$1 = {next_used = 0x89de360001eb03, prev_used = 0x2aab6749f5d0, next_changed = 0x6c4da11200008ade, prev_changed = 0x1200008bde360001,
  hash_link = 0x8cde360001bc4da1, wqueue = {{last_thread = 0x1144ea1120000}, {last_thread = 0x4ea11200008dde36}}, requests = 905970024,
  buffer = 0xde360001c04ea112 <Address 0xde360001c04ea112 out of bounds>, offset = 301990031, length = 18370465, status = 2430481920,
  temperature = 2702311424, hits_left = 94287, last_hit_time = 3891112037889151488, condvar = 0x1c50a112000092de}
(gdb) print *block->wqueue
$2 = {last_thread = 0x1144ea1120000}
(gdb) print block->wqueue[0]
$3 = {last_thread = 0x1144ea1120000}
(gdb) print *block->next_used
Cannot access memory at address 0x89de360001eb03
(gdb) print *block->prev_used
$4 = (struct st_block_link *) 0x2aab55802010
(gdb) print block->prev_used
$5 = (struct st_block_link **) 0x2aab6749f5d0
(gdb) print *(block->prev_used)
$6 = (struct st_block_link *) 0x2aab55802010
(gdb) print *(*(block->prev_used))
$7 = {next_used = 0x89de360001eb03, prev_used = 0x2aab6749f5d0, next_changed = 0x6c4da11200008ade, prev_changed = 0x1200008bde360001,
  hash_link = 0x8cde360001bc4da1, wqueue = {{last_thread = 0x1144ea1120000}, {last_thread = 0x4ea11200008dde36}}, requests = 905970024,
  buffer = 0xde360001c04ea112 <Address 0xde360001c04ea112 out of bounds>, offset = 301990031, length = 18370465, status = 2430481920,
  temperature = 2702311424, hits_left = 94287, last_hit_time = 3891112037889151488, condvar = 0x1c50a112000092de}
(gdb) print block
$8 = (BLOCK_LINK *) 0x2aab55802010
(gdb) bt
#0  0x000000000084f2cf in unlink_block (keycache=0xc98900, block=0x2aab55802010) at mf_keycache.c:1013
#1  0x000000000084f4bb in unreg_request (keycache=0xc98900, block=0x2aab55802010, at_end=1) at mf_keycache.c:1102
#2  0x00000000008502f1 in key_cache_read (keycache=0xc98900, file=80, filepos=40714240, level=3, buff=0x408b6660 "\003οΏ½\001",
    length=1024, block_length=1024, return_buffer=0) at mf_keycache.c:1879
#3  0x000000000081f0c8 in _mi_fetch_keypage (info=0xd6b650, keyinfo=0xd2e0b8, page=40714240, level=1, buff=0xc989b0 "\001",
    return_buffer=<value optimized out>) at mi_page.c:31
#4  0x0000000000833a75 in chk_index_down (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=40714240, buff=0x408b6660 "\003οΏ½\001",
    keys=0x408b8378, key_checksum=0x408b88e8, level=4) at mi_check.c:623
#5  0x00000000008337dd in chk_index (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=40747008, buff=0x408b7000 "\203οΏ½",
    keys=0x408b8378, key_checksum=0x408b88e8, level=3) at mi_check.c:785
#6  0x0000000000833ac0 in chk_index_down (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=40747008, buff=0x408b7000 "\203οΏ½",
    keys=0x408b8378, key_checksum=0x408b88e8, level=3) at mi_check.c:630
#7  0x00000000008337dd in chk_index (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=43574272, buff=0x408b79a0 "\203οΏ½",
    keys=0x408b8378, key_checksum=0x408b88e8, level=2) at mi_check.c:785
#8  0x0000000000833ac0 in chk_index_down (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=43574272, buff=0x408b79a0 "\203οΏ½",
    keys=0x408b8378, key_checksum=0x408b88e8, level=2) at mi_check.c:630
#9  0x00000000008337dd in chk_index (param=0x408b83f0, info=0xd6b650, keyinfo=0xd2e0b8, page=91249664, buff=0xd6b980 "\201οΏ½",
    keys=0x408b8378, key_checksum=0x408b88e8, level=1) at mi_check.c:785
#10 0x00000000008388c5 in chk_key (param=0x408b83f0, info=0xd6b650) at mi_check.c:477
#11 0x000000000065d543 in ha_myisam::check (this=0xd7d110, thd=0xceabc0, check_opt=<value optimized out>) at ha_myisam.cc:703
#12 0x00000000006580aa in handler::ha_check (this=0xd7d110, thd=0xceabc0, check_opt=0xcec998) at handler.cc:2140
#13 0x000000000068227c in mysql_admin_table (thd=0xceabc0, tables=<value optimized out>, check_opt=0xcec998,
    operator_name=0x90b921 "check", lock_type=TL_READ_NO_INSERT, open_for_modify=false, no_warnings_for_error=true, extra_open_options=0,
    prepare_func=0, operator_func={__pfn = 0x658060 <handler::ha_check(THD*, st_ha_check_opt*)>, __delta = 0},
    view_operator_func=0x6b2370 <view_checksum(THD*, st_table_list*)>) at sql_table.cc:2389
#14 0x0000000000682bac in mysql_check_table (thd=0xc98900, tables=0x2aab55802010, check_opt=0x89de360001eb03) at sql_table.cc:2919
#15 0x00000000005bcc00 in mysql_execute_command (thd=0xceabc0) at sql_parse.cc:3398
#16 0x00000000005bfee2 in mysql_parse (thd=0xceabc0,
    inBuf=0xd5ef50 "CHECK TABLE\n`AdverseReactions`, `AdverseReactionsDrugs`, `AdverseReactionsReact`, `CustomTargets`, `EntityTokens`, `FreetextRequests`, `FreetextTrace`, `JournalNames`, `Labels`, `TargetClasses`, `acce"..., length=1061, found_semicolon=0x408c2f28)
    at sql_parse.cc:6097
#17 0x00000000005c03ab in dispatch_command (command=COM_QUERY, thd=0xceabc0, packet=0xd1b0c5 "\033", packet_length=<value optimized out>)
    at sql_parse.cc:1812
#18 0x00000000005c14f3 in do_command (thd=0xceabc0) at sql_parse.cc:1586
#19 0x00000000005c1ff3 in handle_one_connection (arg=<value optimized out>) at sql_parse.cc:1197
#20 0x00002abbab9d5f1a in start_thread () from /lib/libpthread.so.0
#21 0x00002abbac2885d2 in clone () from /lib/libc.so.6
#22 0x0000000000000000 in ?? ()
(gdb) -                                                                                                                                                                                                                         

How to repeat:
I have random crashes with our custom software at 2 or 3 distinct places.

I am able to reproduce it always in under 30m but it is also reproducible
if I do a CHECK all tables from mysqlcc
[12 Jan 2008 9:03] Shane Bester
Crash in keycache code is rare.  Do you have a key_buffer_size larger than 4GB ?  That would be a possible cause.
[14 Jan 2008 8:12] Vassilis Virvilis
That looks spot on.

I had a key_buffer of 8GB (40GB here). Now I have changed to 4GB and I cannot reproduce the crash anymore with my simple tests. Right now I am doing some more tests that will take a day or so to complete. I will post here what happened later today or tomorrow.

On the other hand I would be the first to admit that I am not a mysql guru but this kind of behavior really surprised me.

If I am not supposed to set key_buffer above 4G then an error preventing mysqld to start should happen.

If it is ok to set key_buffer > 4G but it is considered dangerous and/or bleeding edge,  a warning would be appreciated.

Finally if this is just a bug (and bugs happen) I am volunteering to help find and fix it by
 -compiling with --debug
 -applying and testing patches
 -any other procedure is within my power to perform

  Thanks for the invaluable help
[14 Jan 2008 8:25] Shane Bester
Hi!  Actually, this bug is already fixed in the newest releases.  Have a look at Bug #5731 for those details.  We'll wait for your feedback tomorrow.
[15 Jan 2008 15:48] Vassilis Virvilis
ok everything works as advertised. I suppose you can mark this bug as a duplicate of 5731 (or should I?).

If I read this bug correctly it will be automatically fixed when/If I upgrade to
5.0.52/5.1.23/5.2.6/6.0.3 but I will wait for the proper debian upgrade.

Thanks for the help
[15 Jan 2008 15:50] Shane Bester
this bug is a duplicate of #bug 5731