Bug #6110 myisamchk segfaults
Submitted: 15 Oct 2004 3:34
Modified: 1 Dec 2004 21:49
Reporter: Thomas Johnson
Status: Closed
Category: MySQL Server: MyISAM storage engine
Severity: S1 (Critical)
Version: 4.0.21
OS: Linux
Assigned to: Sergei Golubchik
CPU Architecture: Any

[15 Oct 2004 3:34] Thomas Johnson
Description:
myisampack segfaults when I try to pack a huge table (approx 3.14 billion rows). It seems to analyze the first of two indices, then segfault immediately after. I'd be more than happy to provide more info on the crash if you can tell me how.

This problem occurs on both 4.0.20 and 4.0.21. It occurs both with the binaries available from mysql.com and binaries I compiled on Gentoo.

The table has two integer columns and 3 float columns. There is a PRIMARY index on the two int columns, and an index on one of the float columns.

How to repeat:
Run myisampack on my table :-)
I'm not specifying any options to myisampack other than --tmpdir=/x
[15 Oct 2004 14:23] Hartmut Holzgraefe
> I'd be more than happy to provide more info on the crash if you can tell me how.

I've been trying to reproduce this but had to give up due to lack of disk space.

Can you try to produce a gdb backtrace using the myisampack binary from
the debug distribution?
[15 Oct 2004 16:08] MySQL Verification Team
Hi!

Can you tell us more about the OS, CPU and compiler?
[16 Oct 2004 22:51] Thomas Johnson
The OS is Gentoo Linux. The CPU is a P4 2.4GHz with hyperthreading. Regarding the compiler, this happens both when I compile it myself with gcc 3.3.4 and when I download the binaries directly from the mysql website. I'm running the debug binary that I downloaded from mysql.com under gdb right now, and I'll post the bt as soon as it segfaults.
[17 Oct 2004 2:11] Thomas Johnson
Interestingly when I ran it to try to generate the backtrace, it finished without segfaulting. Possibly it's because I ran myisamchk beforehand just to ensure that I had a clean table? In any case, myisamchk -rq failed afterwards with a segfault. Below are gdb logs of both runs. Note the error in the myisampack run.

GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) set args --tmpdir=/bigdrive/mysqltmp -v -v -v ehsCache
(gdb) run
Starting program: /home/hazard/mysql-debug-4.0.21-pc-linux-i686/bin/myisampack --tmpdir=/bigdrive/mysqltmp -v -v -v ehsCache
Compressing ehsCache.MYD: (3140844486 records)
- Calculating statistics
            000
normal:      4  empty-space:       0  empty-zero:         2  empty-fill:   0
pre-space:   0  end-space:         0  intervall-fields:   2  zero:         0
Original trees:  8  After join: 8
error: Huff-tree-length: 518 != calc_length: 6
- Compressing file
Min record length:     10   Max length:     36   Mean total length:     15
46.8%
Remember to run myisamchk -rq on compressed tables

User time 14461.92, System time 1219.81
Maximum resident set size 0, Integral resident set size 0
Non-physical pagefaults 548, Physical pagefaults 61, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 389283, Involuntary context switches 6969266

Program exited normally.
(gdb)

>> Now we run myisamchk:
GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) set args --tmpdir=/bigdrive/mysqltmp -v -v -v -rq ehsCache
(gdb) run
Starting program: /home/hazard/mysql-debug-4.0.21-pc-linux-i686/bin/myisamchk --tmpdir=/bigdrive/mysqltmp -v -v -v -rq ehsCache

Program received signal SIGSEGV, Segmentation fault.
find_longest_bitstream (table=0x815ec94) at mi_packrec.c:396
396     mi_packrec.c: No such file or directory.
        in mi_packrec.c
(gdb) bt
#0  find_longest_bitstream (table=0x815ec94) at mi_packrec.c:396
#1  0x080556b9 in find_longest_bitstream (table=0x8152bb4) at mi_packrec.c:397
#2  0x08055463 in read_huff_table (bit_buff=0xbfffcaa0, decode_tree=0x814cde0,
    decode_table=0xbfffca88, intervall_buff=0xbfffca8c, tmp_buff=0x8152bb4)
    at mi_packrec.c:300
#3  0x08055054 in _mi_read_pack_info (info=0xbfffe30c, fix_keys=1 '\001')
    at mi_packrec.c:216
#4  0x0804c648 in mi_open (name=0xbffff625 "ehsCache", mode=2, open_flags=32)
    at mi_open.c:419
#5  0x08048c66 in myisamchk (param=0x8149180, filename=0xbffff625 "ehsCache")
    at myisamchk.c:740
#6  0x0804827e in main (argc=0, argv=0x814c670) at myisamchk.c:104
[18 Oct 2004 10:24] MySQL Verification Team
How big is the table when bzipped or gzipped?

Try to upload it as a private file to this bug record
[18 Oct 2004 20:16] Thomas Johnson
The data and indices combined are still 33G gzipped. If I send you just the MYD file (that's just the data right?), would that be sufficient or do you need the index too?
[18 Oct 2004 20:18] Thomas Johnson
Do you really think it's something to do with the data or the table itself? I can send you the output of DESCRIBE TABLE and you could generate a similar table filled with junk data and see if you can reproduce the bug on that. 

Tom
[19 Oct 2004 12:55] MySQL Verification Team
It is most probably the data itself that causes the problem.

It is enough to send the .MYD and .frm files.

How big are those two when gzipped?

Try to upload them.
[19 Oct 2004 15:59] Sergei Golubchik
According to the backtrace, myisamchk crashes when it tries to open the table.
The problem must be somewhere in the header.

Could you upload, say, the first 50K of the compressed MYD file?
(I don't know how big a header can be; it includes the Huffman trees.)
[20 Oct 2004 8:19] Thomas Johnson
Uploaded! Let me know if you need any other data.
[24 Nov 2004 14:40] Sergei Golubchik
Sorry for the delay :(
Do you still have the table in question? We also need the MYI header (1-2K, but let's say 10K to be on the safe side). As it's the MYI header that contains the table structure, myisampack cannot open the table without it.
The .frm file is ok too. (TRUNCATE TABLE can be used to create the MYI from the frm.)
[26 Nov 2004 10:39] Thomas Johnson
I don't have the compressed one anymore, I could recreate it if you really need it but it does take quite a few hours. Can I give you the first 10K of the uncompressed MYI or do you definitely need the compressed one?
[26 Nov 2004 12:38] Sergei Golubchik
Yes, MYI of uncompressed table is fine.
[29 Nov 2004 2:58] Thomas Johnson
Uploaded the first 10,000 bytes of the MYI file. Let me know if there's anything else you need!

Tom
[1 Dec 2004 21:49] Sergei Golubchik
Thank you for your bug report. A fix for this issue has been committed to our
source repository for that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Ok, I fixed myisamchk - it does not crash anymore but reports "table is corrupted" error (as the table is indeed corrupted). It also means mysqld won't crash on such a table either.
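
For readers following the backtrace: the crash was inside a recursive walk over a Huffman decode tree read from the file header. The sketch below illustrates the general idea of failing with a "corrupted" result instead of crashing. It is not the actual MySQL patch: the function name `longest_code_length`, the `LEAF_FLAG` and `TREE_CORRUPT` constants, and the flat two-slots-per-node layout are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEAF_FLAG    0x8000u  /* hypothetical: high bit marks a leaf slot */
#define TREE_CORRUPT 0u       /* hypothetical sentinel: 0 is never a valid length */

/*
 * Assumed layout (illustrative only): each node is two consecutive
 * uint16_t slots, one per bit value. A slot is either a leaf
 * (LEAF_FLAG set) or an offset from that slot to a child node.
 *
 * Returns the longest code length below `node`, or TREE_CORRUPT if
 * any offset leaves the buffer or fails to move strictly forward.
 */
unsigned longest_code_length(const uint16_t *tree, size_t len, size_t node)
{
    unsigned longest = 1;

    assert(tree != NULL);
    if (node + 1 >= len)              /* both slots must lie inside the buffer */
        return TREE_CORRUPT;

    for (size_t branch = 0; branch < 2; branch++) {
        size_t slot = node + branch;
        uint16_t value = tree[slot];

        if (value & LEAF_FLAG)        /* leaf: contributes a length of 1 */
            continue;

        if (value >= len - slot)      /* child would start past the buffer */
            return TREE_CORRUPT;

        size_t child = slot + value;
        if (child <= node + 1)        /* must point past this node, so the */
            return TREE_CORRUPT;      /* recursion always terminates       */

        unsigned sub = longest_code_length(tree, len, child);
        if (sub == TREE_CORRUPT)
            return TREE_CORRUPT;
        if (sub + 1 > longest)
            longest = sub + 1;
    }
    return longest;
}
```

Under these assumptions, a caller that gets `TREE_CORRUPT` back would refuse to open the table and report it as corrupted rather than dereference an out-of-range pointer. Because every child index must be strictly greater than its parent's, the recursion depth is bounded by the buffer length.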

Still, I don't know why myisampack generated such a broken table header. To investigate, I would need a way to repeat the problem (that is, a table that, when compressed, results in a corrupted MYD file).