Bug #29601 Falcon crashes during DBT2 initial load of tables
Submitted: 6 Jul 2007 17:33 Modified: 6 Aug 2007 22:45
Reporter: Vadim TKACHENKO Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Falcon storage engine Severity:S1 (Critical)
Version:6.0-bk OS:Any
Assigned to: Jim Starkey CPU Architecture:Any

[6 Jul 2007 17:33] Vadim TKACHENKO
Description:
I'm trying to test Falcon with DBT2 benchmarks.

Unfortunately Falcon crashes for me during load data into tables

with next backtrace:

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1085028704 (LWP 10232)]
0x0000003e3f70c38e in raise () from /lib64/tls/libpthread.so.0
(gdb) bt
#0  0x0000003e3f70c38e in raise () from /lib64/tls/libpthread.so.0
#1  0x0000000000762e1d in Error::error ()
#2  0x00000000007eab96 in SerialLogFile::write ()
#3  0x00000000007a8b25 in SerialLogWindow::write ()
#4  0x00000000007a4e56 in SerialLog::flush ()
#5  0x00000000007c1ff5 in SRLUpdateRecords::append ()
#6  0x0000000000735a2d in Transaction::addRecord ()
#7  0x00000000007309ab in Table::insert ()
#8  0x0000000000724019 in StorageTable::insert ()
#9  0x000000000071c340 in StorageInterface::write_row ()
#10 0x000000000069d63d in handler::ha_write_row ()
#11 0x000000000063c44d in write_record ()
#12 0x00000000006b15c0 in mysql_load ()
#13 0x00000000005d5605 in mysql_execute_command ()
#14 0x00000000005d8eb6 in mysql_parse ()
#15 0x00000000005d9b6d in dispatch_command ()
#16 0x00000000005da6c3 in do_command ()
#17 0x00000000005cbb83 in handle_one_connection ()
#18 0x0000003e3f706137 in start_thread () from /lib64/tls/libpthread.so.0
#19 0x0000003e3ecc7113 in clone () from /lib64/tls/libc.so.6
(gdb) 

I generated data for 400 warehouses, script to load data:
bash mysql_load_db.sh -s /tmp/mysql.sock -f /mnt/data/dbt2-400w/ -d dbt2_falcon -e falcon

I used Falcom from today's bk tree

How to repeat:
generate data for 400 warehouses and execute load script:
mysql_load_db.sh -s /tmp/mysql.sock -f /mnt/data/dbt2-400w/ -d dbt2_falcon -e falcon
[7 Jul 2007 21:08] Hakan Küçükyılmaz
Vadim,

which version of DBT2 are you using?
[7 Jul 2007 21:24] Vadim TKACHENKO
Hakan,

I use dbt2-0.37
[8 Jul 2007 6:42] Hakan Küçükyılmaz
Verified as described.

./mysql_load_db.sh -d test -f /tmp/dbt2-w400 -c /usr/local/mysql/bin/mysql -h localhost -u root -e FALCON

...
...
...
Loading table customer
Loading table district
Loading table history
Loading table item
Loading table new_order
Loading table order_line

Backtrace is

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1076373856 (LWP 23342)]
0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
#1  0x0000000000843184 in Error::error (string=<value optimized out>) at Error.cpp:92
#2  0x00000000008b4343 in Cache::fetchPage (this=0x2a9655aff0, dbb=0x2a965a36c8, pageNumber=-572662307,
    pageType=PAGE_any, lockType=Exclusive) at Cache.cpp:162
#3  0x000000000087b50c in Section::getSectionPage (this=0x2c3098b9b0, sequence=100, lockType=Exclusive,
    transId=100) at Section.cpp:1240
#4  0x000000000087c038 in Section::insertStub (this=0x2c3098b9b0, transId=100) at Section.cpp:313
#5  0x000000000080f604 in Table::insert (this=0x2a965b00c0, transaction=0x2a965682a8, stream=0x2b951ad870)
    at Table.cpp:2550
#6  0x0000000000803c57 in StorageTable::insert (this=0x2b951accf8) at StorageTable.cpp:88
#7  0x00000000007fb029 in StorageInterface::write_row (this=0x2a97015fd0, buff=<value optimized out>)
    at ha_falcon.cpp:888
#8  0x00000000007372a0 in handler::ha_write_row (this=0x2a97015fd0, buf=0x5b2e <Address 0x5b2e out of bounds>)
    at handler.cc:3669
#9  0x00000000006cb794 in write_record (thd=0xfb3f60, table=0x2a97014780, info=0x4027ff60)
    at sql_insert.cc:1515
#10 0x000000000077f458 in mysql_load (thd=0xfb3f60, ex=<value optimized out>, table_list=0x2a9700b260,
    fields_vars=@0xfb5dc8, set_fields=@0xfb5df8, set_values=@0xfb5de0, handle_duplicates=DUP_ERROR,
    ignore=false, read_file_from_client=false) at sql_load.cc:822
#11 0x0000000000661c2b in mysql_execute_command (thd=0xfb3f60) at sql_parse.cc:2981
#12 0x0000000000665831 in mysql_parse (thd=0xfb3f60,
    inBuf=0x2a9700b0d0 "LOAD DATA  INFILE \"/tmp/dbt2-w400/order_line.data\"", ' ' <repeats 17 times>, "INTO TABLE order_line FIELDS TERMINATED BY '\\t'", length=114, found_semicolon=0x40281598) at sql_parse.cc:5381
#13 0x00000000006664ed in dispatch_command (command=COM_QUERY, thd=0xfb3f60,
    packet=0x1000b31 "LOAD DATA  INFILE \"/tmp/dbt2-w400/order_line.data\"", ' ' <repeats 17 times>, "INTO TABLE order_line FIELDS TERMINATED BY '\\t'", packet_length=115) at sql_parse.cc:907
#14 0x0000000000666e22 in do_command (thd=0xfb3f60) at sql_parse.cc:669
#15 0x00000000006581fb in handle_one_connection (arg=<value optimized out>) at sql_connect.cc:1091
[9 Jul 2007 0:33] Jim Starkey
I can't reproduce this because I don't have the require files.  Please
attach or send pointer to the directory dbt2-w400.
[9 Jul 2007 7:54] Calvin Sun
from Hakan's email:

That's the first part of DBT2; generating the data and then loading it.

You can generate the data with:

cd dbt2-fixed-20042007/src
./datagen -w400 -d /tmp/dbt2-w400 --mysql

I have just done that on fluffy. But it ended with:
   datagen.c:259:fn: No space left on device
[9 Jul 2007 16:48] Hakan Küçükyılmaz
Vadim,

the backtrace looks like that you are getting out of disk space while loading data.
[9 Jul 2007 16:55] Vadim TKACHENKO
Hakan,

I think that is not the case.

I have
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-Data
                      1.3T  121G  1.2T  10% /mnt/data

on partition with mysql datadir.
[10 Jul 2007 16:35] Hakan Küçükyılmaz
Current problem with this scenario is that the DBT2 load uses LOAD DATA INFILE to load the DBT2 tables.

LOAD DATA INFILE is a single transaction for Falcon. The table "stock" has a data file of 12GB, which leads to an OOM problem on my 8GB system. The OOM killer kills mysqld while loading stock.

hakan@ls3235:/tmp/dbt2-w400> du -h *
6.5G    customer.data
384K    district.data
704M    history.data
7.3M    item.data
38M     new_order.data
474M    order.data
7.6G    order_line.data
12G     stock.data
36K     warehouse.data

Possible solution would be to detect that we are running in a LOAD DATA INFILE and then commit every - say 10.000 records or so.

Best regards,

Hakan
[10 Jul 2007 17:04] Peter Zaitsev
Heh, That is a bummer.

Interesting does it do the same in case of ALTER TABLE ? 

The current implementation of LOAD DATA INFILE  obviously does not work for tables of more than the toy size.

This shows another issue - there seems to be no restriction about how much memory Falcon can consume which makes it easy target to DOS attacks. 

Note Innodb has kind of similar issue - in INSERT ... SELECT for example it can run out of lock table space, but this is stored inside the buffer pool which has fixed limit.
[10 Jul 2007 17:53] Hakan Küçükyılmaz
Peter,

you are right ALTER TABLE will have the very same affect as a big LOAD DATA INFILE. We are already investigating the best possible solution.

Thanks for your feedback,

Hakan
[10 Jul 2007 23:32] Hakan Küçükyılmaz
Vadim,

how much RAM and how many CPU's does your system have?

Best regards,

Hakan
[11 Jul 2007 5:44] Vadim TKACHENKO
Hakan,

I have 16GB of RAM
and 4 CPU

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5148  @ 2.33GHz
stepping        : 6
cpu MHz         : 2327.529
cache size      : 4096 KB
[13 Jul 2007 8:52] Hakan Küçükyılmaz
The LOAD DATA INFILE issue is fixed by commiting every 10000 rows in case of a LOAD DATA INFILE is running.

After that fix we see another assertion which looks like an integer
overflow:

(gdb) f 4
#4  0x000000000091c3ed in Cache::fetchPage (this=0x2a9655aff0,
dbb=0x2a965a3818, pageNumber=-572662307,
    pageType=PAGE_any, lockType=Exclusive) at Cache.cpp:165
165             ASSERT (pageNumber >= 0);
(gdb) p pageNumber
$1 = -572662307

Backtrace:

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1085045088 (LWP 29914)]
0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
#1  0x000000000088f1b8 in Error::debugBreak () at Error.cpp:92
#2  0x000000000088f140 in Error::error (string=0xbfcd70 "assertion
failed at line %d in file %s\n") at Error.cpp:69
#3  0x000000000088f19d in Error::assertionFailed (fileName=0xc107b0
"Cache.cpp", line=165) at Error.cpp:76
#4  0x000000000091c3ed in Cache::fetchPage (this=0x2a9655aff0,
dbb=0x2a965a3818, pageNumber=-572662307,
    pageType=PAGE_any, lockType=Exclusive) at Cache.cpp:165
#5  0x0000000000885f55 in Dbb::fetchPage (this=0x2a965a3818,
pageNumber=-572662307, pageType=PAGE_any,
    lockType=Exclusive) at Dbb.cpp:204
#6  0x00000000008dc1d7 in Section::getSectionPage (this=0x2b21ec5228,
sequence=100, lockType=Exclusive,
    transId=8252) at Section.cpp:1251
#7  0x00000000008d9581 in Section::insertStub (this=0x2b21ec5228,
transId=8252) at Section.cpp:313
#8  0x000000000088620a in Dbb::insertStub (this=0x2a965a3818,
section=0x2b21ec5228, transaction=0x2a96567678)
    at Dbb.cpp:281
#9  0x0000000000856ae0 in Table::insert (this=0x2a965afd80,
transaction=0x2a96567678, stream=0x2b2dad0568)
    at Table.cpp:2550
#10 0x000000000083ff1f in StorageDatabase::insert (this=0x2a9655a1c8,
connection=Internal: global symbol `Connection' found in handshake.cpp
psymtab but not in symtab.
Connection may be an inlined function, or may be a template function
(if a template, try specifying an instantiation: Connection<type>).
) at StorageDatabase.cpp:225
#11 0x000000000084624b in StorageTable::insert (this=0x2b2dacf9f0) at
StorageTable.cpp:88
#12 0x0000000000837fd5 in StorageInterface::write_row (this=0x1072d08,
buff=0x1072f48 "�p\004") at ha_falcon.cpp:890
#13 0x000000000077835e in handler::ha_write_row (this=0x1072d08,
buf=0x1072f48 "�p\004") at handler.cc:3669
#14 0x00000000006f6c66 in write_record (thd=0x10049b8, table=0x106e658,
info=0x40ac5630) at sql_insert.cc:1515
#15 0x000000000079bb12 in read_sep_field (thd=0x10049b8,
info=@0x40ac5630, table_list=0x1061488,
    fields_vars=@0x1006880, set_fields=@0x10068b0,
set_values=@0x1006898, read_info=@0x40ac5440,
    enclosed=@0xf9b580, skip_lines=0, ignore_check_option_errors=false)
at sql_load.cc:822
#16 0x000000000079a9bd in mysql_load (thd=0x10049b8, ex=0x10613f8,
table_list=0x1061488, fields_vars=@0x1006880,
    set_fields=@0x10068b0, set_values=@0x1006898,
handle_duplicates=DUP_ERROR, ignore=false,
    read_file_from_client=false) at sql_load.cc:390
#17 0x00000000006761d8 in mysql_execute_command (thd=0x10049b8) at
sql_parse.cc:2981
#18 0x000000000067c3e2 in mysql_parse (thd=0x10049b8,
    inBuf=0x10612f8 "LOAD DATA  INFILE \"/tmp/dbt2-w400/order_line.data
\"", ' ' <repeats 17 times>, "INTO TABLE order_line FIELDS TERMINATED BY
'\\t'", length=114, found_semicolon=0x40ac6f98) at sql_parse.cc:5381
#19 0x0000000000671b33 in dispatch_command (command=COM_QUERY,
thd=0x10049b8,
    packet=0x10661f9 "LOAD DATA  INFILE \"/tmp/dbt2-w400/order_line.data
\"", ' ' <repeats 17 times>, "INTO TABLE order_line FIELDS TERMINATED BY
'\\t'", packet_length=115) at sql_parse.cc:907
#20 0x00000000006712e6 in do_command (thd=0x10049b8) at sql_parse.cc:669
#21 0x000000000066fd2c in handle_one_connection (arg=0x106c1b8) at
sql_connect.cc:1091
[13 Jul 2007 15:10] Hakan Küçükyılmaz
Latest back trace:
Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1076373856 (LWP 10168)]
0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a95bf0479 in raise () from /lib64/tls/libc.so.6
#1  0x000000000088f840 in Error::debugBreak () at Error.cpp:92
#2  0x000000000088f7c8 in Error::error (string=0xbfdcb0 "assertion failed at line %d in file %s\n")
    at Error.cpp:69
#3  0x000000000088f825 in Error::assertionFailed (fileName=0xc11950 "Cache.cpp", line=170) at Error.cpp:76
#4  0x000000000091d269 in Cache::fetchPage (this=0x2a9655aff0, dbb=0x2a965a3818, pageNumber=-572662307,
    pageType=PAGE_any, lockType=Exclusive) at Cache.cpp:170
#5  0x00000000008865b1 in Dbb::fetchPage (this=0x2a965a3818, pageNumber=-572662307, pageType=PAGE_any,
    lockType=Exclusive) at Dbb.cpp:204
#6  0x00000000008dc8d7 in Section::getSectionPage (this=0x2af1fadf78, sequence=100, lockType=Exclusive,
    transId=8256) at Section.cpp:1251
#7  0x00000000008d9c49 in Section::insertStub (this=0x2af1fadf78, transId=8256) at Section.cpp:313
#8  0x0000000000886866 in Dbb::insertStub (this=0x2a965a3818, section=0x2af1fadf78, transaction=0x2a96567880)
    at Dbb.cpp:281
#9  0x000000000085716e in Table::insert (this=0x2a965afec0, transaction=0x2a96567880, stream=0x2b135fe568)
    at Table.cpp:2556
#10 0x000000000084058b in StorageDatabase::insert (this=0x2a9655a1c8, connection=Internal: global symbol `Connection' found in handshake.cpp psymtab but not in symtab.
Connection may be an inlined function, or may be a template function
(if a template, try specifying an instantiation: Connection<type>).
) at StorageDatabase.cpp:225
#11 0x00000000008468b7 in StorageTable::insert (this=0x2b135fd9f0) at StorageTable.cpp:88
#12 0x0000000000838641 in StorageInterface::write_row (this=0x10734c8, buff=0x1073708 "�p\004")
    at ha_falcon.cpp:895
#13 0x00000000007789ca in handler::ha_write_row (this=0x10734c8, buf=0x1073708 "�p\004") at handler.cc:3669
#14 0x00000000006f6f7e in write_record (thd=0x10059b8, table=0x1071c08, info=0x40280630) at sql_insert.cc:1515
#15 0x000000000079c17e in read_sep_field (thd=0x10059b8, info=@0x40280630, table_list=0x1065e58,
    fields_vars=@0x1007880, set_fields=@0x10078b0, set_values=@0x1007898, read_info=@0x40280440,
    enclosed=@0xf9c0c0, skip_lines=0, ignore_check_option_errors=false) at sql_load.cc:822
#16 0x000000000079b029 in mysql_load (thd=0x10059b8, ex=0x1065dc8, table_list=0x1065e58,
    fields_vars=@0x1007880, set_fields=@0x10078b0, set_values=@0x1007898, handle_duplicates=DUP_ERROR,
    ignore=false, read_file_from_client=false) at sql_load.cc:390
[14 Jul 2007 3:36] Hakan Küçükyılmaz
Vadim,

with latest change set I cannot reproduce the problem anymore:

time ./mysql_load_db.sh -d test -f /tmp/dbt2-w400 -c /usr/local/mysql/bin/mysql -h localhost -u root -e FALCON 

...
...
...

Loading table customer
Loading table district
Loading table history
Loading table item
Loading table new_order
Loading table order_line
Loading table orders
Loading table stock
Loading table warehouse

real    243m57.473s
user    0m0.228s
sys     0m0.115s
[6 Aug 2007 17:10] MySQL Verification Team
I am not able to repeat too.
[6 Aug 2007 18:17] Vadim TKACHENKO
Yes, DBT2 works for me now
[6 Aug 2007 22:45] Hakan Küçükyılmaz
Closed with "can't repeat'.
[15 Oct 2007 23:56] Harita Chilukuri
Does anyone know how does DBT2 work?? I mean how does it calculates the Terminals(users) and what exactly the client does??? Difference between the terminals and client.