Bug #21322 mysql server crashes during execution of ScalableHW test
Submitted: 27 Jul 2006 21:08 Modified: 9 Aug 2006 7:18
Reporter: Alexey Stroganov Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.1.11 OS:Linux (Linux/IA32)
Assigned to: Assigned Account CPU Architecture:Any

[27 Jul 2006 21:08] Alexey Stroganov
Description:
Some combination of simultaneous transaction from ScalableHW test from windows based benchmark tool 'BenchmarkFactory', cause a crash of mysql/innodb. It happens when number of db connections is 10 and more. The statements that lead to crash are different but as you can see from backtraces below it looks like they all very similar traces:

bt1:

(gdb) bt
#0  0xffffe002 in ?? ()
#1  0x0831f7e3 in write_core (sig=17774) at stacktrace.c:220
#2  0x081da9c1 in handle_segfault (sig=22) at mysqld.cc:2155
#3  <signal handler called>
#4  0x083cb698 in row_ins_index_entry_set_vals (index=0x413680a8, entry=0x41368440, row=0x6e1f5d00)
    at data0data.ic:227
#5  0x083cb730 in row_ins_index_entry_step (node=0x413680a8, thr=0x41368440) at row0ins.c:2249
#6  0x083cb908 in row_ins (node=0x413680a8, thr=0x41368440) at row0ins.c:2385
#7  0x083cba5c in row_ins_step (thr=0x41368440) at row0ins.c:2495
#8  0x083cd6ad in row_insert_for_mysql (mysql_rec=0x8ff0360 "Ъ\001", prebuilt=0x4135aa68) at row0mysql.c:1147
#9  0x082abf71 in ha_innobase::write_row(char*) (this=0x8ff0248, record=0x8ff0360 "Ъ\001") at ha_innodb.cc:3312
#10 0x0829b1e2 in handler::ha_write_row(char*) (this=0x8ff0248, buf=0x8ff0360 "Ъ\001") at handler.cc:3295
#11 0x082454db in write_record(THD*, st_table*, st_copy_info*) (thd=0x9076878, table=0x8f36c40, info=0x6e1f5f58)
    at sql_insert.cc:1131
#12 0x08244488 in mysql_insert(THD*, st_table_list*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (thd=0x9076878, table_list=0x904fdf8, fields=@0x9076d50, values_list=@0x9076d74,
    update_fields=@0x9076d68, update_values=@0x9076d5c, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:519
#13 0x081f520b in mysql_execute_command(THD*) (thd=0x9076878) at sql_parse.cc:3332
#14 0x081fa6bf in mysql_parse(THD*, char*, unsigned) (thd=0x9076878,
    inBuf=0x904fca8 "insert into SH_InsertTable(Project_Number,Model,ProductID,Product_Name,Price,Class,Evaluations,Description,EmployeeID,Type,VendorID) \t\tvalues (1,5,0,'Hammer',9.34,4,'in the morning','Works well with n"..., length=151480504) at sql_parse.cc:5899
#15 0x081f1ecb in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY,
    thd=0x9076878, packet=0x9076878 "(ыd\b\200VС\bхoР\b<ыd\bх\f\005\t\230h\a\t", packet_length=213)
    at sql_parse.cc:1762
#16 0x081f1886 in do_command(THD*) (thd=0x9076878) at sql_parse.cc:1548
#17 0x081f0b8f in handle_one_connection (arg=0x0) at sql_parse.cc:1190
#18 0x40038332 in start_thread () from /lib/tls/libpthread.so.0

bt2:
(gdb) bt
#0  0xffffe002 in ?? ()
#1  0x0831f7e3 in write_core (sig=16878) at stacktrace.c:220
#2  0x081da9c1 in handle_segfault (sig=22) at mysqld.cc:2155
#3  <signal handler called>
#4  0x083cb698 in row_ins_index_entry_set_vals (index=0x413540a8, entry=0x41354440, row=0x6e1f5d00)
    at data0data.ic:227
#5  0x083cb730 in row_ins_index_entry_step (node=0x413540a8, thr=0x41354440) at row0ins.c:2249
#6  0x083cb908 in row_ins (node=0x413540a8, thr=0x41354440) at row0ins.c:2385
#7  0x083cba5c in row_ins_step (thr=0x41354440) at row0ins.c:2495
#8  0x083cd6ad in row_insert_for_mysql (mysql_rec=0x8f46c40 "Ъ\003", prebuilt=0x41344e68) at row0mysql.c:1147
#9  0x082abf71 in ha_innobase::write_row(char*) (this=0x8f46b28, record=0x8f46c40 "Ъ\003") at ha_innodb.cc:3312
#10 0x0829b1e2 in handler::ha_write_row(char*) (this=0x8f46b28, buf=0x8f46c40 "Ъ\003") at handler.cc:3295
#11 0x082454db in write_record(THD*, st_table*, st_copy_info*) (thd=0x8dcef30, table=0x8ede880, info=0x6e1f5f58)
    at sql_insert.cc:1131
#12 0x08244488 in mysql_insert(THD*, st_table_list*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (thd=0x8dcef30, table_list=0x8eea3f8, fields=@0x8dcf408, values_list=@0x8dcf42c,
    update_fields=@0x8dcf420, update_values=@0x8dcf414, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:519
#13 0x081f520b in mysql_execute_command(THD*) (thd=0x8dcef30) at sql_parse.cc:3332
#14 0x081fa6bf in mysql_parse(THD*, char*, unsigned) (thd=0x8dcef30,
    inBuf=0x8eea2a8 "insert into SH_InsertTable(Project_Number,Model,ProductID,Product_Name,Price,Class,Evaluations,Description,EmployeeID,Type,VendorID) \t\tvalues (3,5,1805,'Hammer',9.34,4,'in the morning','Works well wit"..., length=148696944) at sql_parse.cc:5899
#15 0x081f1ecb in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY,
    thd=0x8dcef30, packet=0x8dcef30 "(ыd\b\030╓С\bЮеИ\b<ыd\bх╡Н\bPОэ\b", packet_length=216) at sql_parse.cc:1762
#16 0x081f1886 in do_command(THD*) (thd=0x8dcef30) at sql_parse.cc:1548
#17 0x081f0b8f in handle_one_connection (arg=0x0) at sql_parse.cc:1190
#18 0x40038332 in start_thread () from /lib/tls/libpthread.so.0

bt3:
(gdb) bt
#0  0xffffe002 in ?? ()
#1  0x0831f7e3 in write_core (sig=17259) at stacktrace.c:220
#2  0x081da9c1 in handle_segfault (sig=22) at mysqld.cc:2155
#3  <signal handler called>
#4  0x083cb698 in row_ins_index_entry_set_vals (index=0x412f2148, entry=0x412f2998, row=0x6e37dd00)
    at data0data.ic:227
#5  0x083cb730 in row_ins_index_entry_step (node=0x412f2148, thr=0x412f2998) at row0ins.c:2249
#6  0x083cb908 in row_ins (node=0x412f2148, thr=0x412f2998) at row0ins.c:2385
#7  0x083cba5c in row_ins_step (thr=0x412f2998) at row0ins.c:2495
#8  0x083cd6ad in row_insert_for_mysql (mysql_rec=0x8ea4198 "Щ\003", prebuilt=0x412f0668) at row0mysql.c:1147
#9  0x082abf71 in ha_innobase::write_row(char*) (this=0x8ea4080, record=0x8ea4198 "Щ\003") at ha_innodb.cc:3312
#10 0x0829b1e2 in handler::ha_write_row(char*) (this=0x8ea4080, buf=0x8ea4198 "Щ\003") at handler.cc:3295
#11 0x082454db in write_record(THD*, st_table*, st_copy_info*) (thd=0x8ecef78, table=0x8e584f8, info=0x6e37df58)
    at sql_insert.cc:1131
#12 0x08244488 in mysql_insert(THD*, st_table_list*, List<Item>&, List<List<Item> >&, List<Item>&, List<Item>&, enum_duplicates, bool) (thd=0x8ecef78, table_list=0x8facb48, fields=@0x8ecf450, values_list=@0x8ecf474,
    update_fields=@0x8ecf468, update_values=@0x8ecf45c, duplic=DUP_ERROR, ignore=false) at sql_insert.cc:519
#13 0x081f520b in mysql_execute_command(THD*) (thd=0x8ecef78) at sql_parse.cc:3332
#14 0x081fa6bf in mysql_parse(THD*, char*, unsigned) (thd=0x8ecef78,
    inBuf=0x8fac6a0 "INSERT INTO bf_test_results (station_id, virtual_station_id, station_cfg_id, userid, test_run_id, user_load, tps, bps, ups, avg_transaction_time, min_transaction_time, max_transaction_time, total_tran"..., length=149745592) at sql_parse.cc:5899
#15 0x081f1ecb in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY,
    thd=0x8ecef78, packet=0x8ecef78 "(ыd\b╗\rГ\bЮ\215А\b<ыd\b\230\215Ь\b\230ОЛ\b", packet_length=1073)
    at sql_parse.cc:1762
#16 0x081f1886 in do_command(THD*) (thd=0x8ecef78) at sql_parse.cc:1548
#17 0x081f0b8f in handle_one_connection (arg=0x0) at sql_parse.cc:1190
#18 0x40038332 in start_thread () from /lib/tls/libpthread.so.0

How to repeat:
Install BenchmarkFactory trial version 
Run ScalableHW test with scale 1 against server started on Linux through MyODBC
[27 Jul 2006 22:11] Heikki Tuuri
Alexey,

can you in the debugger determine where exactly is execution in this function?

Then we could add assertions to catch the bug.

Regards,

Heikki

void
row_ins_index_entry_set_vals(
/*=========================*/
        dict_index_t*   index,  /* in: index */
        dtuple_t*       entry,  /* in: index entry to make */
        dtuple_t*       row)    /* in: row */
{
        dict_field_t*   ind_field;
        dfield_t*       field;
        dfield_t*       row_field;
        ulint           n_fields;
        ulint           i;
        dtype_t*        cur_type;

        ut_ad(entry && row);

        n_fields = dtuple_get_n_fields(entry);

        for (i = 0; i < n_fields; i++) {
                field = dtuple_get_nth_field(entry, i);
                ind_field = dict_index_get_nth_field(index, i);

                row_field = dtuple_get_nth_field(row, ind_field->col->ind);

                /* Check column prefix indexes */
                if (ind_field->prefix_len > 0
                        && dfield_get_len(row_field) != UNIV_SQL_NULL) {

                        cur_type = dict_col_get_type(
                                dict_field_get_col(ind_field));

                        field->len = dtype_get_at_most_n_mbchars(cur_type,
                                  ind_field->prefix_len,
                                  dfield_get_len(row_field), row_field->data);
                } else {
                        field->len = row_field->len;
                }

                field->data = row_field->data;
        }
}
[27 Jul 2006 22:43] Alexey Stroganov
Yes, I will to find out this.

Also I've got a bit different bt: 
(gdb) bt
#0  0xffffe002 in ?? ()
#1  0x0831f7e3 in write_core (sig=17884) at stacktrace.c:220
#2  0x081da9c1 in handle_segfault (sig=22) at mysqld.cc:2155
#3  <signal handler called>
#4  0x083d842b in row_sel_store_mysql_rec (mysql_rec=0x41322068 "Оo╝\004h03Ah╓2A", prebuilt=0x44bc3053,
    rec=0x6e257758 "d", offsets=0x6e2578e8) at row0sel.c:2641
#5  0x083db893 in row_sel_push_cache_row_for_mysql (prebuilt=0x41322068, rec=0x44bc3053 "\200",
    offsets=0x6e257758) at row0sel.c:3074
#6  0x083dabbf in row_search_for_mysql (buf=0x8e9f130 "", mode=2, prebuilt=0x41322068, match_mode=1,
    direction=1) at row0sel.c:4152
#7  0x082accdd in ha_innobase::general_fetch(char*, unsigned, unsigned) (this=0x8e9f018, buf=0x8e9f130 "",
    direction=1, match_mode=1) at ha_innodb.cc:4064
#8  0x082acdd3 in ha_innobase::index_next_same(char*, char const*, unsigned) (this=0x8e9f018, buf=0x8e9f130 "",
    key=0x8eaa478 "\002", keylen=8) at ha_innodb.cc:4120
#9  0x08239cff in join_read_next_same (info=0x0) at sql_select.cc:10266
#10 0x08238d72 in sub_select(JOIN*, st_join_table*, bool) (join=0x8f51ad0, join_tab=0x8eaa308,
    end_of_records=11) at sql_select.cc:9696
#11 0x0823892c in do_select (join=0x8f51ad0, fields=0x0, table=0x8f41660, procedure=0x0) at sql_select.cc:9454
#12 0x08228ed5 in JOIN::exec() (this=0x8f51ad0) at sql_select.cc:1377
#13 0x0822a4cc in mysql_select(THD*, Item***, st_table_list*, unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st_order*, unsigned long, select_result*, st_select_lex_unit*, st_select_lex*) (
    thd=0x8e8dc40, rref_pointer_array=0x8e8dfe4, tables=0x8ea90e0, wild_num=0, fields=@0x8e8df48,
    conds=0x8ea97c0, og_num=2, order=0x8ea9a28, group=0x8ea9938, having=0x0, proc_param=0x0,
    select_options=2147764736, result=0x8ea9ab0, unit=0x8e8dc90, select_lex=0x8e8dec0) at sql_select.cc:1933
#14 0x082261d2 in handle_select(THD*, st_lex*, select_result*, unsigned long) (thd=0x8e8dc40, lex=0x8e8dc80,
    result=0x8ea9ab0, setup_tables_done_option=0) at sql_select.cc:238
#15 0x081f37a3 in mysql_execute_command(THD*) (thd=0x8e8dc40) at sql_parse.cc:2541
#16 0x081fa6bf in mysql_parse(THD*, char*, unsigned) (thd=0x8e8dc40,
    inBuf=0x8ea8788 "select Model,avg(Price),min(Price),max(Price),count(*) \t\t\t    from SH_Product where Project_Number = 2 and \t\t\t\tEmployeeID = 0 and \t\t\t\tType = 90 group by Model order by Model", length=149478528)
    at sql_parse.cc:5899
#17 0x081f1ecb in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_QUERY,
    thd=0x8e8dc40, packet=0x8e8dc40 "(ыd\b\b╩К\bЬ`Я\b<ыd\bp.У\b`эХ\b", packet_length=174) at sql_parse.cc:1762
#18 0x081f1886 in do_command(THD*) (thd=0x8e8dc40) at sql_parse.cc:1548
#19 0x081f0b8f in handle_one_connection (arg=0x0) at sql_parse.cc:1190
#20 0x40038332 in start_thread () from /lib/tls/libpthread.so.0

Is it possible that these problems have the same roots?
[28 Jul 2006 6:32] Heikki Tuuri
Alexey,

see my note at:

http://bugs.mysql.com/bug.php?id=20213

When InnoDB crashes, in gdb do:

info threads

thread 8996

bt full

etc.

Print the stacks of all threads. If MySQL is using the same table handle in two different queries, we may notice it from there.

We could also add assertions to ha_innodb.cc to check that a table handle is only used for one query at a time.

Regards,

Heikki
[28 Jul 2006 14:09] Alexey Stroganov
I've attached files with output of bt and bt full for all threads of one of the core. Let me know if you need additional info.
[28 Jul 2006 14:10] Alexey Stroganov
Output of bt

Attachment: bug21322-core17884-bt (application/octet-stream, text), 30.27 KiB.

[28 Jul 2006 14:11] Alexey Stroganov
Output of bt full

Attachment: bug21322-core17884-bt-full (application/octet-stream, text), 97.06 KiB.

[28 Jul 2006 19:21] Heikki Tuuri
The stack traces do not reveal illegal reuse of a table handle. We may next add some assertions to 5.1 that check illegal reuse never happens.

We will also study diff -bu of row0sel.c and other relevant files between 5.0 and 5.1.
[29 Jul 2006 6:02] Heikki Tuuri
These bug reports suggest that some serious bug was introduced in 5.1.11:

http://bugs.mysql.com/bug.php?id=20213
http://bugs.mysql.com/bug.php?id=20337
http://bugs.mysql.com/bug.php?id=20595
http://bugs.mysql.com/bug.php?id=21322
[29 Jul 2006 10:00] Heikki Tuuri
Alexey,

can you test if 5.1.9 exhibits the same crashes?

Regards,

Heikki
[1 Aug 2006 1:11] Alexey Stroganov
Heikki,

I ran test several times for 5.1.9 and it passed ok.

Also I've attached bt and bt full for another core that fails on insert stmt.
[1 Aug 2006 1:12] Alexey Stroganov
bt for core 17259

Attachment: bt.17259.tar.bz2 (application/x-bzip2, text), 2.38 KiB.

[1 Aug 2006 1:12] Alexey Stroganov
bt full for core 17259

Attachment: bt.full.17259.tar.bz2 (application/x-bzip2, text), 5.64 KiB.

[1 Aug 2006 5:20] Heikki Tuuri
Alexey,

thank you. It is the large patches between 5.1.9 and 5.1.11 that have broken
MySQL/InnoDB. We are concentrating on reproducing
http://bugs.mysql.com/bug.php?id=20213
on our own computers.

Regards,

Heikki
[3 Aug 2006 21:34] Heikki Tuuri
Alexey,

if you replace the start of ::store_lock() in ha_innodb.cc with the code below, does it still crash?

Regards,

Heikki

/*********************************************************************
Converts a MySQL table lock stored in the 'lock' field of the handle to
a proper type before storing pointer to the lock into an array of pointers.
MySQL also calls this if it wants to reset some table locks to a not-locked
state during the processing of an SQL query. An example is that during a
SELECT the read lock is released early on the 'const' tables where we only
fetch one row. MySQL does not call this when it releases all locks at the
end of an SQL statement. */

THR_LOCK_DATA**
ha_innobase::store_lock(
/*====================*/
                                                /* out: pointer to the next
                                                element in the 'to' array */
        THD*                    thd,            /* in: user thread handle */
        THR_LOCK_DATA**         to,             /* in: pointer to an array
                                                of pointers to lock structs;
                                                pointer to the 'lock' field
                                                of current handle is stored
                                                next to this array */
        enum thr_lock_type      lock_type)      /* in: lock type to store in
                                                'lock'; this may also be
                                                TL_IGNORE */
{
        row_prebuilt_t* prebuilt        = (row_prebuilt_t*) innobase_prebuilt;
        trx_t*          trx;

        update_thd(thd); /* This must be called BEFORE we know prebuilt->trx
                         is up to date! */

        trx = prebuilt->trx;

        /* NOTE: MySQL  can call this function with lock 'type' TL_IGNORE!
        Be careful to ignore TL_IGNORE if we are going to do something with
        only 'real' locks! */
[9 Aug 2006 7:18] Heikki Tuuri
I optimistically mark this as a duplicate of the other stress test crashes in 5.1.