Bug #644 crash in add_key_field when using prepared statements
Submitted: 12 Jun 2003 14:56
Modified: 11 Mar 2004 8:59
Reporter: Dennis Haney
Status: Can't repeat
Category: MySQL Server
Severity: S1 (Critical)
Version: 4.1
OS: Linux (linux 2.4)
Assigned to: Oleksandr Byelkin
CPU Architecture: Any

[12 Jun 2003 14:56] Dennis Haney
Description:
crash in add_key_field

add_key_field (key_fields=0x41db3dc4, and_level=0, field=0x88de588, eq_func=true, value=0x88dbab0, usable_tables=18446744073709551615) at sql_select.cc:1875
1875          stat[0].keys|= possible_keys;             // Add possible keys
(gdb) bt full
#0  add_key_field (key_fields=0x41db3dc4, and_level=0, field=0x88de588, eq_func=true, value=0x88dbab0, usable_tables=18446744073709551615) at sql_select.cc:1875
        stat = (st_join_table *) 0x0
        possible_keys = 8
        exists_optimize = false
#1  0x08157ea6 in add_key_fields (stat=0x88d41e0, key_fields=0x41db3dc4, and_level=0x41db3dc8, cond=0x88dbaf0, usable_tables=18446744073709551615) at sql_select.cc:1989
        equal_func = true
#2  0x08158000 in add_key_fields (stat=0x88d41e0, key_fields=0x41db3dc4, and_level=0x41db3dc8, cond=0x88dbd70, usable_tables=18446744073709551615) at sql_select.cc:1927
        item = (Item *) 0x1
        li = {<base_list_iterator> = {list = 0x88dbdc8, el = 0x88dbdd8, prev = 0x0, current = 0x0}, <No data fields>}
        org_key_fields = (key_field_t *) 0x88d4410
#3  0x081584d1 in update_ref_and_keys (thd=0x0, keyuse=0x88d4124, join_tab=0x88d41e0, tables=1, cond=0x88dbd70, normal_tables=18446744073709551615, select_lex=0x1) at sql_select.cc:2163
        and_level = 0
        i = 138068192
        found_eq_constant = 2
#4  0x08156d7b in make_join_statistics (join=0x88d3808, tables=0x0, conds=0x88dbd70, keyuse_array=0x88d4124) at sql_select.cc:1564
        error = 1104887536
        i = 1
        table_count = 0
        const_count = 0
        found_ref = 143527960
        refs = 143434744
        key = 143546248
        const_ref = 0
        eq_part = 7564645
        found_const_table_map = 0
        all_table_map = 1
        table_vector = (st_table **) 0x88d4408
        stat_end = (st_join_table *) 0x88d430c
        s = (st_join_table *) 0x88d430c
        stat_ref = (st_join_table **) 0x88d4310
        keyuse = (keyuse_t *) 0x88d3808
        start_keyuse = (keyuse_t *) 0x88b4fc8
        outer_join = 0
        stat_vector = {0x88d41e0, 0x0, 0x4003af3b, 0x811e8c3, 0x88b4fd4, 0x41db3f28, 0x41db3f44, 0x88dbbb0, 0x88dbaf0, 0x41db3fb8, 0x41db3f74, 0x815c1e4, 0x88dbbb0, 0x88dc858, 0x1, 0x8392368,
  0x88dbd70, 0x1, 0x88db9e8, 0x88dbde4, 0x88b4fc8, 0x88d3808, 0x19000005, 0x88b4fc8, 0x88dbdc8, 0x83ade68, 0x88dbde0, 0x83ade68, 0x88b4fc8, 0x88d3808, 0x41db3ff4, 0x88b4fc8, 0x41db3fd4,
  0x88d41a4, 0x41db3ff4, 0x88dbcd0, 0x88dbaf0, 0x88d4138, 0x41db3fe4, 0x815c685, 0x88dbcd0, 0x41db3fb8, 0x88dbcd0, 0x88dbd70, 0x2, 0x0, 0x41db3fe4, 0x2, 0x41db3fac, 0x41db3fac, 0x41db3fb0,
  0x88d41d8, 0x41db4078, 0x0, 0x41db4004, 0x41db4078, 0x41db3fec, 0x0, 0x41db4004, 0x816363a, 0x1d51, 0x41db3fe4, 0x41db3fe8}
        _db_func_ = 0x88ca694 ""
        _db_file_ = 0x8375c08 "mi_locking.c"
        _db_level_ = 1104887556
        _db_framep_ = (char **) 0x80cc9f9
        ref_changed = 135579381
        ptr = 0x88d41e0 "¨T\216\b"
#5  0x08153e1d in JOIN::optimize() (this=0x88d3808) at sql_select.cc:511
        _db_func_ = 0x88db9e8 ""
        _db_file_ = 0x0
        _db_level_ = 143506800
        _db_framep_ = (char **) 0x0
#6  0x081565c2 in mysql_select(THD*, Item***, st_table_list*, unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, st_order*, unsigned long, select_result*, st_select_lex_unit*, st_select_lex*, bool) (thd=0x88b4fc8, rref_pointer_array=0x88b5388, tables=0x88db9e8, wild_num=0, fields=@0x88b53cc, conds=0x88dbd70, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0,
    select_options=18387968, result=0x88d37f8, unit=0x88b524c, select_lex=0x88b5338, tables_and_fields_initied=false) at sql_select.cc:1386
        err = 143472632
        free_join = true
        _db_func_ = 0x88b4fc8 "¸¶9\b\204ð:\bÈ\222\212\b \211\212\b¨ã\213\b¨#\214\b¨ã\213\b¨ã\213\b\016"
        _db_file_ = 0x88d37f8 "H¶9\bÈO\213\bLR\213\b_ci\n"
        _db_level_ = 138082816
        _db_framep_ = (char **) 0x41db4134
        join = (JOIN *) 0x88d3808
#7  0x08153500 in handle_select(THD*, st_lex*, select_result*) (thd=0x88b4fc8, lex=0x88b5240, result=0x88d37f8) at sql_select.cc:173
        res = 143348536
        select_lex = (st_select_lex *) 0x88b5338
#8  0x08136475 in mysql_execute_command(THD*) (thd=0x88b4fc8) at sql_parse.cc:1780
        res = 0
        lex = (st_lex *) 0x88b5240
        tables = (st_table_list *) 0x88db9e8
        select_lex = (class st_select_lex *) 0x88b5338
        unit = (class st_select_lex_unit *) 0x88b524c
        _db_func_ = 0x0
        _db_file_ = 0x88b525c "8S\213\b"
        _db_level_ = 143348300
        _db_framep_ = (char **) 0x8381e60
#9  0x0816d409 in mysql_stmt_execute(THD*, char*) (thd=0x88b4fc8, packet=0x1 <Address 0x1 out of bounds>) at sql_prepare.cc:934
        stmt_id = 143533552
        stmt = (class st_prep_stmt *) 0x88e25f0
        _db_func_ = 0xd9 <Address 0xd9 out of bounds>
        _db_file_ = 0x41db4318 "È\2208\b"
        _db_level_ = 1104888604
        _db_framep_ = (char **) 0x41db4320
        thd_lex = {yylineno = 1, yytoklen = 2, yylval = 0x41db472c, unit = {<st_select_lex_node> = {_vptr.st_select_lex_node = 0x83890c8, next = 0x0, prev = 0x0, master = 0x0, slave = 0x88b5338,
      link_next = 0x0, link_prev = 0x0, options = 0, linkage = GLOBAL_OPTIONS_TYPE, order_list = {elements = 0, first = 0x0, next = 0x88b5274},
      expr_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b527c, elements = 0}, <No data fields>},
      when_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5288, elements = 0}, <No data fields>}, select_limit = 4294967295, offset_limit = 0,
      ref_pointer_array = 0x0, with_sum_func = 0, create_refs = false, dependent = false, uncacheable = false, no_table_names_allowed = false},
    item_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b52a8, elements = 0}, <No data fields>}, result_table_list = {next = 0x1, db = 0x0, alias = 0x0,
      real_name = 0xf0004 <Address 0xf0004 out of bounds>, on_expr = 0x4000001, natural_join = 0xcc0155, use_index = 0xd00008, ignore_index = 0x0, {table = 0x0, table_list = 0x0},
      derived = 0x88b5330, grant = {grant_table = 0x88b5390, version = 137093704, privilege = 137097932, want_privilege = 137103036}, lock_type = 137105344, outer_join = 137144300,
      shared = 137154752, db_length = 917505, real_name_length = 67108865, straight = 85, updating = true, force_index = 81}, union_result = 0x0, table = 0x0, result = 0x0, res = 143348648,
    describe = 192, found_rows_for_union = 83, prepared = false, optimized = false, executed = false, t_and_f = 228, global_parameters = 0x88b5338, return_to = 0x0,
    select_limit_cnt = 4294967295, offset_limit_cnt = 0, item = 0x0, thd = 0x88b4fc8, union_option = 0}, select_lex = {<st_select_lex_node> = {_vptr.st_select_lex_node = 0x8389068, next = 0x0,
      prev = 0x88b525c, master = 0x88b524c, slave = 0x0, link_next = 0x0, link_prev = 0x88b5434, options = 0, linkage = UNSPECIFIED_TYPE, order_list = {elements = 0, first = 0x0,
        next = 0x88b5360}, expr_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5368, elements = 0}, <No data fields>},
      when_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5374, elements = 0}, <No data fields>}, select_limit = 20, offset_limit = 0,
      ref_pointer_array = 0x0, with_sum_func = 0, create_refs = false, dependent = false, uncacheable = false, no_table_names_allowed = false}, db = 0x0, db1 = 0x4 <Address 0x4 out of bounds>,
    table1 = 0x0, db2 = 0x0, table2 = 0x0, where = 0x89073c0, having = 0x0, olap = UNSPECIFIED_OLAP_TYPE, table_list = {elements = 0, first = 0x0, next = 0x88b53b8}, group_list = {elements = 0,
      first = 0x0, next = 0x88b53c4}, item_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b53cc, elements = 0}, <No data fields>},
    interval_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b53d8, elements = 0}, <No data fields>},
    use_index = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b53e4, elements = 0}, <No data fields>}, use_index_ptr = 0x0,
    ignore_index = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b53f4, elements = 0}, <No data fields>}, ignore_index_ptr = 0x0, ftfunc_list = 0x88b5408,
    ftfunc_list_alloc = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5408, elements = 0}, <No data fields>}, join = 0x0,
    type = 0x400000 <Address 0x400000 out of bounds>, table_join_options = 0, in_sum_expr = 0, select_number = 1, with_wild = 0, braces = false, having_fix_field = false},
  current_select = 0x88b5338, all_selects_list = 0x88b5338, ptr = 0x88d37f7 "oH¶9\bÈO\213\bLR\213\b_ci\n", tok_start = 0x88d37f7 "oH¶9\bÈO\213\bLR\213\b_ci\n",
  tok_end = 0x88d37f7 "oH¶9\bÈO\213\bLR\213\b_ci\n", end_of_query = 0x88d37f6 "", length = 0x0, dec = 0x0, change = 0x0, name = 0x0, backup_dir = 0x3c0000 <Address 0x3c0000 out of bounds>,
  to_log = 0x8d <Address 0x8d out of bounds>, purge_time = 0, x509_subject = 0x0, x509_issuer = 0x0, ssl_cipher = 0x0, found_colon = 0x0, ssl_type = SSL_TYPE_NONE, wild = 0x4d0000,
  exchange = 0x0, result = 0x0, col_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5484, elements = 0}, <No data fields>},
  ref_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5490, elements = 0}, <No data fields>},
  drop_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b549c, elements = 0}, <No data fields>},
  alter_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54a8, elements = 0}, <No data fields>},
  interval_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54b4, elements = 0}, <No data fields>},
  users_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54c0, elements = 0}, <No data fields>},
  columns = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54cc, elements = 0}, <No data fields>}, key_list = {<base_list> = {<Sql_alloc> = {<No data fields>},
      first = 0x83ade68, last = 0x88b54d8, elements = 0}, <No data fields>}, create_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54e4,
      elements = 0}, <No data fields>}, insert_list = 0x0, field_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b54f4, elements = 0}, <No data fields>},
  value_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5500, elements = 0}, <No data fields>},
  many_values = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b550c, elements = 0}, <No data fields>},
  var_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x88d3890, last = 0x88d3890, elements = 1}, <No data fields>},
  param_list = {<base_list> = {<Sql_alloc> = {<No data fields>}, first = 0x83ade68, last = 0x88b5524, elements = 0}, <No data fields>}, proc_list = {elements = 1819374457, first = 0x0,
    next = 0x5f73656c}, auxilliary_table_list = {elements = 1986622064, first = 0x49594d2e <Address 0x49594d2e out of bounds>, next = 0x0}, interval = 0x796d2f2e, last_field = 0x2f6c7173,
  default_value = 0x6c626174, comment = 0x705f7365, uint_geom_type = 779512178, grant_user = 0x49594d, yacc_yyss = 0x0, yacc_yyvs = 0x0, thd = 0x88b4fc8, udf = {name = {
      str = 0x705f7365 <Address 0x705f7365 out of bounds>, length = 779512178}, returns = 4479309, type = 4294967295, dl = 0xffffffff <Address 0xffffffff out of bounds>, dlhandle = 0xffffffff,
    func = 0xffffffff, func_init = 0xffffffff, func_deinit = 0xffffffff, func_reset = 0x0, func_add = 0x0, usage_count = 0}, check_opt = {sort_buffer_size = 0, flags = 0, sql_flags = 0},
  create_info = {table_charset = 0x1, comment = 0x0, password = 0x0, data_file_name = 0x0, index_file_name = 0x0, max_rows = 0, min_rows = 4294967296, auto_increment_value = 0,
    table_options = 2529, avg_row_length = 143316064, raid_chunksize = 143312160, used_fields = 143350540, merge_list = {elements = 0, first = 0x0, next = 0x0}, db_type = DB_TYPE_UNKNOWN,
    row_type = ROW_TYPE_DEFAULT, options = 0, raid_type = 0, raid_chunks = 0, merge_insert_method = 0, if_not_exists = false}, mi = {host = 0x0, user = 0x0, password = 0x0, log_file_name = 0x0,
    port = 143316088, connect_retry = 143316368, pos = 143316648, server_id = 0, relay_log_name = 0x112 <Address 0x112 out of bounds>, relay_log_pos = 280}, mqh = {questions = 1, updates = 5,
    connections = 0, bits = 273}, thread_id = 268, type = 268, sql_command = SQLCOM_COMMIT, lock_option = TL_READ, next_state = MY_LEX_END, duplicates = DUP_REPLACE,
  tx_isolation = ISO_READ_COMMITTED, ha_read_mode = RFIRST, ha_rkey_mode = HA_READ_KEY_OR_NEXT, alter_keys_onoff = ENABLE, option_type = OPT_DEFAULT, grant = 1, grant_tot_col = 0,
  which_columns = 143350608, fk_delete_opt = 0, fk_update_opt = 0, fk_match_option = 0, param_count = 0, drop_primary = 7, drop_if_exists = false, drop_temporary = false, local_file = false,
  in_comment = false, ignore_space = false, verbose = false, simple_alter = false, no_write_to_binlog = 28, derived_tables = false, describe = false, safe_to_cache_query = true,
  slave_thd_opt = 143350544, charset = 0x9, help_arg = 0x0, gorder_list = 0x8}
#10 0x08135efd in dispatch_command(enum_server_command, THD*, char*, unsigned) (command=COM_EXECUTE, thd=0x88b4fc8, packet=0x88be3a9 "\003", packet_length=5) at sql_parse.cc:1270
        net = (st_net *) 0x88b4fd4
        error = false
        slow_command = false
        _db_func_ = 0x3f <Address 0x3f out of bounds>
        _db_file_ = 0x41db48d8 " \f\004@\200\214\023BÿÿÿÿôHÛA\005"
        _db_level_ = 1104890076
        _db_framep_ = (char **) 0x41db48e0
        start_of_query = 23
#11 0x08134ca5 in do_command(THD*) (thd=0x88b4fc8) at sql_parse.cc:1094
        packet = 0x88be3a8 "\027\003"
        old_timeout = 30
        packet_length = 5
        net = (st_net *) 0x88b4fd4
        command = COM_EXECUTE
        _db_func_ = 0x8118421 "\203Ä\020\213]üÉÃ\220\211öU\211åS\203ì\020\213]\bSè<"
        _db_file_ = 0x88b5b60 "øW\215\b"
        _db_level_ = 4096
        _db_framep_ = (char **) 0x1000
#12 0x081344a5 in handle_one_connection (arg=0x88b4fc8) at sql_parse.cc:876
        thd = (THD *) 0x88b4fc8
        set = {__val = {0 <repeats 32 times>}}
#13 0x40037881 in pthread_start_thread () from /lib/i686/libpthread.so.0
No symbol table info available.
#14 0x40037985 in pthread_start_thread_event () from /lib/i686/libpthread.so.0
No symbol table info available.

(gdb) p *field
$3 = {_vptr.Field = 0x839ac88, ptr = 0x88ddecf "\021", null_ptr = 0x0, table = 0x88dd618, table_name = 0x88b4c88 "items", field_name = 0x88ddd10 "category", comment = {str = 0x8311e2c "",
    length = 0}, query_id = 3185, key_start = 8, part_of_key = 8, part_of_sortkey = 8, unireg_check = NONE, field_length = 10, flags = 16425, null_bit = 0 '\0'}
(gdb) p *field->table
$4 = {file = 0x88ddb70, field = 0x88ddc10, blob_field = 0x88de1c8, name_hash = {key_offset = 0, key_length = 0, records = 0, blength = 0, current_record = 0, flags = 0, array = {buffer = 0x0,
      elements = 0, max_element = 0, alloc_increment = 0, size_of_element = 0}, get_key = 0, free = 0, charset = 0x0}, record = {0x88dde30 "C", 0x88dded8 "Ã"}, default_values = 0x88ddf80 "Ã",
  insert_values = 0x0, fields = 13, reclength = 163, rec_buff_length = 168, keys = 6, key_parts = 6, primary_key = 0, max_key_length = 9, max_unique_length = 4, total_key_length = 32,
  uniques = 0, null_fields = 8, blob_fields = 1, keys_in_use = 63, keys_for_keyread = 63, quick_keys = 40, used_keys = 0, keys_in_use_for_query = 63, key_info = 0x88dd9a8, keynames = {count = 6,
    name = 0x0, type_names = 0x88ddc80}, max_rows = 0, min_rows = 0, avg_row_length = 0, raid_chunksize = 0, fieldnames = {count = 13, name = 0x0, type_names = 0x88ddc48}, intervals = 0x0,
  db_type = DB_TYPE_MYISAM, row_type = ROW_TYPE_DEFAULT, db_create_options = 9, db_options_in_use = 3, db_record_offset = 0, db_stat = 39, raid_type = 0, raid_chunks = 0, status = 2, system = 0,
  time_stamp = 0, timestamp_field_offset = 0, next_number_index = 0, blob_ptr_size = 8, next_number_key_offset = 0, current_lock = 2, tmp_table = NO_TMP_TABLE, copy_blobs = 0 '\0',
  null_row = 0 '\0', maybe_null = 0 '\0', outer_join = 0 '\0', force_index = 0 '\0', distinct = 0 '\0', const_table = 0 '\0', no_rows = 0 '\0', key_read = 0 '\0', bulk_insert = 0 '\0',
  crypted = 0 '\0', db_low_byte_first = 1 '\001', locked_by_flush = 0 '\0', locked_by_name = 0 '\0', fulltext_searched = 0 '\0', crashed = 0 '\0', is_view = 0 '\0', no_keyread = 0 '\0',
  next_number_field = 0x0, found_next_number_field = 0x88ddd20, rowid_field = 0x88ddd20, timestamp_field = 0x0, comment = 0x88ddc08 "", table_charset = 0x83acac0, reginfo = {join_tab = 0x0,
    lock_type = TL_READ, not_exists_optimize = false, impossible_range = false}, mem_root = {free = 0x88de438, used = 0x88dde20, pre_alloc = 0x0, min_malloc = 32, block_size = 996,
    block_num = 7, first_block_usage = 0, error_handler = 0x80fecf0 <sql_alloc_error_handler>}, grant = {grant_table = 0x0, version = 0, privilege = 1, want_privilege = 0},
  table_cache_key = 0x88de1e0 "rubis", table_name = 0x88b4c88 "items", real_name = 0x88dd990 "items", path = 0x88dd998 "./rubis/items", key_length = 12, tablenr = 0, used_fields = 14,
  null_bytes = 1, map = 1, version = 1, flush_version = 1, null_flags = 0x88dde30 "C", sort = {io_cache = 0x0, addon_buf = 0x0, addon_length = 0, addon_field = 0x0, unpack = 0,
    record_pointers = 0x0, found_records = 0}, group = 0x0, quick_rows = {0, 0, 0, 289, 0, 1, 0 <repeats 26 times>}, quick_key_parts = {0, 0, 0, 1, 0, 1, 0 <repeats 26 times>},
  const_key_parts = {0, 0, 0, 1, 0 <repeats 28 times>}, query_id = 0, {temp_pool_slot = 0, pos_in_table_list = 0x0}, derived_select_number = 0, in_use = 0x0, next = 0x88cda48, prev = 0x88df608}
(gdb) p field->table->map
$5 = 1
(gdb) p field->table->reginfo
$6 = {join_tab = 0x0, lock_type = TL_READ, not_exists_optimize = false, impossible_range = false}

How to repeat:
The query is: "SELECT items.id,items.name,items.initial_price,items.max_bid,items.nb_of_bids,items.end_date FROM items WHERE category=17 AND end_date>=NOW() LIMIT 0,20"
and it returns an empty set

mysql> desc SELECT items.id,items.name,items.initial_price,items.max_bid,items.nb_of_bids,items.end_date FROM items WHERE category=17 AND end_date>=NOW() LIMIT 0,20;
+----+-------------+-------+-------+------------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys    | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+------------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | items | range | category_id,eday | eday |       9 | NULL |    1 | Using where |
+----+-------------+-------+-------+------------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

Only happens when running with 2 or more threads at the same time!
[18 Jun 2003 4:51] Dennis Haney
It seems the problem is centered around prepared statements. When I changed my program to use plain mysql_query, mysql_fetch_row, and mysql_free_result it works fine.
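
In outline, the two client paths being compared look roughly like this (a minimal C sketch against the MySQL C API; the mysql_stmt_* names follow the later 4.1 client API and the helper functions are made up for illustration, this is not the actual runtrans.cc code):

#include <mysql.h>
#include <string.h>

/* Path A: plain text protocol -- this one did not crash the server. */
static void run_plain(MYSQL *con, const char *query)
{
  if (mysql_query(con, query) == 0)
  {
    MYSQL_RES *res= mysql_store_result(con);
    if (res)
    {
      while (mysql_fetch_row(res) != NULL)
        ;                                   /* just consume the rows */
      mysql_free_result(res);
    }
  }
}

/* Path B: binary protocol via a prepared statement -- running this from
   several client threads at once triggered the crash in add_key_field(). */
static void run_prepared(MYSQL *con, const char *query)
{
  MYSQL_STMT *stmt= mysql_stmt_init(con);
  if (stmt == NULL)
    return;
  if (mysql_stmt_prepare(stmt, query, (unsigned long) strlen(query)) == 0 &&
      mysql_stmt_execute(stmt) == 0)
  {
    /* Result binding is omitted for brevity; the point is only to exercise
       the execute path that appears in the server backtrace above. */
    while (mysql_stmt_fetch(stmt) == 0)
      ;
  }
  mysql_stmt_close(stmt);
}

Only path B goes through mysql_stmt_execute() on the server side (frame #9 in the backtrace), which is consistent with the crash disappearing when the client is switched to path A.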
[8 Jul 2003 0:42] Venu Anuganti
Hi !!

Could you please provide me the table schema to reproduce the bug ? Also, is it from the latest Bk tree or from 4.1.0 release version ?

Thanks
Venu
[8 Jul 2003 5:26] Dennis Haney
Latest bk of the 4.1 tree.
Crash happens with the rubis benchmark (which is what this crash dump is from), but also with a normal tpcw benchmark without think time.
It happens every time I try to run more than two threads using prepared statements, but not at the same statement (usually the same type, though), and usually within a few seconds of starting.
[19 Aug 2003 14:31] Venu Anuganti
will be fixed in the next release ...
[20 Feb 2004 8:07] Min Xu
I have encountered the same problem on Solaris with the latest 4.1.1 release. I wonder if this bug is fixed in the latest development tree?

From the behavior of the bug, it looks to me like a race condition. Since I am currently working on an automatic race detector, I tried applying my detector to the execution and subsequently found a few races in the buggy run. Do you want me to submit that information? I can post the races to this bug, file a new bug report for each race, or simply post them to the internals@ mailing list. Which method do you prefer? Just let me know.

I understand the reason the race has not been fixed for half a year is perhaps:

1. Nobody is using the prepared statement feature yet.
2. Perhaps it is also because Venu left MySQL and the bug was then left unnoticed.
3. Or, on the other hand, maybe this bug is harder to repeat compared to deterministic ones.
4. Or finally, maybe this bug is just harder to understand without a good tool to help find the race condition.

Which of the four do you think contributed most to the non-fix? Is the bug easy to find by just looking at the stack trace provided by Dennis?

Thank you very much!

-Min
[21 Feb 2004 7:24] Sergei Golubchik
It was forgotten mostly because of "latest bk" in the version field.
This bug didn't show up when we were trying to make sure that no bugs were left unfixed before doing a release.

But after your email to the mailing list (thank you!) I changed the "version" field to a more correct "4.1" and reassigned the bug to a developer who is fixing it.
[21 Feb 2004 13:38] Min Xu
Thanks very much for the reply. So it looks like it is still a bug. Should I post the potential races I found to this bug report, or somewhere else for people to look at?
[22 Feb 2004 6:45] Oleksandr Byelkin
Can you provide more information about the bug (schema, data)? I tested it, but
how can I be sure that I don't just have the wrong data/schema to repeat the bug?

Alternatively, you might be able to test the latest bk repository version (I have pushed a lot
of changes recently).
[22 Feb 2004 15:11] Min Xu
Hi Oleksandr Byelkin,

Thanks for your reply. This is going to be a rather long comment, since I will put
as much as I know about this bug into it. :-)

I am running 4.1.1 on Solaris. I haven't tried the latest BK since I am not
familiar with BitKeeper, but I will try when I have more time. In the meantime,
let me give you information on when the bug shows up on my system and what
my experimental tool says about the potential race conditions (they may or may
not be the causes of the bug). I hope this can save you time in finding the bug.

I use a script called "pupulate_db.sh" to create the DB and insert data before
my test. The script is based on Dennis's code. You can find the script here:
(I cannot upload the file to the bug database since I am not the submitter or
the developer, sorry.)

http://www.cs.wisc.edu/~files/ [1]

As you can see, the schema is very simple, and the table size is quite small
too. I actually found that the crash happens more often when the table size is
small.

The script should generate a query trace file called populate_db.txt (in the
same directory [1]). The trace contains simple transactions that only perform a
join in a select query.
join on a select query).

The client is written in C++, in a file called runtrans.cc (also from Dennis,
thanks Dennis!). I modified it to use prepared statements for the queries
instead of straight ones. When running the client with multiple threads
(I've tested 5 and 9 threads), the server crashed a few seconds after the client
started. Here is a sample error message I got from the error log:

===========================================================================
040212 11:27:00  mysqld started
040212 11:27:00  InnoDB: Started; log sequence number 0 43634
/scratch/mysql/libexec/mysqld: ready for connections.
Version: '4.1.1-alpha'  socket: '/tmp/mysql.sock'  port: 3306
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8388600
read_buffer_size=131072
max_used_connections=8
max_connections=100
threads_connected=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225791 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

-e \nNumber of processes running now: 0
040212 11:30:51  mysqld restarted
040212 11:30:52  InnoDB: Started; log sequence number 0 43634
/scratch/mysql/libexec/mysqld: ready for connections.

...

===========================================================================

BTW, the machine I am running this test on is a SUN SPARC box with 8 processors
and 32 GB main memory.

OK, that's all about how to reproduce the bug. It seems it is not hard to
reproduce, since the DB schema is not particularly uncommon. In fact, Dennis told me
that I used a different one from the one he used, so the bug doesn't seem to be
specific to the DB schema. It does only happen with prepared statements, though.

Now, if you wish, the following are potential race conditions found by my
experimental race detector during execution close to the crash of the server.
They may have caused the bug. There are multiple of them, and I am sure not all
of them are really harmful. They were found by an automatic tool, and my
knowledge of MySQL doesn't allow me to distinguish which are real bugs and which
are not.

First of all, let me say a few words about the format of the following report. My
detector reports two pieces of information:

  * Two statements executed from two threads that raced. One of the statements
    is a read from a shared variable, and the other is a write to the same
    variable.

  * A trace of statements (a program slice, if you wish) that was affected by
    the read statement from the first bullet. The trace of statements has the
    format read A -> write B -> read B -> write C -> etc. Between each
    read/write pair of statements there is some data dependency, i.e. A's
    value affects B's value. The trace finally ends with a write statement.
    The race happens when the write statement from the first bullet happens
    just before that final write statement but after the previous write in the
    trace.
    (Notice that these statements are not necessarily from within the same
     function. They may come from different functions, or from multiple calls
     of the same function.)

The tool is still experimental and it can make mistakes. Some races may seem
nonsensical, but keep in mind this may be due to context switches: the tool
thinks all accesses from the same _processor_ are from the same _thread_,
when in fact they can be from different threads.

But don't worry, I have done as much initial screening as I could. The obviously
false ones are not reported here. I will sort these races from the most likely
to the least likely.

============================================================================
queue_remove <--> queue_insert

At least two races on queue->elements and queue->root[*] were found in the two
functions above.

1. The racing statements are:

mysys/queues.c:202 queue->root[idx]=queue->root[queue->elements--];

and

mysys/queues.c:175 idx= ++queue->elements;

After "queue->elements" was loaded, decremented and stored on line 202, it was
used to compute half_queue on line
227 (half_queue=(elements=queue->elements) >> 1).
Finally the value of half_queue was saved by a kernel context switch. But
before the save finishes, the original "queue->elements" value was incremented
by another thread on line 175.

The context switch should happen before queue_remove returns (_downheap was
inlined in my code). The value of "half_queue" may or may not still be useful
when the context switch happens. If it is still useful, then this is likely a
race condition.

2. The racing statements are:

queues.c:202 queue->root[idx]=queue->root[queue->elements--];

and

queues.c:187 queue->root[idx]=element;

After the read of "queue->root[queue->elements--]" on line 202, the value was
passed to "element" in the inlined function _downheap on line 226
(element=queue->root[idx]). Then the value was passed to "queue->root[idx]"
on line 246 (queue->root[idx]=element). Then another invocation of queue_remove
passed the value to "queue->root[idx]" (idx may be different from the previous
idx) on line 202, and similarly the value was passed to "queue->root[idx]" at
line 246. But before the write on line 246 finished, the original value of
"queue->root[queue->elements--]" was changed by a different thread on line 187
(queue->root[idx]=element).

This may not be harmful. I don't know if the value of
"queue->root[queue->elements--]" should hold during the two calls of
queue_remove.

In summary, the race was:

"line 202 read" --> "line 226" --> "line 246 write" --> "line 202 read" -->
"line 226" --> "line 246 write"

Another race is quite similar:
"line 202 read" --> "line 202 write" --> "line 202 read" --> "line 226" -->
"line 246 write"

===========================================================================
queue_remove <--> queue_remove

One race on queue->root[] was found. The racing statements are:

201     byte *element=queue->root[++idx]; /* Intern index starts from 1 */

and

246   queue->root[idx]=element;

The value of "queue->root[++idx]" was first copied into a register used for
element. Then a context switch happens (this should happen before the function
returns, otherwise, the register won't be saved since sparc has windowed
register and function returns pop a register window). During the context switch
the register was saved, but before the save finishes, the original value of
"queue->root[++idx]" was updated by a different thread on line 246.

===========================================================================
hash_search <--> my_hash_insert

The race condition is on the "pos->data" variable. The two racing statements are:

mysys/hash.c:157 DBUG_RETURN (pos->data);

and

hash.c:338       pos->data=(byte*) record;

After pos->data is read, it is used in open_table (line 830 in sql_base.cc):
  "for (table=(TABLE*) hash_search(&open_cache,(byte*) key,key_length)"
Then, in the same function, the value of "table" is passed to thd->open_tables
on line 897. Later on, in function close_thread_table on line 424, the value is
passed to "table"; finally "table" is saved in "unused_table", "table->next",
and "table->prev" in the same function on line 455. But before the value is
saved, another thread modifies pos->data on line 338 of hash.c in function
my_hash_insert().

This may or may not be a bug. It depends on whether the value in the hash table
has to hold for the duration of open_table and close_thread_table.

Similarly, in hash_next, on line 183 of hash.c, the variable "pos->next" is
raced on between two threads. The dependency again goes into both open_table and
close_thread_table, and the overwriting statement is again on line 339 of
hash.c in my_hash_insert.
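
Stripped of the MySQL specifics, the pattern is a lookup whose result keeps being used while another thread may rewrite the same shared structure. A self-contained, hypothetical illustration (not the real hash.c or open_cache code):

#include <string.h>

/* A toy fixed-size "hash": an array of (key, data) slots shared by all threads. */
typedef struct { char key[16]; void *data; } slot_sketch_t;
static slot_sketch_t cache_sketch[64];
static unsigned int  cache_used;

/* Stand-in for hash_search(): returns a pointer read from shared state. */
void *sketch_search(const char *key)
{
  unsigned int i;
  for (i= 0; i < cache_used; i++)
    if (strcmp(cache_sketch[i].key, key) == 0)
      return cache_sketch[i].data;          /* like pos->data at hash.c:157 */
  return NULL;
}

/* Stand-in for my_hash_insert(): rewrites the shared slots. */
void sketch_insert_entry(const char *key, void *data)
{
  strncpy(cache_sketch[cache_used].key, key, sizeof(cache_sketch[0].key) - 1);
  cache_sketch[cache_used].data= data;      /* like pos->data= record, hash.c:338 */
  cache_used++;                             /* a real hash may also grow and relink */
}

If one thread keeps using the pointer returned by sketch_search() across a long open_table()/close_thread_table() window while another thread runs sketch_insert_entry() on the same structure, and no mutex is held by both, the saved pointer and the traversal state can both be invalidated. Whether that is a real bug in the server comes down to which mutex the callers are required to hold around these low-level calls, which is exactly the open question in this report.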

============================================================================
mi_open <--> mi_open

The two racing statements are:

myisam/mi_open.c:57  for (pos=myisam_open_list ; pos ; pos=pos->next)

and

myisam/mi_open.c:583 myisam_open_list=list_add(myisam_open_list,&m_info->open_list)

"myisam_open_list" is used to compute pos then the data dependency is
eventually propagated to

580       thr_lock_data_init(&share->lock,&m_info->lock,(void*) m_info);

by "share->lock" varible. Finally the dependency is propagated into

thr_lock.c:661       pthread_mutex_lock(&lock->mutex);

However, in the mean time another thread executes mi_open() and changed
myisam_open_list on line 583.

This may not be a bug. It is perhaps the case once myisam_open_list is used to
compute some pointer in the for loop, it can be freely changed.
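
In outline (again a hypothetical sketch, not the real MI_INFO/LIST code), the two sides are a plain list traversal and an unsynchronized push onto the same list:

/* Shared, file-scope list of open MyISAM handlers, reduced to a bare list. */
struct open_sketch { struct open_sketch *next; /* MI_INFO, share, locks ... */ };
static struct open_sketch *open_list_sketch;

/* Reader side, as in mi_open.c:57: walk the list looking for an already-open
   copy of the table whose share (and share->lock) can be reused. */
struct open_sketch *sketch_find_open(void)
{
  struct open_sketch *pos;
  for (pos= open_list_sketch; pos; pos= pos->next)
  {
    /* compare file names, reuse pos's share if it matches ... */
  }
  return pos;
}

/* Writer side, as in mi_open.c:583: link a newly opened handler in. */
void sketch_add_open(struct open_sketch *e)
{
  e->next= open_list_sketch;
  open_list_sketch= e;    /* races with the traversal above if no common mutex */
}

Whether this is a false positive depends, as noted, on whether both paths are serialized by a common mutex at a higher level; if they are, the traversal can never observe a half-linked entry.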

============================================================================
my_hash_insert <--> my_hash_insert

This is also similar to the race in hash_search and my_hash_insert. The two
racing statements are:

hash.c:324 idx=hash_mask(rec_hashnr(info,record),info->blength,info->records+1);

and

hash.c:349    info->blength+= info->blength;

"info->blength" was used to compute idx. The value of idx is then passed to
empty[0] on line 334 (empty[0]=pos[0];). Then the value is used in hash_next on
line 183 (hash->current_record= idx;). But before the value of
"hash->current_record" is saved, another thread changed "info->blength" in
my_hash_insert on line 349.

Again, this might be false alarm since info->blength does have to be fixed by
the time hash_next is called.

============================================================================
_mi_read_rnd_static_record <--> mi_lock_database

The two racing statements are:

myisam/mi_statrec.c:241 if ((! cache_read || share->base.reclength > cache_length) && share->tot_locks == 0)

and

myisam/mi_locking.c:160       share->tot_locks++;

The race is on the variable "share->tot_locks". Once its value was read on
line 241, a context switch may occur. Even though the variable only controls
the if statement, the value can remain in a register until the context switch
happens. I don't know whether the switch happens before or after the if statement
finishes. But during the register save of the context switch, another thread
came and changed "share->tot_locks" on line 160 of mi_locking.c. Anyway, this
might be a false alarm. But if the store of "share->tot_locks" happens before
the if statement finishes, the race may well be harmful.
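
The essence of this report is a plain counter read racing with a non-atomic increment; a minimal sketch (hypothetical names, not the actual MyISAM code):

static unsigned int tot_locks_sketch;      /* stands in for share->tot_locks */

/* mi_statrec.c:241 side: decide whether the cached read path may be used. */
int sketch_may_use_cache(void)
{
  return tot_locks_sketch == 0;            /* value can change right after the test */
}

/* mi_locking.c:160 side: register another lock on the shared table. */
void sketch_register_lock(void)
{
  tot_locks_sketch++;                      /* load, add, store: not atomic */
}

Whether the race matters then depends on whether acting on a stale zero can do any harm before the decision is re-validated under a lock, which is the same caveat made above.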

============================================================================
The following are definitely not bugs, but I thought it might be useful to
report how often they happen.

Basically, by default MySQL is compiled without the SAFE_STATISTIC flag being set.
Therefore, I have seen races on the statistics variables in various functions.
The following are their relative frequencies:

times  line              statement
1  sql/net_serv.cc:556   statistic_add(bytes_sent,length,&LOCK_bytes_sent);
3  sql/sql_parse.cc:1800 statistic_increment(com_stat[lex->sql_command],&LOCK_status);
5  sql/ha_myisam.cc:979  statistic_increment(ha_read_next_count,&LOCK_status);
22 sql/net_serv.cc:768   statistic_add(bytes_received,(ulong) length,&LOCK_bytes_received);
24 sql/ha_myisam.cc:1029 statistic_increment(ha_read_rnd_next_count,&LOCK_status);

-Min
[22 Feb 2004 15:13] Min Xu
Sorry, the url was:

http://www.cs.wisc.edu/~xu/files/

not the incorrect one in my last comment.
[23 Feb 2004 10:18] Sergei Golubchik
Min, could you do the following, please:

1. Move this race-condition report to a separate bug entry (basically copy-paste)?
   I could do it myself, but it would be nice to have you as the original author,
   so that you get email notifications, are able to attach files, etc.

2. Could you extend your tool to include a backtrace for each race condition?
   Many races you report happen in low-level functions that are not expected to be
   thread-safe; instead, the caller should take care of the necessary locking. The
   code to get a stack trace can be found in sql/stacktrace.[ch].
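
A minimal sketch of the pattern described in point 2, where the container itself carries no lock and the caller serializes access (LOCK_open is the real name of the mutex guarding the table cache; the lookup helper here is a made-up stand-in):

#include <pthread.h>

extern pthread_mutex_t LOCK_open;           /* held by callers around open_cache */

void *sketch_hash_search(const char *key, unsigned int length);  /* not thread-safe */

void *sketch_cached_table_lookup(const char *key, unsigned int length)
{
  void *table;
  pthread_mutex_lock(&LOCK_open);           /* caller-level locking */
  table= sketch_hash_search(key, length);
  /* ... bump usage counters / relink the entry while still holding the mutex ... */
  pthread_mutex_unlock(&LOCK_open);
  return table;
}

A detector that only sees the loads and stores inside the low-level function will flag them unless it also knows which mutexes each thread holds at that moment; a backtrace (or lockset information) per report is what makes it possible to separate these from genuine races.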

As for statistic_add/statistic_increment you were right - it's not a bug but an
intentional optimization. These macros are used to increment statistical values, e.g. the number of hits/misses in the key cache. So it is allowed for these variables to be slightly "off" the actual value because of the non-thread-safe increment. As the comment says:

        /* statistics_xxx functions are for not essential statistic */

Still, if atomic_add() is available it is used. If not, we gain some concurrency by losing a few updates to these variables.
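
The trade-off reads roughly like this (a sketch of the idea only; the real macros are defined in the server headers and differ in detail, and the _sketch name is made up):

#include <pthread.h>

/* Conceptual variants of statistic_add().  When an atomic_add() primitive is
   available, the real code uses that instead of either of these. */
#ifdef SAFE_STATISTIC
#define statistic_add_sketch(V, C, L)  \
  do { pthread_mutex_lock(L); (V)+= (C); pthread_mutex_unlock(L); } while (0)
#else
#define statistic_add_sketch(V, C, L)  ((V)+= (C))   /* fast, but may lose an update */
#endif

With the unlocked variant, two threads can both load the same old value of V, add to it, and store it back, so one increment is lost. That is exactly the "slightly off" behaviour described above, and also why the frequencies reported for these variables are expected rather than alarming.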
[23 Feb 2004 10:55] Min Xu
Sergei, thanks for the quick reply.

I have opened another bug for these potential race conditions. It is bug #2926.
(As you suggested it was just copy & paste :-)

Collecting a stack trace is always on my todo list. I am painfully aware
that it might not be easy for me to do soon, since my tool works at the binary
level. My tool runs inside a simulator where everything available is a binary
instruction stream. But I will definitely look at the files you pointed out
and try to figure out how to add this feature.

It was also my impression that these functions are not intended to be thread-safe.
This was quite an uncommon multithreading practice to me. I previously thought the
most common way to write multithreaded programs is to write thread-safe
functions and use those functions without worrying about races. One thing
I thought might be true in MySQL is that the developers sometimes keep a data
structure (queue, hash, etc.) local to a thread, so the data structure is only
accessible through the current thread pointer. In these cases there is really no
need for any locking on the data structure. Is this what's happening in MySQL
as an optimization?

About the statistics updates: I should mention that the frequencies I reported
count actually harmful races on each of them. It means that each time such a
race happens, at least one thread's update to the stats is lost.

-Min
[25 Feb 2004 6:31] Konstantin Osipov
Could you provide us with a repeatable test case for this bug, which contains:
- table definitions
- data to populate the tables
- the query that makes mysqld crash.
We have been fixing this piece of code recently, but can't check whether the bug is still present.
Thank you!
[25 Feb 2004 7:02] Konstantin Osipov
Sorry, behind the long stack trace I hadn't noticed your URL.
Please ignore the previous note.
[26 Feb 2004 6:46] Oleksandr Byelkin
Thank you for the bug report.

I have run your program 3 times (5, 9, and 5 threads).
It looks like it is fixed already.
[26 Feb 2004 9:23] Min Xu
Can you tell me what version of the code you are using? Thanks.
[27 Feb 2004 0:28] Oleksandr Byelkin
If you mean the MySQL server, then it was the latest bk tree sources of version 4.1.
[27 Feb 2004 12:19] Min Xu
Latest bk doesn't compile:

Making all in libmysql
make[2]: Entering directory `/scratch/mysql/src/mysql-4.1/libmysql'
source='libmysql.c' object='libmysql.lo' libtool=yes \
depfile='.deps/libmysql.Plo' tmpdepfile='.deps/libmysql.TPlo' \
depmode=gcc3 /bin/bash ../depcomp \
--preserve-dup-deps --mode=compile gcc -DDEFAULT_CHARSET_HOME="\"/scratch/mysql\"" -DDATADIR="\"/scratch/mysql/var\"" -DSHAREDIR="\"/scratch/mysql/share/mysql\"" -DUNDEF_THREADS_HACK -DDONT_USE_RAID  -I. -I. -I.. -I../include     -O3 -DDBUG_OFF   -D_FILE_OFFSET_BITS=64 -DHAVE_CURSES_H -I/scratch/mysql/src/mysql-4.1/include -DHAVE_RWLOCK_T -c -o libmysql.lo `test -f libmysql.c || echo './'`libmysql.c
../depcomp: --preserve-dup-deps: command not found
make[2]: *** [libmysql.lo] Error 127
make[2]: Leaving directory `/scratch/mysql/src/mysql-4.1/libmysql'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/scratch/mysql/src/mysql-4.1'
make: *** [all] Error 2
[11 Mar 2004 8:34] Min Xu
Sorry for my last message. I have found an installation error in my automake/autoconf setup.

I tested this bug on the latest source from BK and the crash didn't happen. I think
some changes after the 4.1.1 release must have fixed the bug.

Please feel free to close the bug. I really appreciate your time working with me.
[11 Mar 2004 8:59] Oleksandr Byelkin
Thank you for the bug report, too.

IMHO the state "can't repeat" is good enough; it means that the bug was fixed as a
side effect of some other fix, but we can't point to that fix exactly.
[11 Mar 2004 9:32] Min Xu
Thanks!

BTW, do you know of any other likely race conditions, fixed or not, that I could try to help track down? I want to test the effectiveness of my detector.
[11 Mar 2004 9:52] Konstantin Osipov
We are currently working on stabilization of the prepared statements code.
Many bugs haven't been fixed yet.
That's why, right now, there is a good chance that you would spot an already-reported bug with your
detector.
I would delay your activity until bugs #2274, #2401, #2473, #2654, #2794, #2812, and several others are fixed.
[11 Mar 2004 10:19] Min Xu
Great! Thanks for all the information.

Please remember I will be more than happy to help in hunting down potential bugs. Should you have a bug whose root cause you can't find, I can try to apply my detectors, provided there is a repeatable execution. I currently have two detectors:

The first one tells you the pairs of conflicting accesses for which no synchronization is observed in between. (Conflicting means at least one
access of the pair is a write, and the two accesses are from two threads,
of course.)

The second detector tells you, when a write happens, whether one of the input loads
that may affect the store (address & value) has been modified by other
threads. It also tells you how the load affects the store by giving you
a trace of instructions from the load to the store.

Just let me know.