Bug #13025 | Server crash during filesort | ||
---|---|---|---|
Submitted: | 6 Sep 2005 23:04 | Modified: | 12 Sep 2005 12:50 |
Reporter: | Andrey Hristov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S1 (Critical) |
Version: | 5.0.13-20050828000000 | OS: | Linux (Linux SLES9 64bit, SuSE9.3 32bit) |
Assigned to: | Michael Widenius | CPU Architecture: | Any |
[6 Sep 2005 23:04]
Andrey Hristov
[6 Sep 2005 23:06]
Andrey Hristov
Test case
Attachment: bug13025.c (text/x-csrc), 1.16 KiB.
[7 Sep 2005 7:44]
Andrey Hristov
the file is attached as private. strange, I thought the devs can see it.
[7 Sep 2005 10:17]
Heikki Tuuri
Jan, Andrei, assigning this to Jan Lindström. In which (prepared) query does it crash? The tables do contain TINYBLOB columns in the primary key. Maybe it reads the BLOB from a wrong position in the MySQL row buffer, and therefore blob_data is garbage? Regards, Heikki ha_innobase::store_key_val_for_row( /*===============================*/ /* out: key value length as stored in buff */ uint keynr, /* in: key number */ char* buff, /* in/out: buffer for the key value (in MySQL format) */ uint buff_len,/* in: buffer length */ const mysql_byte* record)/* in: row in MySQL format */ { KEY* key_info = table->key_info + keynr; KEY_PART_INFO* key_part = key_info->key_part; KEY_PART_INFO* end = key_part + key_info->key_parts; char* buff_start = buff; enum_field_types mysql_type; Field* field; ulint blob_len; byte* blob_data; ibool is_null; DBUG_ENTER("store_key_val_for_row"); /* The format for storing a key field in MySQL is the following: 1. If the column can be NULL, then in the first byte we put 1 if the field value is NULL, 0 otherwise. 2. If the column is of a BLOB type (it must be a column prefix field in this case), then we put the length of the data in the field to the next 2 bytes, in the little-endian format. If the field is SQL NULL, then these 2 bytes are set to 0. Note that the length of data in the field is <= column prefix length. 3. In a column prefix field, prefix_len next bytes are reserved for data. In a normal field the max field length next bytes are reserved for data. For a VARCHAR(n) the max field length is n. If the stored value is the SQL NULL then these data bytes are set to 0. 4. We always use a 2 byte length for a true >= 5.0.3 VARCHAR. Note that in the MySQL row format, the length is stored in 1 or 2 bytes, depending on the maximum allowed length. But in the MySQL key value format, the length always takes 2 bytes. We have to zero-fill the buffer so that MySQL is able to use a simple memcmp to compare two key values to determine if they are equal. MySQL does this to compare contents of two 'ref' values. */ bzero(buff, buff_len); for (; key_part != end; key_part++) { is_null = FALSE; if (key_part->null_bit) { if (record[key_part->null_offset] & key_part->null_bit) { *buff = 1; is_null = TRUE; } else { *buff = 0; } buff++; } field = key_part->field; mysql_type = field->type(); if (mysql_type == MYSQL_TYPE_VARCHAR) { /* >= 5.0.3 true VARCHAR */ ulint lenlen; ulint len; byte* data; if (is_null) { buff += key_part->length + 2; continue; } ... } else if (mysql_type == FIELD_TYPE_TINY_BLOB || mysql_type == FIELD_TYPE_MEDIUM_BLOB || mysql_type == FIELD_TYPE_BLOB || mysql_type == FIELD_TYPE_LONG_BLOB) { ut_a(key_part->key_part_flag & HA_PART_KEY_SEG); if (is_null) { buff += key_part->length + 2; continue; } blob_data = row_mysql_read_blob_ref(&blob_len, (byte*) (record + (ulint)get_field_offset(table, field)), (ulint) field->pack_length()); ut_a(get_field_offset(table, field) == key_part->offset); /* All indexes on BLOB and TEXT are column prefix indexes, and we may need to truncate the data to be
[7 Sep 2005 10:31]
Heikki Tuuri
Hi! Looking at the queries, all do the ORDER BY on a (prefix) of the primary key of the table. Thus, there should not be any filesort at all if the query is optimized in the normal way. And the only table that does contain rows is TCP01, and there is no BLOB column in the primary key there. Maybe the data dictionary of MySQL becomes corrupt with prepared statements, and both the query optimization and the accessing of the columns in the 'record' buffer go astray? This is probably not an InnoDB bug. Regards, Heikki
[7 Sep 2005 10:36]
Heikki Tuuri
Andrey, ok it is now crashing for you in the second query. "SELECT * FROM ACEPSOH ORDER BY MANDT, COMP, BUKRS, OBJID", What does EXPLAIN SELECT say? There should be no filesort. And in the dump there are no rows that are inserted into the table ACEPSOH. Thus, no rows should be stored at all! Regards, Heikki
[7 Sep 2005 10:39]
Heikki Tuuri
Andrey, Jan, if the behavior is as random as you describe, then this probably is memory corruption. It might be an InnoDB bug after all, since InnoDB might make do a massive buffer overrun in some operation. We should find out at which stage do the MySQL data dictionary objects (field array etc.) become corrupt. Jan, can you repeat it in your own computer? Regards, Heikki
[7 Sep 2005 10:45]
Heikki Tuuri
Andrey, in the EXPLAIN SELECT it says '1' row. This is because to fix a bug, I changed the InnoDB estimator about 3 weeks ago so that it always estimates a table to be non-empty. I had to do that because the MySQL optimizer in a left join trusted the estimator too much and thought that it KNOWS the table to be empty. The optimizer can only know that fact in MyISAM where a table lock is set on the table. It cannot know it for InnoDB. But the estimate '1' row cannot explain the strange query plan (unnecesssary filesort) nor the crash. Regards, Heikki
[7 Sep 2005 10:57]
Jan Lindström
Yes, I can repeat using my laptop: Version: '5.0.13-beta-debug-log' socket: '/home/jplindst/bugsocket' port: 5120 Source distribution mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=8388572 read_buffer_size=131072 max_used_connections=2 max_connections=100 threads_connected=1 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225788 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=0x8c7f520 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0xac69bd4c, backtrace may not be correct. Stack range sanity check OK, backtrace follows: 0x8198232 0xffffe420 0x8276e8a 0x8276e8a 0x82787f9 0x8206fec 0x82130ae 0x8213638 0x82138f8 0x81b4176 0x81bc025 0x81bcab5 0x81bdfcd 0x81be3d0 0xb7f37ccd 0xb7e61b0e New value of fp=(nil) failed sanity check, terminating stack trace! Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved stack trace is much more helpful in diagnosing the problem, so please do resolve it Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x8ca6688 = SELECT * FROM ACEPSOH ORDER BY MANDT, COMP, BUKRS, OBJID thd->thread_id=1 The manual page at http://www.mysql.com/doc/en/Crashing.html contains information that should help you find out what is causing the crash.
[7 Sep 2005 11:03]
Andrey Hristov
up in the stack table->record[0] is full of 0xA5 which seems very suspicious.
[7 Sep 2005 11:34]
Jan Lindström
Here is full stack trace using my laptop and gdb: Version: '5.0.13-beta-debug' socket: '/tmp/mysql.sock' port: 3306 Source distribution [New Thread -1255822416 (LWP 7376)] Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1255822416 (LWP 7376)] 0x08289524 in ha_innobase::store_key_val_for_row (this=0x8caaf78, keynr=0, buff=0x8cab4bb "", buff_len=281, record=0x8cab0b8 '' <repeats 64 times>) at ha_innodb.cc:2935 2935 memcpy(buff, blob_data, blob_len); (gdb) bt #0 0x08289524 in ha_innobase::store_key_val_for_row (this=0x8caaf78, keynr=0, buff=0x8cab4bb "", buff_len=281, record=0x8cab0b8 '' <repeats 64 times>) at ha_innodb.cc:2935 #1 0x0828b0fc in ha_innobase::position (this=0x8caaf78, record=0x8cab0b8 '' <repeats 64 times>) at ha_innodb.cc:4341 #2 0x08276fc2 in find_all_keys (param=0xb5259434, select=0x8cb0028, sort_keys=0xb5047020, buffpek_pointers=0xb52592ac, tempfile=0xb5259370, indexfile=0x0) at filesort.cc:470 #3 0x08278931 in filesort (thd=0x8c870f8, table=0x8caa810, sortorder=0x8cb0130, s_length=4, select=0x8cb0028, max_rows=18446744073709551615, examined_rows=0xb5259560) at filesort.cc:224 #4 0x08207126 in create_sort_index (thd=0x8c870f8, join=0x8cae9e0, order=0x8cae680, filesort_limit=18446744073709551615, select_limit=18446744073709551615) at sql_select.cc:11271 #5 0x082131e8 in JOIN::exec (this=0x8cae9e0) at sql_select.cc:1629 #6 0x08213772 in mysql_select (thd=0x8c870f8, rref_pointer_array=0x8c87454, tables=0x8cae450, wild_num=1, fields=@0x8c873bc, conds=0x0, og_num=4, order=0x8cae680, group=0x0, having=0x0, proc_param=0x0, select_options=2156153344, result=0x8cae9d0, unit=0x8c87148, select_lex=0x8c8733c) at sql_select.cc:2093 #7 0x08213a32 in handle_select (thd=0x8c870f8, lex=0x8c87138, result=0x8cae9d0, setup_tables_done_option=0) at sql_select.cc:250 #8 0x081b4234 in mysql_execute_command (thd=0x8c870f8) at sql_parse.cc:2464 ---Type <return> to continue, or q <return> to quit--- #9 0x081bc0e3 in mysql_parse (thd=0x8c870f8, inBuf=0x8cae328 "SELECT * FROM ACEPSOH ORDER BY MANDT, COMP, BUKRS, OBJID", length=56) at sql_parse.cc:5395 #10 0x081bcb73 in dispatch_command (command=COM_QUERY, thd=0x8c870f8, packet=0xb5201021 "SELECT * FROM ACEPSOH ORDER BY MANDT, COMP, BUKRS, OBJID", packet_length=57) at sql_parse.cc:1673 #11 0x081be08b in do_command (thd=0x8c870f8) at sql_parse.cc:1468 #12 0x081be48e in handle_one_connection (arg=0x8c870f8) at sql_parse.cc:1120 #13 0xb7f4bccd in start_thread () from /lib/tls/libpthread.so.0 #14 0xb7e75b0e in clone () from /lib/tls/libc.so.6 (gdb) p record $2 = (const mysql_byte *) 0x8cab0b8 '' <repeats 64 times> (gdb) x record 0x8cab0b8: 0xa5a5a5a5 Regards, Jan
[7 Sep 2005 18:09]
Heikki Tuuri
Jan, if the table has no rows, how does the execution ever reach that point? InnoDB should return HA_END_OF_FILE or similar to MySQL. Step inside gdb and find out. Regards, Heikki
[8 Sep 2005 11:50]
Jan Lindström
Correct, InnoDB does return error code: Version: '5.0.13-beta-debug' socket: '/tmp/mysql.sock' port: 3306 Source distribution [New Thread -1255715920 (LWP 11273)] [Switching to Thread -1255715920 (LWP 11273)] Breakpoint 1, row_search_for_mysql (buf=0x8c88e28 "\003000\001 ", mode=1, prebuilt=0xb7c62c68, match_mode=0, direction=0) at row0sel.c:3059 3059 dict_index_t* index = prebuilt->index; Current language: auto; currently c (gdb) next 3060 ibool comp = index->table->comp; (gdb) 3061 dtuple_t* search_tuple = prebuilt->search_tuple; (gdb) 3062 btr_pcur_t* pcur = prebuilt->pcur; (gdb) 3063 trx_t* trx = prebuilt->trx; (gdb) 3070 ulint err = DB_SUCCESS; (gdb) 3071 ibool unique_search = FALSE; (gdb) 3072 ibool unique_search_from_clust_index = FALSE; (gdb) 3073 ibool mtr_has_extra_clust_latch = FALSE; (gdb) 3074 ibool moves_up = FALSE; (gdb) 3075 ibool set_also_gap_locks = TRUE; (gdb) 3086 mem_heap_t* heap = NULL; (gdb) 3088 ulint* offsets = offsets_; (gdb) 3090 *offsets_ = (sizeof offsets_) / sizeof *offsets_; (gdb) 3095 if (UNIV_UNLIKELY(prebuilt->table->ibd_file_missing)) { (gdb) 3110 if (UNIV_UNLIKELY(prebuilt->magic_n != ROW_PREBUILT_ALLOCATED)) { (gdb) 3123 if (trx->n_mysql_tables_in_use == 0 (gdb) 3152 if (UNIV_UNLIKELY(btr_search_latch.writer != RW_LOCK_NOT_LOCKED)(gdb) 3170 if (srv_locks_unsafe_for_binlog (gdb) 3179 if (UNIV_UNLIKELY(direction == 0)) { (gdb) 3180 trx->op_info = "starting index read"; (gdb) 3182 prebuilt->n_rows_fetched = 0; (gdb) 3183 prebuilt->n_fetch_cached = 0; (gdb) 3184 prebuilt->fetch_cache_first = 0; (gdb) 3186 if (prebuilt->sel_graph == NULL) { (gdb) 3188 row_prebuild_sel_graph(prebuilt); (gdb) 3250 if (match_mode == ROW_SEL_EXACT (gdb) 3278 mtr_start(&mtr); (gdb) 3289 if (UNIV_UNLIKELY(direction == 0) (gdb) 3390 if (trx->has_search_latch) { (gdb) 3395 trx_start_if_not_started(trx); (gdb) 3397 if (trx->isolation_level <= TRX_ISO_READ_COMMITTED (gdb) 3419 if (UNIV_UNLIKELY(direction == 0)) { (gdb) 3420 if (mode == PAGE_CUR_GE || mode == PAGE_CUR_G) { (gdb) 3421 moves_up = TRUE; (gdb) 3427 thr = que_fork_get_first_thr(prebuilt->sel_graph); (gdb) 3429 que_thr_move_to_run_state_for_mysql(thr, trx); (gdb) 3431 clust_index = dict_table_get_first_index(index->table); (gdb) 3433 if (UNIV_LIKELY(direction != 0)) { (gdb) 3440 } else if (dtuple_get_n_fields(search_tuple) > 0) { (gdb) 3448 if (mode == PAGE_CUR_G) { (gdb) 3449 btr_pcur_open_at_index_side(TRUE, index, (gdb) 3457 if (!prebuilt->sql_stat_start) { (gdb) 3470 } else if (prebuilt->select_lock_type == LOCK_NONE) { (gdb) 3474 trx_assign_read_view(trx); (gdb) 3475 prebuilt->sql_stat_start = FALSE; (gdb) 3496 rec = btr_pcur_get_rec(pcur); (gdb) 3508 if (page_rec_is_infimum(rec)) { (gdb) 3996 if (UNIV_UNLIKELY(mtr_has_extra_clust_latch)) { (gdb) 4019 if (moves_up) { (gdb) 4020 if (UNIV_UNLIKELY(!btr_pcur_move_to_next(pcur, &mtr))) {(gdb) 3496 rec = btr_pcur_get_rec(pcur); (gdb) 3508 if (page_rec_is_infimum(rec)) { (gdb) 3517 if (page_rec_is_supremum(rec)) { (gdb) 3519 if (set_also_gap_locks (gdb) 3996 if (UNIV_UNLIKELY(mtr_has_extra_clust_latch)) { (gdb) 4019 if (moves_up) { (gdb) 4020 if (UNIV_UNLIKELY(!btr_pcur_move_to_next(pcur, &mtr))) {(gdb) 4022 btr_pcur_store_position(pcur, &mtr); (gdb) 4024 if (match_mode != 0) { (gdb) 4027 err = DB_END_OF_INDEX; (gdb) 4030 goto normal_return; (gdb) 4105 que_thr_stop_for_mysql_no_error(thr, trx); (gdb) 4107 mtr_commit(&mtr); (gdb) 4109 if (prebuilt->n_fetch_cached > 0) { (gdb) 4120 if (err == DB_SUCCESS) { (gdb) 4125 trx->op_info = ""; (gdb) 4126 if (UNIV_LIKELY_NULL(heap)) { (gdb) 4129 return(err); (gdb) 4130 } (gdb) ha_innobase::index_read (this=0x8c88ce8, buf=0x8c88e28 "\003000\001 ", key_ptr=0x0, key_len=0, find_flag=HA_READ_AFTER_KEY) at ha_innodb.cc:3905 3905 innodb_srv_conc_exit_innodb(prebuilt->trx); Current language: auto; currently c++ (gdb) 3907 if (ret == DB_SUCCESS) { (gdb) 3911 } else if (ret == DB_RECORD_NOT_FOUND) { (gdb) 3915 } else if (ret == DB_END_OF_INDEX) { (gdb) 3916 error = HA_ERR_KEY_NOT_FOUND; (gdb) 3917 table->status = STATUS_NOT_FOUND; (gdb) 3923 DBUG_RETURN(error); (gdb) 3924 } (gdb) ha_innobase::index_first (this=0x8c88ce8, buf=0x8c88e28 "\003000\001 ") at ha_innodb.cc:4151 4151 if (error == HA_ERR_KEY_NOT_FOUND) { (gdb) 4152 error = HA_ERR_END_OF_FILE; (gdb) 4155 DBUG_RETURN(error); (gdb) 4156 } (gdb) ha_innobase::rnd_next (this=0x8c88ce8, buf=0x8c88e28 "\003000\001 ") at ha_innodb.cc:4242 4242 if (error == HA_ERR_KEY_NOT_FOUND) { (gdb) 4245 start_of_scan = 0; (gdb) 4250 DBUG_RETURN(error); (gdb) p error $1 = 137 (gdb) next 4251 } (gdb) find_all_keys (param=0xb5273434, select=0x8cb0028, sort_keys=0xb508b020, buffpek_pointers=0xb52732ac, tempfile=0xb5273370, indexfile=0x0) at filesort.cc:464 464 if (!flag) (gdb) 470 file->position(sort_form->record[0]); (gdb) p flag $2 = 1 Regards, Jan
[8 Sep 2005 13:55]
Andrey Hristov
Okay, the bug is not in innodb but MySQL does not check the error code immediately but tries to read. The following patch seems to fix the problem. filesort.cc has the same code in 4.1 ===== sql/filesort.cc 1.103 vs edited ===== --- 1.103/sql/filesort.cc 2005-06-21 18:13:36 +02:00 +++ edited/sql/filesort.cc 2005-09-08 15:12:22 +02:00 @@ -461,13 +461,16 @@ else { error=file->rnd_next(sort_form->record[0]); - if (!flag) - { - my_store_ptr(ref_pos,ref_length,record); // Position to row - record+= sort_form->s->db_record_offset; - } - else - file->position(sort_form->record[0]); + if (!error) + { + if (!flag) + { + my_store_ptr(ref_pos,ref_length,record); // Position to row + record+= sort_form->s->db_record_offset; + } + else + file->position(sort_form->record[0]); + } } if (error && error != HA_ERR_RECORD_DELETED) break;
[8 Sep 2005 16:26]
Heikki Tuuri
But why does MySQL use filesort in the first place? Regards, Heikki
[8 Sep 2005 17:35]
Andrey Hristov
Changing this as optimizer bug, though the problem of not checking the error code, fixed in the attached patch, should be fixed too.
[9 Sep 2005 1:01]
Michael Widenius
I think that Andrey's patch for filesort is correct and will add it to 4.1 tomorrow (with a test case). The reason MySQL doesn't use filesort for the query: SELECT * FROM ACEPSOH ORDER BY MANDT, COMP, BUKRS, OBJID Is that OBJID is not fully indexed (only the first 255 characters is part of the index) and can thus not be used to resolve the ORDER BY.
[12 Sep 2005 12:50]
Michael Widenius
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: Fix will be in 4.1.15 & 5.0.13 For docs team: The core dump only happens with InnoDB if one has a BLOB as part of the primary key.