Bug #56540 | Exception (crash) in sql_show.cc during rqg_info_schema test on Windows | ||
---|---|---|---|
Submitted: | 3 Sep 2010 14:27 | Modified: | 20 Nov 2010 17:49 |
Reporter: | John Embretsen | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Locking | Severity: | S1 (Critical) |
Version: | 5.5.6-m3-Celosia | OS: | Windows |
Assigned to: | Konstantin Osipov | CPU Architecture: | Any |
Tags: | rqg_pb2 |
[3 Sep 2010 14:27]
John Embretsen
[13 Sep 2010 22:49]
Christopher Powers
The failure is caused by a debug assertion raised in ha_partition::index_read_idx_map(): ========== int ha_partition::index_read_idx_map(...) { [...] get_partition_set(table, buf, index, &m_start_key, &m_part_spec); /* How can it be more than one partition with the current use? */ DBUG_ASSERT(m_part_spec.start_part == m_part_spec.end_part); for (part= m_part_spec.start_part; part <= m_part_spec.end_part; part++) { if (bitmap_is_set(&(m_part_info->used_partitions), part)) ========== The key is assumed to reside within a range of partitions defined by m_part_spec.start_part and end_part. get_partition_set() looks for the key within this range of partitions. If the key cannot be found, it sets start_part = end_part + 1. In this case, start_part == end_part == 0, but get_partitions_set() does not find the key, so it sets start_part = 1 which triggers the fatal assertion. Is this a valid assertion?
[14 Sep 2010 8:18]
Mattias Jonsson
The assertion is not correct, OK to change into: /* How can it be more than one partition with the current use? */ DBUG_ASSERT(m_part_spec.start_part >= m_part_spec.end_part); i.e. changing '==' to '>='. Note that this is a DBUG_ASSERT only, so it does not affect a production binary.
[14 Sep 2010 8:35]
Mattias Jonsson
the assert is already filed as bug#55146.
[21 Sep 2010 20:50]
Christopher Powers
The debug assertion is addressed in Bug#55146. With this fix applied, however, I am still able to recreate a crash in mysql-5.5-bugfixing on WinXP x64, similar to the one originally reported. The crash appears to be caused by a race condition in get_schema_views_record(), where multiple threads are iterating through the same view. I will continue to debug this issue, and will post status updates and debugging details as they become available.
[24 Sep 2010 18:50]
Christopher Powers
The crash on Windows XP 64 has a call stack similar to that of first one reported: get_schema_views_record(thd=0x000000000ce61ec0, tables=0x000000000f077000,...) get_all_tables(thd=0x000000000ce61ec0, tables=0x000000000cd779f0,...) get_schema_tables_result(join=0x000000000d3fe600, executed_place=PROCESSED_BY_JOIN_EXEC) JOIN::exec(...) mysql_select(...) [...] In this case, the query was a complex join, however, I think this is incidental: SELECT * FROM (mysql.time_zone_leap_second AS table1 STRAIGHT_JOIN INFORMATION_SCHEMA.VIEWS AS table2 ON (table2.TABLE_SCHEMA = table1.Correction)) WHERE table2.TABLE_CATALOG != table1.Transition_time ORDER BY table2.IS_UPDATABLE, table1.Transition_time, table2.CHECK_OPTION, table2.CHARACTER_SET_CLIENT, table2.TABLE_NAME; The RQG test case conducts a concurrent mix of similar queries across 10 threads, in addition to various CREATE TABLE/VIEW operations. The crash occurs because of a corrupted Field object associated with table 'r' encountered here: get_schema_views_record(...) [...] List<Item> *fields= &tables->view->select_lex.item_list; List_iterator<Item> it(*fields); Item *item; Item_field *field; /* check that at least one column in view is updatable */ while ((item= it++)) { if ((field= item->filed_for_view_update()) && field->field && !field->field->table->pos_in_table_list->schema_table) { updatable_view= 1; break; } } [...] Note the invalid content of 'field->field': Local Variables + item 0x000000000ea29eb0 {field=0x000000000ce30050 ...} - field 0x000000000ea29eb0 {field=0x000000000ce30050 ...} + Item_ident {orig_db_name="test" orig_table_name="d" orig_field_name="pk" ...} - field 0x000000000ce30050 {ptr=0xdddddddddddddddd null_ptr=0xdddddddddddddddd...} + __vfptr 0xdddddddddddddddd + ptr 0xdddddddddddddddd + null_ptr 0xdddddddddddddddd + table 0xdddddddddddddddd + orig_table 0xdddddddddddddddd + table_name 0xdddddddddddddddd + field_name 0xdddddddddddddddd + comment {str=0xdddddddddddddddd length=0xdddddddddddddddd} + key_start {map=0xdddddddddddddddd} + part_of_key {map=0xdddddddddddddddd} + part_of_key_not_clustered {map=0xdddddddddddddddd} + part_of_sortkey {map=0xdddddddddddddddd} unireg_check 0xdddddddd field_length 0xdddddddd flags 0xdddddddd field_index 0xdddd null_bit 0xdd 'Ý' is_created_from_null_item true + result_field 0x000000000ce30050 {ptr=0xdddddddddddddddd ... + item_equal 0x0000000000000000 {fields={...} const_item=??? eval_item=??? ...} no_const_subst false have_privileges 0x00000000 any_privileges false + fields 0x000000000f077ec8 List<Item> * [...] + db_name 0x000000000e6bc4c8 {str= "test"...} + table_name 0x000000000e6bc498 {str= "r" ...} + definer 0x000000000e6bb800 "ÌÌÌÌÌÌÌÌÌÌÌÌÌ...ÌÌ" The larger issue is that table 'r' is in the table list and being processed, but table 'r' is invalid, reason unknown. The tabe list is generated by get_all_tables(), which simply performs a file system enumeration of the database directory, then dutifully iterates through the resulting list of table names, assuming that each name maps to a valid table. A clue to the source of the corrupt table 'r' is found in the RQG output: "Query: CREATE TABLE r LIKE `aa` failed: 1356 View 'test.r' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them." Note that after a restart and recovery, two other tables were found to be similarly corrupt. Next actions to stake: 1. Recreate error, examine mysql query log in detail. 2. Recreate crash on Windows XP 32. Questions: Has the InnoDB/MySQL metadata gone out of sync?
[30 Sep 2010 23:51]
Christopher Powers
Confirmed failure in mysql-5.5-bugfixing on Windows XP 32.
[2 Oct 2010 0:57]
Christopher Powers
Also fails on: - mysql-5.5-runtime - mysql-next-mr-runtime Next: Examine query log (debug trace if necessary) to determine which concurrent operation left the table in an invalid (or transitional) state. Possibly related: Bug#49938, "Failing assertion: inode or deadlock in fsp/fsp0fsp.c", where InnoDB is doing a TRUNCATE concurrently with an I_S query.
[2 Oct 2010 1:09]
Christopher Powers
Possibly related: Bug#50561, "ALTER PARTITIONS does not have adequate lock, breaks with concurrent I_S query" which includes a revealing comment from Mikael R.: "Don't accept a complex change in ALTER TABLE partitioning code to support a 'cheating' code in INFORMATION_SCHEMA, change I_S to behave as a proper citizen instead. AFAIU this is also the behaviour which MDL will apply in 5.5." Hmmm. In addition, Bug#56541, "Assertion table->db_stat failed in sql_base.cc::open_table() during I_S queries" claims to be a regression caused by the fix for Bug#50561. So...this bug could be a regression, but clearly the I_S is behaving badly.
[7 Oct 2010 10:18]
Davi Arnaut
Reduced test case: create table t1 (a int, b int); create view v1 as select t1.a, t1.b from t1; alter table t1 change b c int; # set table_cache_size=0 so table is not cached. select * from information_schema.views;
[14 Oct 2010 16:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120787 3163 Konstantin Osipov 2010-10-14 A fix and a test case for Bug#56540 "Exception (crash) in sql_show.cc during rqg_info_schema test on Windows". Ensure we do not access freed memory when filling information_schema.views when one of the views could not be properly opened. @ mysql-test/r/information_schema.result Update results - a fix for Bug#56540. @ mysql-test/t/information_schema.test Add a test case for Bug#56540 @ sql/sql_base.cc Produce an error when we return an error. This directs get_all_tables() to the execution branch which doesn't involve 'process_table()' when we no table/view was opened. @ sql/sql_show.cc Do not try to access underlying table fields when opening of a view failed. The underlying table is closed in that case, and accessing its fields may lead to dereferencing a damaged pointer.
[14 Oct 2010 16:56]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120791 3163 Konstantin Osipov 2010-10-14 A fix and a test case for Bug#56540 "Exception (crash) in sql_show.cc during rqg_info_schema test on Windows". Ensure we do not access freed memory when filling information_schema.views when one of the views could not be properly opened. @ mysql-test/r/information_schema.result Update results - a fix for Bug#56540. @ mysql-test/t/information_schema.test Add a test case for Bug#56540 @ sql/sql_base.cc Push an error into the Diagnostics area when we return an error. This directs get_all_tables() to the execution branch which doesn't involve 'process_table()' when no table/view was opened. @ sql/sql_show.cc Do not try to access underlying table fields when opening of a view failed. The underlying table is closed in that case, and accessing its fields may lead to dereferencing a damaged pointer.
[14 Oct 2010 16:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/120792 3166 Konstantin Osipov 2010-10-14 A fix and a test case for Bug#56540 "Exception (crash) in sql_show.cc during rqg_info_schema test on Windows". Ensure we do not access freed memory when filling information_schema.views when one of the views could not be properly opened. @ mysql-test/r/information_schema.result Update results - a fix for Bug#56540. @ mysql-test/t/information_schema.test Add a test case for Bug#56540 @ sql/sql_base.cc Push an error into the Diagnostics area when we return an error. This directs get_all_tables() to the execution branch which doesn't involve 'process_table()' when no table/view was opened. @ sql/sql_show.cc Do not try to access underlying table fields when opening of a view failed. The underlying table is closed in that case, and accessing its fields may lead to dereferencing a damaged pointer.
[18 Oct 2010 12:16]
Konstantin Osipov
Queued into 5.5.7.
[13 Nov 2010 16:14]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:40]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)
[20 Nov 2010 17:49]
Paul DuBois
Noted in 5.5.8 changelog. The server could crash as a result of accessing freed memory when populating INFORMATION_SCHEMA.VIEWS if a view could not be opened properly.
[16 Dec 2010 22:25]
Bugs System
Pushed into mysql-5.5 5.5.9 (revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (version source revid:jonathan.perkin@oracle.com-20101216101358-fyzr1epq95a3yett) (merge vers: 5.5.9) (pib:24)