Bug #56080 Assertion `inited==INDEX' in sql/handler.h:1534 on SELECT MAX(..)
Submitted: 18 Aug 2010 1:49 Modified: 3 May 2011 1:01
Reporter: Elena Stepanova Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.99-m5-debug OS:Any
Assigned to: Roy Lyseng CPU Architecture:Any
Tags: regression

[18 Aug 2010 1:49] Elena Stepanova
Description:
It might turn out to have the same root cause as bug#55986, but since the backtrace looks quite different, I am submitting it separately.

#0  0x00002ac822b87ea3 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000a8ea43 in my_write_core (sig=6)
    at mysql-5.6.99-m5/mysys/stacktrace.c:326
#2  0x0000000000538ef1 in handle_segfault (sig=6)
    at mysql-5.6.99-m5/sql/mysqld.cc:2502
#3  <signal handler called>
#4  0x00002ac82356ebb5 in raise () from /lib64/libc.so.6
#5  0x00002ac82356ffb0 in abort () from /lib64/libc.so.6
#6  0x00002ac823568286 in __assert_fail () from /lib64/libc.so.6
#7  0x00000000005636c6 in handler::ha_index_end (this=0x1a7b560)
    at mysql-5.6.99-m5/sql/handler.h:1534
#8  0x00000000007ee8b3 in opt_sum_query (tables=0x1abd900, all_fields=@0x1ac5a68, conds=0x0)
    at mysql-5.6.99-m5/sql/opt_sum.cc:389
#9  0x000000000060c53b in JOIN::optimize (this=0x1abfaf0)
    at mysql-5.6.99-m5/sql/sql_select.cc:1753
#10 0x0000000000610c33 in mysql_select (thd=0x1a71a10, rref_pointer_array=0x1a73bc0, tables=0x1abd900, wild_num=0, fields=@0x1a73ae0,
    conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x1abdec8,
    unit=0x1a733c0, select_lex=0x1a739d8)
    at mysql-5.6.99-m5/sql/sql_select.cc:3324
#11 0x0000000000616949 in handle_select (thd=0x1a71a10, lex=0x1a73310, result=0x1abdec8, setup_tables_done_option=0)
    at mysql-5.6.99-m5/sql/sql_select.cc:323
#12 0x00000000005b8996 in execute_sqlcom_select (thd=0x1a71a10, all_tables=0x1abd900)
    at mysql-5.6.99-m5/sql/sql_parse.cc:4716
#13 0x00000000005b9bab in mysql_execute_command (thd=0x1a71a10)
    at mysql-5.6.99-m5/sql/sql_parse.cc:2299
#14 0x00000000005c0b26 in mysql_parse (thd=0x1a71a10, rawbuf=0x1abd5d0 "SELECT MAX(i) FROM t FOR UPDATE", length=31,
    parser_state=0x4594e9b0) at mysql-5.6.99-m5/sql/sql_parse.cc:5745
#15 0x00000000005c2158 in dispatch_command (command=COM_QUERY, thd=0x1a71a10, packet=0x1aaf431 "SELECT MAX(i) FROM t FOR UPDATE",
    packet_length=31) at mysql-5.6.99-m5/sql/sql_parse.cc:1129
#16 0x00000000005c35c2 in do_command (thd=0x1a71a10)
    at mysql-5.6.99-m5/sql/sql_parse.cc:801
#17 0x0000000000694e20 in do_handle_one_connection (thd_arg=0x1a71a10)
    at mysql-5.6.99-m5/sql/sql_connect.cc:1191
#18 0x0000000000694ee9 in handle_one_connection (arg=0x1a71a10)
    at mysql-5.6.99-m5/sql/sql_connect.cc:1130
#19 0x0000000000935fd1 in pfs_spawn_thread (arg=0x1a1f300)
    at mysql-5.6.99-m5/storage/perfschema/pfs.cc:1061
#20 0x00002ac822b83143 in start_thread () from /lib64/libpthread.so.0
#21 0x00002ac8235ff8cd in clone () from /lib64/libc.so.6
#22 0x0000000000000000 in ?? ()

Note: In the provided test case, 'FOR UPDATE' clause is added for easier synchronization, it is not necessary for the test scenario as such.

How to repeat:
#----------------------------------
# Add a sync point to sql/item_subselect.cc 
# and compile with sync points enabled

diff -r a/sql/item_subselect.cc b/sql/item_subselect.cc
40a41
> #include "debug_sync.h"
2096a2098
>   DEBUG_SYNC(thd,"before_index_end_in_subselect");

#----------------------------------
# Test case

--source include/have_innodb.inc

--disable_warnings
DROP TABLE IF EXISTS t;
--enable_warnings

CREATE TABLE t ( i INT NOT NULL PRIMARY KEY, f INT ) 
  ENGINE = InnoDB;
INSERT INTO t VALUES (1,1),(2,2);

BEGIN;
UPDATE t SET f=100 WHERE i=2;

--connect (con1,localhost,root,,)

SET DEBUG_SYNC='before_index_end_in_subselect WAIT_FOR callit';
--send SELECT f FROM t WHERE i IN ( SELECT i FROM t )

--connection default
let $show_statement= SHOW PROCESSLIST;
let $field= State;
let $condition= = 'debug sync point: before_index_end_in_subselect';
 --source include/wait_show_condition.inc

--connect (con2,localhost,root,,)
--send SELECT MAX(i) FROM t FOR UPDATE

--connect (con3,localhost,root,,)
--send SELECT MAX(i) FROM t FOR UPDATE

--connection default

let $wait_condition=
  SELECT COUNT(*) = 2 FROM information_schema.processlist 
  WHERE state = 'Optimizing' and info = 'SELECT MAX(i) FROM t FOR UPDATE';
--source include/wait_condition.inc

SET DEBUG_SYNC='now SIGNAL callit';
COMMIT;

--connection con1
--reap
SET DEBUG_SYNC='RESET';

--connection con2
--reap

--connection con3
--reap

DROP TABLE t;

--exit
[18 Aug 2010 1:52] Elena Stepanova
Sync point with context:

diff -C 3 -r a/sql/item_subselect.cc b/sql/item_subselect.cc
*** a/sql/item_subselect.cc     2010-08-18 03:29:48.000000000 +0200
--- b/sql/item_subselect.cc     2010-08-18 03:29:51.000000000 +0200
***************
*** 38,43 ****
--- 38,44 ----
  #include "set_var.h"
  #include "sql_select.h"
  #include "sql_parse.h"                          // check_stack_overrun
+ #include "debug_sync.h"

  inline Item * and_items(Item* cond, Item *item)
  {
***************
*** 2094,2099 ****
--- 2095,2101 ----
  {
    DBUG_ENTER("subselect_uniquesubquery_engine::cleanup");
    /* Tell handler we don't need the index anymore */
+   DEBUG_SYNC(thd,"before_index_end_in_subselect");
    if (tab->table->file->inited)
      tab->table->file->ha_index_end();
    DBUG_VOID_RETURN;
[4 Oct 2010 8:15] Roy Lyseng
Bug#55986 is probably a duplicate of this one. See that report for likely origin.
[19 Oct 2010 8:30] Roy Lyseng
There seems to be a race condition, where a thread may attempt to end a table or index scan for a released TABLE object:

When finishing up query execution, do_select() calls JOIN::join_free(). This again calls JOIN::cleanup(), which calls ha_index_or_rnd_end() for the subquery table, which again calls ha_index_end() for the table.

Then we're calling JOIN::destroy(), which calls JOIN::cleanup() again.

This is repeated for all st_select_lex objects, which means that eventually, all JOIN objects are "fully" cleaned up.

Then we call close_thread_tables() from mysql_execute_command(), which releases all TABLE objects and makes them free for reuse by other threads.

Then THD::cleanup_after_query() calls free_items(), which eventually calls subselect_uniquesubquery_engine::cleanup(). This object still has a TABLE reference, and if that object has been reused by another thread, the function will attempt to clean up a table or index scan on behalf of another thread.
[9 Nov 2010 15:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/123297

3280 Roy Lyseng	2010-11-09
      Bug#56080: Assertion 'inited=INDEX in sql/handler.h
      
      The problem occurs because some TABLE objects are associated with
      the optimizer execution structures even after close_thread_tables()
      has been called. In turn, this means that final cleanup of some
      TABLE objects is done from within THD::cleanup_after_query().
      At this point in time, the TABLE objects may already have been
      given to another session, meaning that we may end another session's
      index or table scan.
      
      The safest solution seems to be to always perform a thorough cleanup
      of all execution data structures, including releasing the TABLE
      objects, before close_thread_tables() is called. In other words,
      this will have to be done at the end of each execution.
      
      However, this is currently not possible in the implementation of
      subquery materialization. This part of the optimizer assigns some
      objects, including a TABLE object, for the lifetime of the
      statement object, and not for each execution. This gives an odd
      situation, where some parts of the data structures are set up for
      reuse and others are not. Combine this with the fact that the
      user may decide to use another strategy for a subsequent
      subquery execution, the "permanent" data structures may not even
      be used later.
      
      It is therefore reasonable to allocate materialization execution
      structures for the lifetime of one execution. It means that the
      functions init_permanent() and init_runtime() of class
      subselect_hash_sj_engine are combined into one setup() function.
      In addition, the function Item_subselect::cleanup() will be called
      when the cleanup() function for the underlying query expression
      is called, and this function will cleanup and destroy the
      materialization data structures, so that the subquery item object
      is ready for a new optimization process in the next round of
      execution.
      
      However, the EXPLAIN functionality relies on these data
      structures being present when explaining a query containing
      a materialized subquery. Thus, we have two conflicting
      requirements for query cleanup:
       - For maximum efficiency we should release resources as early as
         possible after execution.
       - We cannot release resources that are needed for EXPLAIN queries.
      
      One solution to this problem is to split resource cleanup in two:
      1. Delete JOIN objects as soon as possible - this makes sense 
         because TABLE objects are associated with them through JOIN_TAB.
         But notice that JOIN objects may be needed for EXPLAIN queries...
      2. Cleanup query expression objects (ie st_select_lex and
         st_select_lex_unit) later (in practice at end of query execution).
      
      Because this bug was caused by a race condition, no simple test case
      has been made.
      
      mysql-test/r/union.result
        Change in EXPLAIN EXTENDED output that shows an optimized predicate
        in the two query specifications within a UNION in a subquery.
        AFAIK, this is similar to how a non-subquery is reported, so the
        change is an improvement.
      
      sql/ha_partition.cc
        Comment change.
      
      sql/item_subselect.cc
        In Item_subselect::cleanup(), "new" engine is always deleted after
        an execution.
        Item_in_subselect::setup_engine() now builds the materialization
        engine for one execution only. Specific arena is no longer needed.
        In subselect_union_engine::cleanup(), call to
        unit->reinit_exec_mechanism is deleted because it is redundant.
        Implementation of cleanup() functions for subclasses of
        subselect_engine has been refactored.
        subselect_hash_sj_engine::setup() replaces init_permanent() and
        init_runtime().
        For class subselect_hash_sj_engine, freeing of the temporary
        result table is moved from the destructor to ::cleanup().
        Some changes necessary because of interface changes.
      
      sql/item_subselect.h
        Some cleanup of the subselect_engine classes:
         - All virtual functions now marked virtual in all derived classes.
         - Functions that returned bool result are now declared accordingly.
         - Slight cleanup of private/protected/public performed.
         - Improved constructors.
      
      sql/sql_lex.h
        Change in friend declaration.
      
      sql/sql_select.cc
        Function JOIN::reinit() is renamed to JOIN::reset().
        JOIN::destroy() will now delete connection to associated tables.
        The st_select_lex_unit object is no longer cleaned up as early as
        before, but instead relying on the general cleanup of lex->unit
        in mysql_execute_command().
      
      sql/sql_select.h
        A few small interface changes.
      
      sql/sql_union.cc
        Cleaned up some confusing error handling in st_select_lex_unit::exec()
        and st_select_lex::cleanup().
[5 Jan 2011 7:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127945

3321 Roy Lyseng	2011-01-05
      Bug#56080: Assertion 'inited=INDEX in sql/handler.h
      
      The problem occurs because some TABLE objects are associated with
      the optimizer execution structures even after close_thread_tables()
      has been called. In turn, this means that final cleanup of some
      TABLE objects is done from within THD::cleanup_after_query().
      At this point in time, the TABLE objects may already have been
      given to another session, meaning that we may end another session's
      index or table scan.
      
      The safest solution seems to be to always perform a thorough cleanup
      of all execution data structures, including releasing the TABLE
      objects, before close_thread_tables() is called. In other words,
      this will have to be done at the end of each execution.
      
      However, this is currently not possible in the implementation of
      subquery materialization. This part of the optimizer assigns some
      objects, including a TABLE object, for the lifetime of the
      statement object, and not for each execution. This gives an odd
      situation, where some parts of the data structures are set up for
      reuse and others are not. Combine this with the fact that the
      user may decide to use another strategy for a subsequent
      subquery execution, the "permanent" data structures may not even
      be used later.
      
      It is therefore reasonable to allocate materialization execution
      structures for the lifetime of one execution. It means that the
      functions init_permanent() and init_runtime() of class
      subselect_hash_sj_engine are combined into one setup() function.
      In addition, the function Item_subselect::cleanup() will be called
      when the cleanup() function for the underlying query expression
      is called, and this function will cleanup and destroy the
      materialization data structures, so that the subquery item object
      is ready for a new optimization process in the next round of
      execution.
      
      However, the EXPLAIN functionality relies on these data
      structures being present when explaining a query containing
      a materialized subquery. Thus, we have two conflicting
      requirements for query cleanup:
       - For maximum efficiency we should release resources as early as
         possible after execution.
       - We cannot release resources that are needed for EXPLAIN queries.
      
      One solution to this problem is to split resource cleanup in two:
      1. Delete JOIN objects as soon as possible - this makes sense 
         because TABLE objects are associated with them through JOIN_TAB.
         But notice that JOIN objects may be needed for EXPLAIN queries...
      2. Cleanup query expression objects (ie st_select_lex and
         st_select_lex_unit) later (in practice at end of query execution).
      
      mysql-test/r/optimizer_debug_sync.result
        Results for new test case.
      
      mysql-test/t/optimizer_debug_sync.test
        Test case for bug. Requires the debug sync facility, and often fails
        without the bug fix.
      
      mysql-test/r/union.result
        Change in EXPLAIN EXTENDED output that shows an optimized predicate
        in the two query specifications within a UNION in a subquery.
        AFAIK, this is similar to how a non-subquery is reported, so the
        change is an improvement.
      
      sql/ha_partition.cc
        Comment change.
      
      sql/item_subselect.cc
        In Item_subselect::cleanup(), "new" engine is always deleted after
        an execution.
        Item_in_subselect::setup_engine() now builds the materialization
        engine for one execution only. Specific arena is no longer needed.
        In subselect_union_engine::cleanup(), call to
        unit->reinit_exec_mechanism is deleted because it is redundant.
        Implementation of cleanup() functions for subclasses of
        subselect_engine has been refactored.
        subselect_hash_sj_engine::setup() replaces init_permanent() and
        init_runtime().
        For class subselect_hash_sj_engine, freeing of the temporary
        result table is moved from the destructor to ::cleanup().
        Some changes necessary because of interface changes.
      
      sql/item_subselect.h
        Some cleanup of the subselect_engine classes:
         - All virtual functions now marked virtual in all derived classes.
         - Functions that returned bool result are now declared accordingly.
         - Slight cleanup of private/protected/public performed.
         - Improved constructors.
      
      sql/sql_lex.h
        Change in friend declaration.
      
      sql/sql_select.cc
        Function JOIN::reinit() is renamed to JOIN::reset().
        JOIN::destroy() will now delete connection to associated tables.
        The st_select_lex_unit object is no longer cleaned up as early as
        before, but instead relying on the general cleanup of lex->unit
        in mysql_execute_command().
      
      sql/sql_select.h
        A few small interface changes.
      
      sql/sql_union.cc
        Cleaned up some confusing error handling in st_select_lex_unit::exec()
        and st_select_lex::cleanup().
[3 May 2011 1:01] Paul DuBois
Noted in 5.6.3 changelog.

Table objects associated with one session's optimizer structures
could be closed after being passed to another session, prematurely
ending the second sesion's table or index scan. 

CHANGESET - http://lists.mysql.com/commits/132215