MySQL Bugs: #21587: FLUSH TABLES causes server crash when used with HANDLER statements

Bug #21587	FLUSH TABLES causes server crash when used with HANDLER statements
Submitted:	11 Aug 2006 15:24	Modified:	22 Oct 2007 21:57
Reporter:	Shane Bester (Platinum Quality Contributor)
Status:	Closed
Category:	MySQL Server	Severity:	S2 (Serious)
Version:	5.0.24	OS:	Any (*)
Assigned to:	Davi Arnaut	CPU Architecture:	Any
Tags:	crash, flush tables, handler

Description:
When one connection is using HANDLER OPEN, CLOSE, and related commands on some MyISAM tables, and another connection issues FLUSH TABLES, the server can crash.

The crash always shows the HANDLER ... OPEN statement in the error log.

On Linux I hit an assertion failure once:

mysqld: sql_base.cc:560: bool close_thread_table(THD*, TABLE**): Assertion `table->file->inited == handler::NONE' failed.

0x91d03164 __stop___libc_freeres_ptrs + -1989787260
0x83fa71b __assert_fail + 167
0x81148e6 _Z18close_thread_tableP3THDPP8st_table + 162
0x8116307 _Z19close_thread_tablesP3THDbb + 679
0x81164ae _Z23close_tables_for_reopenP3THDPP13st_table_list + 150
0x8119e91 _Z11open_tablesP3THDPP13st_table_listPjj + 641
0x804cbb0 _Z13mysql_ha_openP3THDP13st_table_listb + 512
0x80f7ce6 _Z21mysql_execute_commandP3THD + 32302
0x80f8144 _Z11mysql_parseP3THDPcj + 330
0x80f8d54 _Z16dispatch_command19enum_server_commandP3THDPcj + 2776
0x80fa005 _Z10do_commandP3THD + 537
0x80fad62 handle_one_connection + 3246
0x83c6856 start_thread + 98
0x841e3ae __clone + 110

On Windows, I got the same assertion once, and a different crash the second time:

mysqld_nt!broadcast_refresh+0x1a0 
mysqld_nt!mysql_lock_tables+0x29 
mysqld_nt!mysql_ha_read+0x1bd 
mysqld_nt!mysql_execute_command+0x3fbc
mysqld_nt!mysql_parse+0x102 
mysqld_nt!dispatch_command+0x562 
mysqld_nt!do_command+0xad 
mysqld_nt!handle_one_connection+0x26e 
mysqld_nt!pthread_start+0x3b 
mysqld_nt!_threadstart(void * ptd = 0x0067f39c)+0x6c 

How to repeat:
I am working on a simpler testcase. It is 100% repeatable on my dataset, which currently involves a lot of complex tables.

This doesn't require many threads to repeat. It takes only a single thread running FLUSH TABLES over and over, and another single thread running HANDLER commands.


Indeed, it is a crasher:

#0  0x00000000005f779d in get_lock_data (thd=0x183b0c8, table_ptr=0x1846948,
    count=1, flags=2, write_lock_used=0x438c75f8) at lock.cc:659
#1  0x00000000005f9b84 in mysql_lock_tables (thd=0x183b0c8, tables=0x1846948,
    count=1, flags=0, need_reopen=0x438c7a0f) at lock.cc:126
#2  0x00000000005526ec in mysql_ha_read (thd=0x183b0c8, tables=0x18468b0,
    mode=RNEXT, keyname=0x1846b60 "pbatcfg", key_expr=0xa5a5a5a5a5a5a5a5,
    ha_rkey_mode=HA_READ_KEY_EXACT, cond=0x0, select_limit_cnt=1,
    offset_limit_cnt=0) at sql_handler.cc:419
#3  0x00000000006226f5 in mysql_execute_command (thd=0x183b0c8)
    at sql_parse.cc:4096
#4  0x0000000000625ae3 in mysql_parse (thd=0x183b0c8,
    inBuf=0x1846768 "handler pbatcfg003 read pbatcfg next", length=36)
    at sql_parse.cc:5847
#5  0x00000000006266ae in dispatch_command (command=COM_QUERY, thd=0x183b0c8,
    packet=0x1832d69 "handler pbatcfg003 read pbatcfg next", packet_length=37)
    at sql_parse.cc:1766
#6  0x0000000000627ec2 in do_command (thd=0x183b0c8) at sql_parse.cc:1550
#7  0x00000000006282e4 in handle_one_connection (arg=0x183b0c8)
    at sql_parse.cc:1181
#8  0x00002aaaaaef112a in start_thread () from /lib/libpthread.so.0
#9  0x00002aaaab4993c3 in clone () from /lib/libc.so.6

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/15833

ChangeSet@1.2296, 2006-11-27 13:24:24+04:00, ramil@mysql.com +3 -0
  Fix for bug #21587: FLUSH TABLES causes server crash when used with HANDLER statements
  
  Problems (they appear only under some circumstances):
    1. mysql_ha_read() can get a reference to an already deleted table
       when it searches thd->handler_tables_hash.
  
    2. The DBUG_ASSERT(table->file->inited == handler::NONE) assertion
       fails in close_thread_table().
  
  Fix: end open index scans and table scans and remove references to the 
  tables from the handler tables hash. After this preparation it is safe 
  to close the tables. The close can no longer fail on open index/table 
  scans and the closed table will not be used again by handler functions.
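
The preparation step described in this changeset comment can be sketched with a toy model. This is only an illustration, not the committed patch: ToyTable, HandlerHash, flush_handler_table and find_handler are hypothetical names, and the hash is a plain std::unordered_map rather than the server's own hash type. The sketch ends any open scan, removes every hash entry that points at the table, and only then closes it, so the assertion from the report cannot fire and a later lookup finds nothing instead of a deleted table.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for a table that may have a scan in progress.
// All names here are illustrative, not the server's real types.
enum class ScanState { None, Index, Rnd };

struct ToyTable {
    ScanState inited = ScanState::None;
    bool open = true;
};

// Toy stand-in for thd->handler_tables_hash: handler alias -> table.
using HandlerHash = std::unordered_map<std::string, ToyTable*>;

// Mirrors the assertion quoted in the report: a table must not be
// closed while a scan is still active.
void close_table(ToyTable& t) {
    assert(t.inited == ScanState::None);
    t.open = false;
}

// Preparation, then close: end the scan, unlink the table from the
// hash, and only after that close it.
void flush_handler_table(HandlerHash& hash, ToyTable& t) {
    t.inited = ScanState::None;  // end the open index/table scan
    for (auto it = hash.begin(); it != hash.end();) {
        if (it->second == &t)
            it = hash.erase(it);  // drop the reference to this table
        else
            ++it;
    }
    close_table(t);
}

// Lookup as a later HANDLER ... READ might do it. A missing alias
// yields nullptr rather than a pointer to a closed table.
ToyTable* find_handler(const HandlerHash& hash, const std::string& alias) {
    auto it = hash.find(alias);
    return it == hash.end() ? nullptr : it->second;
}
```

In this model, a handler that had an index scan open when the flush arrived simply looks "not open" afterwards, and the caller can report an error instead of dereferencing freed memory.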

Noted in 5.0.32, 5.1.15 changelogs.

Using FLUSH TABLES in one connection while another connection is
using HANDLER statements caused a server crash.

The fix for this bug must be reverted, because it causes a more serious bug (bug #29474).  An alternate fix for this bug will be prepared.

The previous fix for this bug has been reverted, as of 5.0.48 and 5.1.22-beta.

Another fix is being prepared.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34934

ChangeSet@1.2526, 2007-10-04 17:34:41-03:00, davi@moksha.local +1 -0
  Bug#21587 FLUSH TABLES causes server crash when used with HANDLER statements
  
  This bug is a symptom of the way handler tables are managed. The
  main difference from the conventional behavior is that handler
  tables are long-lived: their lifetimes are not bounded by the
  duration of the command that opened them. To achieve this, the
  handler code uses its own list (handler_tables instead of
  open_tables) to hold open handler tables so that the tables won't be
  closed at the end of the command/statement. Besides the handler_tables
  list, there is a hash (handler_tables_hash) which is used to associate
  handler aliases with tables and to refresh the tables on demand (flush
  tables).
  
  The current implementation doesn't work properly with refreshed tables,
  more precisely when flush commands are issued by other connections.
  This happens because when a handler open or read statement is being
  processed, the associated table has to be opened or locked and, for this
  purpose, the open_tables and handler_tables lists are swapped so that the
  new table being opened is inserted into the handler_tables list. But when
  opening or locking the table, if the refresh version differs from the
  thread's refresh version, then all used tables in the open_tables list (now
  handler_tables) are refreshed. In this "refreshing" process the handler
  tables are flushed (closed) without being properly unlinked from the
  handler hash.
  
  The current implementation also fails to properly discard handlers of
  dropped tables, but this and other problems are going to be addressed
  in the fixes for bugs 31397 and 31409.
  
  The chosen approach tries to properly save and restore the table state
  so that no table is flushed during the table open and lock operations.
  The logic is almost the same as before with the list swapping, but with
  working glue code.
  
  The test case for this bug is going to be committed into 5.1 because it
  requires a test feature only available in 5.1 (wait_condition).
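
The list handling described in this changeset comment can be illustrated with a toy model. Everything below is an assumption made for illustration: ToyThd, handler_open_buggy, handler_open_saved and stale_entries are hypothetical names, tables are reduced to integer ids, and the "saved" variant is one possible reading of "save and restore the table state", not the committed code. The buggy flow swaps the lists and lets a refresh close the handler tables while they sit in open_tables, which leaves stale hash entries. The second flow sets the list aside first, so a refresh during the open has nothing to close.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy per-connection state; names are illustrative only.
struct ToyThd {
    std::vector<int> open_tables;     // closed at the end of a statement
    std::vector<int> handler_tables;  // long-lived HANDLER tables
    std::unordered_map<std::string, int> handler_hash;  // alias -> table id
};

// A refresh closes every table on the list currently used as open_tables.
void refresh_open_tables(ToyThd& thd) { thd.open_tables.clear(); }

// Buggy flow: the lists are swapped so the new table lands on
// handler_tables, but a version mismatch triggers a refresh while the
// handler tables are exposed as open_tables. The hash is not updated.
void handler_open_buggy(ToyThd& thd, const std::string& alias, int id,
                        bool version_changed) {
    std::swap(thd.open_tables, thd.handler_tables);
    if (version_changed) refresh_open_tables(thd);  // closes handler tables
    thd.open_tables.push_back(id);
    std::swap(thd.open_tables, thd.handler_tables);
    thd.handler_hash[alias] = id;
}

// Save/restore flow (assumed): set open_tables aside, open the new table
// against an empty list, link it to handler_tables, then restore.
void handler_open_saved(ToyThd& thd, const std::string& alias, int id,
                        bool version_changed) {
    std::vector<int> saved = std::move(thd.open_tables);
    thd.open_tables.clear();                        // no pre-opened tables
    if (version_changed) refresh_open_tables(thd);  // nothing to close
    thd.open_tables.push_back(id);
    thd.handler_tables.push_back(thd.open_tables.back());
    thd.open_tables = std::move(saved);             // restore saved state
    thd.handler_hash[alias] = id;
}

// Count hash entries whose table is no longer on the handler list.
int stale_entries(const ToyThd& thd) {
    int n = 0;
    for (const auto& kv : thd.handler_hash) {
        if (std::find(thd.handler_tables.begin(), thd.handler_tables.end(),
                      kv.second) == thd.handler_tables.end())
            ++n;
    }
    return n;
}
```

Opening a second handler with version_changed set to true leaves one stale alias in the buggy flow and none in the save/restore flow. In the real server a stale entry is a pointer to a closed table, which matches the crash in mysql_ha_read() shown in the backtraces above.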

Pushed into 5.1.23-beta

Pushed into 5.0.52

Noted in 5.0.52, 5.1.23 changelogs.

Executing FLUSH TABLES while tables were open for use with HANDLER
statements could cause a server crash.