Bug #26503 Illegal SQL exception handler code causes the server to crash
Submitted: 20 Feb 2007 17:06
Modified: 31 Mar 2007 23:03
Reporter: Peter Stuge
Status: Closed
Category: MySQL Server: Parser
Severity: S1 (Critical)
Version: 5.0.24-20070220bk, 5.1.15, 5.1.16BK, 5.2.3
OS: Linux (Linux 2.6.19.1+glibc 2.4, Windows)
Assigned to: Marc ALFF
CPU Architecture: Any
Tags: bfsm_2007_03_01, crashwork

[20 Feb 2007 17:06] Peter Stuge
Description:
I've created a 40-something table InnoDB schema with various triggers and stored procedures, some of which use transactions.

When stress testing the application I got deadlocks for transactions in stored procedures. I wrapped the transactions in a simple repeat until true construct and added two handlers for deadlock conditions so that the transaction would be retried until successful.

In the transaction I do an insert into a table that has a trigger, in which a select is performed on the same table being inserted into.

After having added the loop and deadlock handlers mysqld started to crash where it had previously failed with locking errors.

I have verified the crash with locally built 5.0.24, 5.0.32 and 5.0.33, as well as with mysql-standard-5.0.27-linux-i686 (static), mysql-5.1.15-beta-linux-i686-glibc23 and mysql-5.2.3-falcon-alpha-linux-i686-glibc23 from mysql.com, and bk://mysql.bkbits.net/mysql-5.0 as of 20070220.

I've tried to diagnose the problem as best I could, and it looks like an Item in Query_arena::free_list is getting corrupted. I added lots of couts to one build and could confirm that the Item::next for the particular instance was healthy in Item::Item() but corrupt in Query_arena::free_items().

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1330541664 (LWP 15484)]
0x00000031 in ?? ()
(gdb) bt
#0  0x00000031 in ?? ()
#1  0x08163e7a in Query_arena::free_items (this=0xb409b964) at item.h:823
#2  0x08163ef6 in THD::cleanup_after_query (this=0xb409b958)
    at sql_class.cc:577
#3  0x08194c64 in mysql_parse (thd=0xb409b958, 
    inBuf=0x8c104a8 "call gc_purchase(@tid,53,1)", length=27)
    at sql_parse.cc:5936
#4  0x0819574b in dispatch_command (command=COM_QUERY, thd=0xb409b958, 
    packet=0xb40ca0f1 "", packet_length=28) at sql_parse.cc:1786
#5  0x081969a5 in do_command (thd=0xb409b958) at sql_parse.cc:1568
#6  0x08197600 in handle_one_connection (arg=0xb409b958) at sql_parse.cc:1194
#7  0xb7f5e294 in start_thread () from /lib/libpthread.so.0
#8  0xb7da1e7e in clone () from /lib/libc.so.6
(gdb) f 1
#1  0x08163e7a in Query_arena::free_items (this=0xb409b964) at item.h:823
823         cleanup();
(gdb) p *this
$1 = {_vptr.Query_arena = 0x844f6dc, free_list = 0x8c12120, 
  mem_root = 0xb409b978, is_backup_arena = false, 
  state = CONVENTIONAL_EXECUTION}
(gdb) p *this->free_list
$2 = {_vptr.Item = 0x920cde8, rsize = 9, str_value = {
    Ptr = 0x2 <Address 0x2 out of bounds>, str_length = 4, Alloced_length = 0, 
    alloced = false, str_charset = 0x8525d20}, name = 0x84c733f "NULL", 
  orig_name = 0x0, next = 0x8c12060, max_length = 0, name_length = 0, 
  marker = 0 '\0', decimals = 0 '\0', maybe_null = 1 '\001', 
  null_value = 1 '\001', unsigned_flag = 0 '\0', with_sum_func = 0 '\0', 
  fixed = 1 '\001', is_autogenerated_name = 1 '\001', collation = {
    collation = 0x8525d20, derivation = DERIVATION_IGNORABLE}, 
  with_subselect = 0 '\0', cmp_context = 4294967295}
(gdb) 

I found another bug that featured memory corruption because a thd was reused when it wasn't appropriate, but I can't find the id now. That bug was fixed, so this is not a straight duplicate.

If I remove the out tid parameter I can't reproduce the crash.
If I remove the 1205 handler I get a 1213 error instead of a crash. (Strange!)
If I remove the '40001' handler I get a 1213 error instead of a crash.

How to repeat:
The attached DDL is a heavily stripped down test case that will require more connections to trigger the problem. Create the database and run call proc1(@tid) in a couple of hundred connections in parallel. On a 2GHz CPU and using 400 clients I get the crash about 1 out of 10 times.

Suggested fix:
Make sure to not corrupt memory and let Query_arena::free_items() finish cleaning up.
[20 Feb 2007 17:28] Peter Stuge
attached DDL

Attachment: bug26503.mysql (application/octet-stream, text), 1.53 KiB.

[20 Feb 2007 17:36] MySQL Verification Team
Verified using the attached testcase.
It crashes about 50% of the time, with the same stack on Linux as on Windows:

mysqld.exe!Query_arena::free_items()  + 0x2e bytes	C++
mysqld.exe!THD::cleanup_after_query()  + 0x2f bytes	C++
mysqld.exe!mysql_parse()  + 0x18e bytes	C++
mysqld.exe!dispatch_command()  + 0x5e5 bytes	C++
mysqld.exe!do_command()  + 0xdc bytes	C++
mysqld.exe!handle_one_connection()  + 0x326 bytes	C++
mysqld.exe!pthread_start()  + 0x56 bytes	C
mysqld.exe!_callthreadstart()  Line 295	C
mysqld.exe!_threadstart(void * ptd=0x0000000077d6b660)  Line 275 + 0x5 bytes
[20 Feb 2007 17:37] MySQL Verification Team
see header of file for host, user, port, gcc build instructions, etc.

Attachment: bug26503_testcase.c (text/plain), 7.56 KiB.

[20 Feb 2007 17:49] MySQL Verification Team
testcase for bug #26089 doesn't crash on my server, but this testcase still does.
[20 Feb 2007 19:25] MySQL Verification Team
5.1.16BK crashed also.  It was easier to crash the release builds than the debug builds - presumably due to the speed difference.
[28 Feb 2007 18:20] Marc ALFF
ANALYSIS (I)

Reproduced with a single thread, by forcing the error in the code
to investigate:

With mysys/thr_lock.cc modified as below:

enum enum_thr_lock_result
thr_lock(THR_LOCK_DATA *data, THR_LOCK_OWNER *owner,
         enum thr_lock_type lock_type)
{
  static int debug = 0;

  debug++;

  if ((debug % 100) == 0)
    return THR_LOCK_DEADLOCK;

  return thr_lock_impl(data, owner, lock_type);
}

'random' (every 100 calls) failures will occur.

Calling the procedure in a loop using mysqltest scripts

let $loop=150;
while ($loop)
{
  set @tid=0;
  call proc1(@tid);
  select @tid;
  dec $loop;
}

exposes the bug.
[28 Feb 2007 18:22] Marc ALFF
ANALYSIS (II)

Reproduced with the following test case script:

--disable_warnings
drop database if exists bug26503;
--enable_warnings

set SQL_MODE="ANSI";

create database bug26503;
use bug26503;

create table t1(a int);

delimiter //;

create procedure proc1(out tid int)
begin
  declare var int;
  set tid= null;
retry:
  repeat
    begin
      select 'Push';

      begin
        declare continue handler for 1329
        begin
          select 'Handler';
          insert into t1 values (1);
          iterate retry;
        end;

        select 'Before';
        ## Raises a warning 1329 the first time
        select a into var from t1;
        select 'After';

      end;
      select 'Pop';
    end;
  until true end repeat retry;
end//
delimiter ;//

show procedure code proc1;

set @tid=0;
call proc1(@tid);
select @tid;

drop database if exists bug26503;

The runtime execution is:

call proc1(@tid);
Push
Push
Before
Before
Handler
Handler
Push
Push
Before
Before
After
After
Pop
Pop
select @tid;
@tid
0

The crash is caused by the second push of the exception handler,
which overflows the exception handler stack.

See
sp_rcontext::push_handler()
sp_rcontext::init()

The SQL code :

declare continue handler for xxx
iterate retry <-- leaving the exception handler here

causes the runtime to never pop the first handler, leading to the crash.
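
The push-without-pop behaviour described above can be illustrated with a small C++ model. This is only a sketch under stated assumptions: HandlerStack, run_loop and the fixed capacity are hypothetical names invented for illustration, not the actual sp_rcontext code, and the model reports an overflow where the server, per the analysis above, crashed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a runtime handler stack. Assumption: the capacity
// is fixed up front from the number of handlers expected to be active.
class HandlerStack {
public:
  explicit HandlerStack(std::size_t capacity) : capacity_(capacity) {}

  // Returns false when the stack is already full. This model only reports
  // the overflow; it does not write past the end.
  bool push(int handler_id) {
    if (frames_.size() >= capacity_) return false;
    frames_.push_back(handler_id);
    return true;
  }

  void pop() {
    if (!frames_.empty()) frames_.pop_back();
  }

  std::size_t depth() const { return frames_.size(); }

private:
  std::size_t capacity_;
  std::vector<int> frames_;
};

// Runs `iterations` passes of a loop body that pushes one handler on block
// entry. When jump_out_of_handler is true, each pass leaves through a jump
// from inside the handler body, so the pop at block exit is skipped.
// Returns the number of pushes that did not fit.
inline std::size_t run_loop(HandlerStack& stack, int iterations,
                            bool jump_out_of_handler) {
  std::size_t overflows = 0;
  for (int i = 0; i < iterations; ++i) {
    if (!stack.push(1)) ++overflows;
    if (jump_out_of_handler) continue;  // models "iterate retry" in the handler
    stack.pop();                        // normal block exit
  }
  return overflows;
}

// Usage: with room for one handler, two passes that jump out overflow once,
// while two passes that exit the block normally never overflow.
inline void handler_stack_example() {
  HandlerStack leaking(1);
  assert(run_loop(leaking, 2, true) == 1);
  HandlerStack balanced(1);
  assert(run_loop(balanced, 2, false) == 0);
}
```

The second pass is the one that fails, which matches the "Push, Before, Handler, Push" sequence in the trace above.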
[28 Feb 2007 23:16] Marc ALFF
Dear Peter,

I would like to express our gratitude for the work you put together
by providing a bug report with a test case showing the problem.
It was instrumental in getting the issue reproduced, and narrowed down
to the root cause.

As for the crash itself, it's not related to queries causing deadlocks,
but is related to the way the exception handler loops.

In particular :

retry: <--------------------------------------
  repeat                                      |
    begin                                     |
      declare continue handler for xxx        |
      begin                                   |
        iterate retry; -----------------------
      end
    end
  until true end repeat retry;

causes the bug.

This construct happens to be illegal according to the SQL:2003 specification,
since the scope of the 'retry' label does not extend to the code located
within exception handlers.

While this is a confirmed bug in the parser, which should reject the code,
you probably need to rewrite the loop to use a valid (and working) syntax
instead. See below for an example.

create procedure proc1()
begin
  declare done boolean;

retry:
  repeat
    begin
      declare cond_wait_timeout condition for 1205;
      declare cond_deadlock condition for sqlstate '40001';
      declare continue handler for cond_wait_timeout, cond_deadlock
      begin
        -- Failed, trying again
        set done = false;
      end;

      set done = true;

      -- Attempt something that might fail
      ...
    end;
  until done end repeat retry;
end//

Regards,
Marc.
[1 Mar 2007 2:59] Peter Stuge
Marc,

Thanks for your excellent analysis and the work all of you put into MySQL. I'm glad to help improve it and I know how important decent bug reports are, especially for fun stuff like double frees with lots of threads.

Regarding label scope I only have access to the SQL:2003 wiscorp draft and it seems to me that a handler declaration would have to be considered a SQL schema statement to be excluded by repeat statement syntax rule 4, but I have found nothing to confirm that is the case. All mentions of handler declarations I can find are among the control statements. Could you point me to the right place in the documents?

If iterate-within-handlers is illegal, that should be documented in chapter 17 of the MySQL manual. Should I file a separate bug for documentation?

If the problem isn't really ITERATE but the label scope, then using LEAVE within a handler is not legal either, is that correct?

Thanks again!
[13 Mar 2007 0:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/21771

ChangeSet@1.2446, 2007-03-12 18:37:34-06:00, malff@weblab.(none) +9 -0
  Bug#26503 (Illegal SQL exception handler code causes the server to crash)
  
  Before this fix, the parser would accept illegal code in SQL exception
  handlers, which later caused the runtime to crash when executing the code,
  due to memory violations in the exception handler stack.
  
  The root cause of the problem is instructions within an exception handler
  that jump to code located outside of the handler. This is illegal according
  to the SQL 2003 standard, since labels located outside the handler are not
  supposed to be visible (they are "out of scope"), so any instruction that
  jumps to these labels, like ITERATE or LEAVE, should not parse.
  
  The section of the standard that is relevant for this is :
    SQL:2003 SQL/PSM (ISO/IEC 9075-4:2003)
    section 13.1 <compound statement>,
    syntax rule 4
  <quote>
    The scope of the <beginning label> is CS excluding every <SQL schema
    statement> contained in CS and excluding every
    <local handler declaration list> contained in CS. <beginning label> shall
    not be equivalent to any other <beginning label>s within that scope.
  </quote>
  
  With this fix, the C++ class sp_pcontext, which represents the "parsing
  context" tree (a.k.a symbol table) of a stored procedure, has been changed
  as follows:
  - constructors have been cleaned up, so that only building a root node for
  the tree is public; building nodes inside a tree is not public.
  - a new member, m_exception_handler, indicates if a given syntactic context
  belongs to a DECLARE HANDLER block,
  - label resolution, in the method find_label(), has been changed to
  implement the restriction of scope regarding labels used in a compound
  statement.
  
  The actions in the parser, when parsing the body of a SQL exception handler,
  have been changed as follows:
  - the implementation of an exception handler (DECLARE HANDLER) now creates
  explicitly a new sp_pcontext, to isolate the code inside the handler from
  the containing compound statement context.
  - registering exception handlers as a result occurs in the parent context,
  see the rule sp_hcond_element
  - the code in sp_hcond_list has been cleaned up, to avoid code duplication
  
  In addition, the flags IN_SIMPLE_CASE and IN_HANDLER, declared in sp_head.h,
  have been removed, since they are unused and broken by design: as seen with
  Bug 19194 (Right recursion in parser for CASE causes excessive stack usage,
  limitation), representing a stack in a single flag is not possible.
  
  Tests in sp-error have been added to show that illegal constructs are now
  rejected.
  
  Tests in sp have been added for code coverage, to show that ITERATE or LEAVE
  statements are legal when jumping to a label in scope, inside the body of
  an exception handler.
[13 Mar 2007 1:08] Marc ALFF
Hi Peter

Please see the comments in the patch, for the reference to the SQL:2003 spec.

Iterate-within-handlers is legal ... as long as the label pointed to is in scope, which means that it must jump to code within the handler itself.

Thanks for noticing LEAVE, the test cases cover both ITERATE and LEAVE then.

-- Marc
[13 Mar 2007 20:59] Konstantin Osipov
Sent a review over email with a few comments.
Approved (no second review needed).
[14 Mar 2007 18:17] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/21930

ChangeSet@1.2446, 2007-03-14 12:02:32-06:00, malff@weblab.(none) +9 -0
  Bug#26503 (Illegal SQL exception handler code causes the server to crash)
  
  Before this fix, the parser would accept illegal code in SQL exception
  handlers, which later caused the runtime to crash when executing the code,
  due to memory violations in the exception handler stack.
  
  The root cause of the problem is instructions within an exception handler
  that jump to code located outside of the handler. This is illegal according
  to the SQL 2003 standard, since labels located outside the handler are not
  supposed to be visible (they are "out of scope"), so any instruction that
  jumps to these labels, like ITERATE or LEAVE, should not parse.
  
  The section of the standard that is relevant for this is :
    SQL:2003 SQL/PSM (ISO/IEC 9075-4:2003)
    section 13.1 <compound statement>,
    syntax rule 4
  <quote>
    The scope of the <beginning label> is CS excluding every <SQL schema
    statement> contained in CS and excluding every
    <local handler declaration list> contained in CS. <beginning label> shall
    not be equivalent to any other <beginning label>s within that scope.
  </quote>
  
  With this fix, the C++ class sp_pcontext, which represents the "parsing
  context" tree (a.k.a symbol table) of a stored procedure, has been changed
  as follows:
  - constructors have been cleaned up, so that only building a root node for
  the tree is public; building nodes inside a tree is not public.
  - a new member, m_label_scope, indicates if a given syntactic context
  belongs to a DECLARE HANDLER block,
  - label resolution, in the method find_label(), has been changed to
  implement the restriction of scope regarding labels used in a compound
  statement.
  
  The actions in the parser, when parsing the body of a SQL exception handler,
  have been changed as follows:
  - the implementation of an exception handler (DECLARE HANDLER) now creates
  explicitly a new sp_pcontext, to isolate the code inside the handler from
  the containing compound statement context.
  - registering exception handlers as a result occurs in the parent context,
  see the rule sp_hcond_element
  - the code in sp_hcond_list has been cleaned up, to avoid code duplication
  
  In addition, the flags IN_SIMPLE_CASE and IN_HANDLER, declared in sp_head.h,
  have been removed, since they are unused and broken by design: as seen with
  Bug 19194 (Right recursion in parser for CASE causes excessive stack usage,
  limitation), representing a stack in a single flag is not possible.
  
  Tests in sp-error have been added to show that illegal constructs are now
  rejected.
  
  Tests in sp have been added for code coverage, to show that ITERATE or LEAVE
  statements are legal when jumping to a label in scope, inside the body of
  an exception handler.
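
The label scoping rule described in this commit message can be sketched in C++ as a context tree whose label lookup stops at a handler boundary. The names below (ParseContext, Scope, push_child) are hypothetical and chosen for illustration; this is not the actual sp_pcontext interface, only a minimal model of the rule quoted from the standard.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a parsing-context node.
class ParseContext {
public:
  enum class Scope { Block, Handler };

  // Only the root can be built directly; inner nodes come from push_child().
  static std::unique_ptr<ParseContext> make_root() {
    return std::unique_ptr<ParseContext>(
        new ParseContext(nullptr, Scope::Block));
  }

  ParseContext* push_child(Scope scope) {
    children_.emplace_back(new ParseContext(this, scope));
    return children_.back().get();
  }

  void add_label(const std::string& name) { labels_.push_back(name); }

  // Searches this context, then the enclosing ones, but never walks out of
  // a handler body: labels of the enclosing compound statement are out of
  // scope inside the handler.
  bool find_label(const std::string& name) const {
    for (const std::string& label : labels_) {
      if (label == name) return true;
    }
    if (parent_ == nullptr || scope_ == Scope::Handler) return false;
    return parent_->find_label(name);
  }

private:
  ParseContext(ParseContext* parent, Scope scope)
      : parent_(parent), scope_(scope) {}

  ParseContext* parent_;
  Scope scope_;
  std::vector<std::string> labels_;
  std::vector<std::unique_ptr<ParseContext>> children_;
};

// Usage: "retry" labels the outer block, so it is visible in a nested block
// but not inside a handler body declared within that block.
inline void label_scope_example() {
  std::unique_ptr<ParseContext> root = ParseContext::make_root();
  root->add_label("retry");
  ParseContext* block = root->push_child(ParseContext::Scope::Block);
  ParseContext* handler = block->push_child(ParseContext::Scope::Handler);
  assert(block->find_label("retry"));
  assert(!handler->find_label("retry"));
}
```

In this model, an ITERATE or LEAVE that names a label declared inside the handler body still resolves, while one that names a label of the enclosing block fails to resolve at parse time.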
[22 Mar 2007 21:23] Konstantin Osipov
Fixed in 5.0.40 and 5.1.17
[31 Mar 2007 23:03] Paul DuBois
Noted in 5.0.40, 5.1.17 changelogs.

The parser accepted illegal code in SQL exception handlers, leading
to a crash at runtime when executing the code.
[19 Jul 2007 17:47] Paul DuBois
Revised changelog entry:

*Important note*
The parser accepted invalid code in SQL condition handlers,
leading to server crashes or unexpected execution behavior in
stored programs.  Specifically, the parser allowed a condition
handler to refer to labels for blocks that enclose the handler
declaration. This was incorrect because block label scope
does not include the code for handlers declared within the
labeled block.

The parser now rejects this invalid construct, but if you
upgrade in place (without dumping and reloading your databases),
existing handlers that contain the construct still are invalid
*even if they appear to function as you expect* and should
be rewritten.

To find affected handlers, use mysqldump to dump all stored
functions and procedures, triggers, and events. Then attempt
to reload them into an upgraded server. Handlers that contain
illegal label references will be rejected.

For more information about condition handlers and writing them
to avoid invalid jumps, see 
http://dev.mysql.com/doc/mysql/en/declare-handlers.html.
[21 Dec 2007 17:17] Marc ALFF
See related Bug #33465 (Temporarily disable fix for bug#26503).