MySQL Bugs: #76349: memory leak in add_derived

Bug #76349	memory leak in add_derived_key()
Submitted:	17 Mar 2015 11:35	Modified:	19 Jul 2015 21:30
Reporter:	Vlad Lesin	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	5.6	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
There is memory leak in add_derived_key() function if it is called from stored procedure.

The memory leak reason is the following:

1) The instruction's mem_root is set  to persistent memory root
(sp_head::main_mem_root) during SP compilation:

bool sp_head::add_instr(THD *thd, sp_instr *instr)
{
...
  /*
    Memory root of every instruction is designated for permanent
    transformations (optimizations) made on the parsed tree during
    the first execution. It points to the memory root of the
    entire stored procedure, as their life span is equal.
  */
  instr->mem_root= get_persistent_mem_root();
...
}

2) thd->stmt_arena is set to the query arena of SP instruction:

bool sp_head::execute(THD *thd, bool merge_da_on_success)
{
...
  do
  {
    sp_instr *i;
...
    /* get_instr returns NULL when we're done. */
    i = get_instr(ip);
...
    /*
      We have to set thd->stmt_arena before executing the instruction
      to store in the instruction free_list all new items, created
      during the first execution (for example expanding of '*' or the
      items made during other permanent subquery transformations).
    */
    thd->stmt_arena= i;
...
    err_status= i->execute(thd, &ip);
  } while (!err_status && !thd->killed && !thd->is_fatal_error);
...
}

3) add_derived_key() uses thd->stmt_arena->mem_root for allocating, namely it
uses memory root for persistent objects, this memory is not cleared after
instruction execution which leads to constant memory consumption.

... add_derived_key (...) {
...
 entry= new (thd->stmt_arena->mem_root) Derived_key();
...
}

It works well outside of SP because thd->stmt_arena points to thd and
thd->mem_root is initialized before each query in THD::init_for_queries() and
cleared after each query in dispatch_command().

But inside of SP there is special mechanism to clear memory after each
instruction. There are local variables execute_mem_root and execute_arena
in sp_head::execute(). thd->mem_root is set pointed to local execute_mem_root
before SP execution and the old pointer is restored after SP execution
(see sp_head::execute()). After each instruction thd->mem_root is cleared.

So we have thd->mem_root which is cleared after statement execution in both
SP and non-SP cases. The fix is in using thd->mem_root instead of
thd->stmt_arena->mem_root for memory alocation in add_derived_key().

How to repeat:
Test case:

create table t1 (a int unsigned auto_increment primary key,
                 b int(10) unsigned not null default 0);
create table t2 (a int unsigned auto_increment primary key,
                 b int(10) unsigned not null default 0);

insert into t1 (b) values (1), (2),(3),(4),(5),(6),(7),(8),(9),(10);
insert into t2 (b) values (1), (2),(3),(4),(5),(6),(7),(8),(9),(10);

delimiter |
create procedure mem_leak()
select a from (select * from t2) as d_t where d_t.a = 5;
|

delimiter ;
call mem_leak();

Suggested fix:
Use thd->mem_root instead of thd->stmt_arena->mem_root for memory allocating in add_derived_key(). 

--- a/sql/table.cc
+++ b/sql/table.cc
@@ -6175,12 +6175,12 @@ static bool add_derived_key(List<Derived_key> &derived_key_list, Field *field,
   {
     THD *thd= field->table->in_use;
     key++;
-    entry= new (thd->stmt_arena->mem_root) Derived_key();
+    entry= new (thd->mem_root) Derived_key();
     if (!entry)
       return TRUE;
     entry->referenced_by= ref_by_tbl;
     entry->used_fields.clear_all();
-    if (derived_key_list.push_back(entry, thd->stmt_arena->mem_root))
+    if (derived_key_list.push_back(entry, thd->mem_root))
       return TRUE;
     field->table->max_keys++;
   }

The second tables is not necessary for the test case:

    create table t1 (a int unsigned auto_increment primary key,
                     b int(10) unsigned not null default 0);
    
    insert into t1 (b) values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
    
    delimiter |
    create procedure mem_leak()
    select a from (select * from t1) as d_t where d_t.a = 5;
    |
    
    delimiter ;
    call mem_leak();

Thanks for the report! Testcas uses up endless memory:

--------
drop table if exists t1,t2
create table t1 (a int unsigned auto_increment primary key,b int(10) unsigned not null default 0);
create table t2 (a int unsigned auto_increment primary key,b int(10) unsigned not null default 0);
insert into t1 (b) values (1), (2),(3),(4),(5),(6),(7),(8),(9),(10);
insert into t2 (b) values (1), (2),(3),(4),(5),(6),(7),(8),(9),(10);

delimiter $
create procedure p1()
begin
  repeat
    select a from (select * from t2) as d_t where d_t.a = 5;
  until 1=2 end repeat;
end $

delimiter ;
call p1();
------

heap profile.

Attachment: bug76349_5.6.25_heap_profile.pdf (application/pdf, text), 14.22 KiB.

Affects only 5.6, did not leak on 5.1, 5.5, 5.7, 5.8.

Percona release a fix that works: https://bugs.launchpad.net/percona-server/+bug/1380985

Noted in 5.6.27, 5.7.9, 5.8.0 changelogs.

The optimizer sometimes generates an index for a derived table
(subquery in the FROM clause). If this occurred for a statement
executed within a stored program, a memory leak could occur.