Bug #46018 group_concat(distinct ...) uses max_heap_table_size for memory allocations
Submitted: 7 Jul 2009 19:16 Modified: 7 Mar 2010 2:09
Reporter: Harrison Fisk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.0,5.1 OS:Any
Assigned to: Alexey Botchkov CPU Architecture:Any
Tags: distinct, group_concat, max_heap_table_size

[7 Jul 2009 19:16] Harrison Fisk
Description:
When you have a query which uses group_concat(distinct ...), then max_heap_table_size gets used for memory allocation.  It is passed into Uniques as the limit to the memory and then max_heap_table_size / 16 gets allocated immediately.

So if you have a large max_heap_table_size then a lot of memory can be used by this function.  For example, a max_heap_table_size of 8G results in 512MB of memory being used and 16G for the setting will result in 1G of memory being used.

The call path is Item_func_group_concat::setup creates a new Unique item using thd->variables.max_heap_table_size as the last parameter which is max_in_memory_size_arg for the constructor.  

This then calls init_tree with max_in_memory_size_arg / 16 for the initial allocation size.

How to repeat:
SET SESSION max_heap_table_size = 16 * 1024 * 1024 * 1024;
CREATE TABLE gc (a int, b char(100));
INSERT INTO gc VALUES (1, 'a'), (2, 'b');
SELECT a, group_concat(distinct b) FROM gc GROUP BY a;

Suggested fix:
Do not use max_heap_table_size for this.  Instead use tmp_table_size or create a new variable to manage it.
[25 Sep 2009 8:49] Sergei Golubchik
I tend to agree with Harrison, max_heap_table_size according to the manual has completely unrelated purpose. tmp_table_size looks more appropriate
[1 Oct 2009 13:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85369

3146 Alexey Botchkov	2009-10-01
      Bug#46018      group_concat(distinct ...) uses max_heap_table_size for memory allocations
          Use min(max_heap_table_size, tmp_table_size) instead
      
      per-file comments:
        sql/item_sum.cc
      Bug#46018      group_concat(distinct ...) uses max_heap_table_size for memory allocations
          Item_sum_**::setup fixed, so they use ram_limitation() for Unique-s they embed
      
        sql/item_sum.h
      Bug#46018      group_concat(distinct ...) uses max_heap_table_size for memory allocations
         Item_sum::ram_limitation() declared
[28 Oct 2009 12:25] Alexander Barkov
The patch http://lists.mysql.com/commits/85369 looks Ok to push.
[18 Nov 2009 14:40] Alexey Botchkov
pushed into mysql-next-mr-bugfixing
[20 Nov 2009 12:58] Bugs System
Pushed into 6.0.14-alpha (revid:kostja@sun.com-20091120124947-yi6h2jbgw0kbciwm) (version source revid:holyfoot@mysql.com-20091117114617-75quk99yr1coxv8c) (merge vers: 6.0.14-alpha) (pib:13)
[26 Nov 2009 15:49] Paul DuBois
Noted in 6.0.14 changelog.

For queries that used GROUP_CONCAT(DISTINCT ...), the value of
max_heap_table_size was used for memory allocation, which could be
excessive. Now the minimum of max_heap_table_size and tmp_table_size
is used. 

Setting report to NDI pending push to 5.6.x.
[11 Dec 2009 6:04] Bugs System
Pushed into 5.6.0-beta (revid:alik@sun.com-20091211055628-ltr7fero363uev7r) (version source revid:alik@sun.com-20091211055453-717czhtezc74u8db) (merge vers: 5.6.0-beta) (pib:13)
[11 Dec 2009 19:49] Paul DuBois
Noted in 5.6.0 changelog.
[6 Mar 2010 10:59] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20091211201717-03qf8ckwiw0np80p) (merge vers: 5.6.0-beta) (pib:16)
[7 Mar 2010 2:09] Paul DuBois
Moved 5.6.0 changelog entry to 5.5.3.