Bug #98782 Using TempTable engine for GROUP BY is slower than using MEMORY engine
Submitted: 28 Feb 2020 18:57 Modified: 20 Jul 2020 8:30
Reporter: Øystein Grøvlen Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.19 OS:Any
Assigned to: CPU Architecture:Any

[28 Feb 2020 18:57] Øystein Grøvlen
Description:
Running a query that is using a temporary table for GROUP BY is slower with new TempTable engine than with MEMORY engine.   I am using the following query with the MySQL world database:

mysql> explain select avg(population) from city use index () group by countrycode;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra           |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
|  1 | SIMPLE      | city  | NULL       | ALL  | CountryCode   | NULL | NULL    | NULL | 4188 |   100.00 | Using temporary |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

Here are the results from running the query using the different engines:

mysql>  SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM performance_schema.events_statements_history order by timer_start;    
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| truncate performance_schema.events_statements_history              |    2.2906 |
| set internal_tmp_mem_storage_engine=default                        |    0.2240 |
| select avg(population) from city use index () group by countrycode |    3.4209 |
| select avg(population) from city use index () group by countrycode |    3.1320 |
| select avg(population) from city use index () group by countrycode |    2.9873 |
| select avg(population) from city use index () group by countrycode |    2.9773 |
| select avg(population) from city use index () group by countrycode |    3.0106 |
| select avg(population) from city use index () group by countrycode |    2.9718 |
| select avg(population) from city use index () group by countrycode |    2.9629 |
| select avg(population) from city use index () group by countrycode |    3.0143 |
| select avg(population) from city use index () group by countrycode |    3.0956 |
| select avg(population) from city use index () group by countrycode |    3.2512 |
| set internal_tmp_mem_storage_engine=MEMORY                         |    0.1037 |
| select avg(population) from city use index () group by countrycode |    4.0703 |
| select avg(population) from city use index () group by countrycode |    2.7441 |
| select avg(population) from city use index () group by countrycode |    2.6286 |
| select avg(population) from city use index () group by countrycode |    3.2299 |
| select avg(population) from city use index () group by countrycode |    2.5692 |
| select avg(population) from city use index () group by countrycode |    2.7132 |
| select avg(population) from city use index () group by countrycode |    3.1385 |
| select avg(population) from city use index () group by countrycode |    2.6247 |
| select avg(population) from city use index () group by countrycode |    2.6066 |
| select avg(population) from city use index () group by countrycode |    2.5606 |
+--------------------------------------------------------------------+-----------+
23 rows in set (0.00 sec)

Here is the result from running a similar query using a DBT3 SF10 database:

mysql>  SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM performance_schema.events_statements_history order by timer_start;    
+------------------------------------------------------------------------------------+-----------+
| sql_text                                                                           | Time (ms) |
+------------------------------------------------------------------------------------+-----------+
| truncate performance_schema.events_statements_history                              |    4.1112 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1196.1212 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1151.6570 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1152.0946 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1207.5053 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1154.1293 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1153.7330 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1151.6730 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1150.0393 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1149.3197 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1150.4283 |
| set internal_tmp_mem_storage_engine=MEMORY                                         |    0.1350 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1031.9952 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1031.5870 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1029.8365 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1030.7925 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1031.2058 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1032.5193 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1032.6714 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1038.0801 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1027.5355 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1026.3479 |
| select c_nationkey, sum(c_acctbal) from customer use index () group by c_nationkey | 1029.2537 |
+------------------------------------------------------------------------------------+-----------+
23 rows in set (0.00 sec)

It seem using the MEMORY engine is about 10% faster.

(See Bug#98770 for the reason why the "use index" hint is used.)

How to repeat:
select avg(population) from city use index () group by countrycode;
set internal_tmp_mem_storage_engine=MEMORY;
select avg(population) from city use index () group by countrycode;
[3 Mar 2020 13:59] MySQL Verification Team
Hi Øystein,

For filing the second bug that you reported within #98770, as a side note.

This has been tested and confirmed by me.

Verified as reported.
[22 May 2020 11:42] Sergey Kuzmichev
Hi all,

I found this bug when looking at a 6x degradation of performance between using TempTable and MEMORY engines. Actually, the problems I see are related, it seems, to this bug, as well as https://bugs.mysql.com/bug.php?id=96071 and perhaps https://bugs.mysql.com/bug.php?id=99593 though it's hard to tell (and it's a rather far fetch, I understand).

The test case I use is two tables of ~2mln rows each (150mb) and a simple query. It's actually just a slightly "larger" test case from https://bugs.mysql.com/bug.php?id=98739 I added few more inserts to make resultset bigger. Tested with 8.0.20.

Query used for testing is: SELECT COUNT(*) FROM (select * from joinit union select * from joinit2) foo;

Running the query 5 times with TempTable on a smallish Centos7 VM.

SET GLOBAL internal_tmp_mem_storage_engine = TempTable;

1 row in set (37.95 sec)
1 row in set (37.26 sec)
1 row in set (36.92 sec)
1 row in set (36.73 sec)
1 row in set (37.71 sec)

Running the query 5 times with MEMORY.

SET GLOBAL internal_tmp_mem_storage_engine = MEMORY;
SET GLOBAL tmp_table_size = 1*1024*1024*1024;
SET GLOBAL max_heap_table_size = 1*1024*1024*1024;

1 row in set (6.59 sec)
1 row in set (6.75 sec)
1 row in set (7.65 sec)
1 row in set (6.84 sec)
1 row in set (6.63 sec)

Running locally on a laptop with Fedora 31, numbers change to ~14 sec for TempTable and ~8 sec for MEMORY.

All the settings are default, so TempTable has 1G memory limit and thus doesn't "spill" to disk, which would then be bug https://bugs.mysql.com/bug.php?id=98739

mysql> SELECT * FROM performance_schema.memory_summary_global_by_event_name where event_name like '%TempTable%'\G
*************************** 1. row ***************************
                  EVENT_NAME: memory/temptable/physical_disk
                 COUNT_ALLOC: 0
                  COUNT_FREE: 0
...
*************************** 2. row ***************************
                  EVENT_NAME: memory/temptable/physical_ram
                 COUNT_ALLOC: 6614606
                  COUNT_FREE: 6614605
   SUM_NUMBER_OF_BYTES_ALLOC: 3551171300032512
    SUM_NUMBER_OF_BYTES_FREE: 3551171298983936
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 1
             HIGH_COUNT_USED: 11
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 1048576
   HIGH_NUMBER_OF_BYTES_USED: 1073741824
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM (SELECT * FROM joinit UNION SELECT * FROM joinit2) foo;
+----------+
| COUNT(*) |
+----------+
|  2097152 |
+----------+
1 row in set (21.40 sec)

mysql> SELECT * FROM performance_schema.memory_summary_global_by_event_name where event_name like '%TempTable%'\G
*************************** 1. row ***************************
                  EVENT_NAME: memory/temptable/physical_disk
                 COUNT_ALLOC: 0
                  COUNT_FREE: 0
...
*************************** 2. row ***************************
                  EVENT_NAME: memory/temptable/physical_ram
                 COUNT_ALLOC: 8268257
                  COUNT_FREE: 8268256
   SUM_NUMBER_OF_BYTES_ALLOC: 4438964124516352
    SUM_NUMBER_OF_BYTES_FREE: 4438964123467776
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 1
             HIGH_COUNT_USED: 11
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 1048576
   HIGH_NUMBER_OF_BYTES_USED: 1073741824
2 rows in set (0.01 sec)

You can see, however, that number of allocations and number of bytes allocated is pretty high. I was surprised, so I did a pretty naive thing and straced the mysqld process and its children. There I found a close enough number of mmap/munmap calls originating from a thread that starts with my query.

[root@localhost ~]# grep munmap /tmp/pp/strace | grep "^30261"| wc -l
1670923

The calls also show that allocations happen for 536870912 bytes.

30261      0.000039 [00007f734d483bf9] mbind(0x7f72d2c00000, 536870912, MPOL_PREFERRED, NULL, 0, 0) = 0
30261      0.000035 [00007f734d483e27] munmap(0x7f72d2c00000, 536870912) = 0
30261      0.000038 [00007f734d483d9a] mmap(NULL, 536870912, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 0</dev/null>, 0) = 0x7f72d2c00000

perf flamegraphs show that most of the time is spent on munmaps originating from temptable::Table::insert. That's not the case for 8.0.17, as I later checked.

Looking at vsz with sleep 0.1 shows some similar-looking jumps
    PID      RSS    VSZ S TTY       TIME COMMAND
  28151 1005752 3297372 S ?     00:01:23 /usr/sbin/mysqld
  28151 1005752 2773084 S ?     00:01:24 /usr/sbin/mysqld
  28151 1005752 3297372 S ?     00:01:24 /usr/sbin/mysqld
  28151 1005752 3297372 S ?     00:01:24 /usr/sbin/mysqld
  28151 1005752 2773084 S ?     00:01:24 /usr/sbin/mysqld
  28151 1005752 3297372 S ?     00:01:24 /usr/sbin/mysqld

Doing a second pass on bugs reported, I also found https://bugs.mysql.com/bug.php?id=99593 that mentions an internal bug originating in 8.0.18 and fixed in 8.0.21. I re-tested my issue and 8.0.17 doesn't show this performance degradation.

I have a few questions.
1. Will this bug (98782) be fixed in 8.0.21?
2. Is what I describe a duplicate? It seems to be, but would be good to make sure. I can create a new bug should that be needed.
3. Will the issue described by me get fixed in 8.0.21 alongside 99593? I suspect yes, because it probably originated in 8.0.18 as well.

Sorry if I'm inadvertently hijacking the wrong issue, but the TempTable issues are somewhat spread around right now and I couldn't find any best fit.
[22 May 2020 12:19] MySQL Verification Team
Thank you, Mr. Kuzmichev.
[9 Jun 2020 7:34] Erlend Dahl
[25 May 2020 2:19] Jusufadis Bakamovic

The issue reported by Sergey Kuzmichev (previous comment) is not an issue originally reported by this bug-report. It is another issue which I have verified that it is the same as the one reported in

Bug#99593 Performance issues in 8.0.20

and fixed under the internal heading

Bug#30562964 8.0.18: PERFORMANCE REGRESSION IN SELECT DISTINCT

Fix for it already landed on mysql-8.0 and mysql-trunk  and will be part of MySQL 8.0.21 release.

I have verified that the reported issue is gone, e.g. performance goes back
to normal again:
 
mysql> SELECT COUNT(*) FROM (select * from joinit union select * from
joinit2) foo;
+----------+
| COUNT(*) |
+----------+
|    32768 |
+----------+
1 row in set (0.05 sec)
[9 Jun 2020 13:37] MySQL Verification Team
Thank you, Erlend.
[20 Jul 2020 8:30] Erlend Dahl
Posted by developer:

Jusufadis Bakamovic:

Hi,
 
Following are my findings and analysis wrt the original issue reported in
this bug. Tests were ran against quite recent trunk
(77a332aec20b1f9dc4f5947c15f099c05f4492d8).
 
1st test-run:
source <path_to>/mysql/world.sql
set internal_tmp_mem_storage_engine=TEMPTABLE;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| select avg(population) from city use index () group by countrycode | 2.3617 |
| select avg(population) from city use index () group by countrycode | 1.8994 |
| select avg(population) from city use index () group by countrycode | 1.8700 |
| select avg(population) from city use index () group by countrycode | 1.9157 |
| select avg(population) from city use index () group by countrycode | 1.9386 |
| select avg(population) from city use index () group by countrycode | 2.7847 |
| select avg(population) from city use index () group by countrycode | 1.9315 |
| select avg(population) from city use index () group by countrycode | 1.8920 |
| select avg(population) from city use index () group by countrycode | 1.8840 |
| select avg(population) from city use index () group by countrycode | 1.8899 |
+--------------------------------------------------------------------+-----------+
 
Then I repeated this _exact_ test-procedure for another 2 times ...
 
2nd test-run:
...
set internal_tmp_mem_storage_engine=TEMPTABLE;
select avg(population) from city use index () group by countrycode;
...
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| select avg(population) from city use index () group by countrycode | 2.3885 |
| select avg(population) from city use index () group by countrycode | 1.8968 |
| select avg(population) from city use index () group by countrycode | 1.8706 |
| select avg(population) from city use index () group by countrycode | 1.9204 |
| select avg(population) from city use index () group by countrycode | 1.9346 |
| select avg(population) from city use index () group by countrycode | 1.9451 |
| select avg(population) from city use index () group by countrycode | 1.9396 |
| select avg(population) from city use index () group by countrycode | 1.8882 |
| select avg(population) from city use index () group by countrycode | 1.8863 |
| select avg(population) from city use index () group by countrycode | 1.8869 |
+--------------------------------------------------------------------+-----------+
 
3rd test-run:
...
set internal_tmp_mem_storage_engine=TEMPTABLE;
select avg(population) from city use index () group by countrycode;
...
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-------- ---+
| select avg(population) from city use index () group by countrycode | 2.6640 |
| select avg(population) from city use index () group by countrycode | 1.9327 |
| select avg(population) from city use index () group by countrycode | 1.9126 |
| select avg(population) from city use index () group by countrycode | 1.8726 |
| select avg(population) from city use index () group by countrycode | 1.8891 |
| select avg(population) from city use index () group by countrycode | 1.8918 |
| select avg(population) from city use index () group by countrycode | 1.9032 |
| select avg(population) from city use index () group by countrycode | 1.9546 |
| select avg(population) from city use index () group by countrycode | 1.9345 |
| select avg(population) from city use index () group by countrycode | 1.9514 |
+--------------------------------------------------------------------+-----------+
[20 Jul 2020 8:30] Erlend Dahl
Similarly, the same procedure has been repeated but this time with MEMORY SE:
 
1st test-run:
...
set internal_tmp_mem_storage_engine=MEMORY;
select avg(population) from city use index () group by countrycode;
...
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| select avg(population) from city use index () group by countrycode | 2.6730 |
| select avg(population) from city use index () group by countrycode | 2.1057 |
| select avg(population) from city use index () group by countrycode | 2.0978 |
| select avg(population) from city use index () group by countrycode | 2.1225 |
| select avg(population) from city use index () group by countrycode | 2.1251 |
| select avg(population) from city use index () group by countrycode | 2.1425 |
| select avg(population) from city use index () group by countrycode | 2.1473 |
| select avg(population) from city use index () group by countrycode | 2.1661 |
| select avg(population) from city use index () group by countrycode | 2.1763 |
| select avg(population) from city use index () group by countrycode | 2.4880 |
+--------------------------------------------------------------------+-----------+
 
2nd test-run:
...
set internal_tmp_mem_storage_engine=MEMORY;
select avg(population) from city use index () group by countrycode;
...
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| select avg(population) from city use index () group by countrycode | 2.4928 |
| select avg(population) from city use index () group by countrycode | 2.0069 |
| select avg(population) from city use index () group by countrycode | 2.0532 |
| select avg(population) from city use index () group by countrycode | 2.0563 |
| select avg(population) from city use index () group by countrycode | 2.0589 |
| select avg(population) from city use index () group by countrycode | 1.9995 |
| select avg(population) from city use index () group by countrycode | 2.0237 |
| select avg(population) from city use index () group by countrycode | 2.0425 |
| select avg(population) from city use index () group by countrycode | 2.4942 |
| select avg(population) from city use index () group by countrycode | 2. ---+
 
3rd test-run:
...
set internal_tmp_mem_storage_engine=MEMORY;
select avg(population) from city use index () group by countrycode;
...
select avg(population) from city use index () group by countrycode;
SELECT sql_text, timer_wait/1000000000.0 "Time (ms)" FROM
performance_schema.events_statements_history order by timer_start;
+--------------------------------------------------------------------+-----------+
| sql_text                                                           | Time (ms) |
+--------------------------------------------------------------------+-----------+
| select avg(population) from city use index () group by countrycode | 2.5956 |
| select avg(population) from city use index () group by countrycode | 2.0658 |
| select avg(population) from city use index () group by countrycode | 2.0476 |
| select avg(population) from city use index () group by countrycode | 2.0616 |
| select avg(population) from city use index () group by countrycode | 2.2993 |
| select avg(population) from city use index () group by countrycode | 2.3288 |
| select avg(population) from city use index () group by countrycode | 2.1178 |
| select avg(population) from city use index () group by countrycode | 2.0865 |
| select avg(population) from city use index () group by countrycode | 2.0501 |
| select avg(population) from city use index () group by countrycode | 2.0530 |
+--------------------------------------------------------------------+-----------+
[20 Jul 2020 8:31] Erlend Dahl
Using some python and numpy one can easily calculate median and average of
the results ...
 
From TempTable results:
1st test-run:
>>> numpy.median([2.3617, 1.8994, 1.8700, 1.9157, 1.9386, 2.7847, 1.9315, 1.8920, 1.8840, 1.8899])
1.90755
>>> numpy.average([2.3617, 1.8994, 1.8700, 1.9157, 1.9386, 2.7847, 1.9315, 1.8920, 1.8840, 1.8899])
2.0367500000000005
 
2nd test-run:
>>> numpy.median([2.3885, 1.8968, 1.8706, 1.9204, 1.9346, 1.9451, 1.9396, 1.8882, 1.8863, 1.8869])
1.9086
>>> numpy.average([2.3885, 1.8968, 1.8706, 1.9204, 1.9346, 1.9451, 1.9396, 1.8882, 1.8863, 1.8869])
1.9556999999999998
 
3rd test-run:
>>> numpy.median([2.6640, 1.9327, 1.9126, 1.8726, 1.8891, 1.8918, 1.9032, 1.9546, 1.9345, 1.9514])
1.92265
>>> numpy.average([2.6640, 1.9327, 1.9126, 1.8726, 1.8891, 1.8918, 1.9032, 1.9546, 1.9345, 1.9514])
1.99065
 
Ditto from MEMORY results:
1st test-run:
>>> numpy.median([2.6730, 2.1057, 2.0978, 2.1225, 2.1251, 2.1425, 2.1473, 2.1661, 2.1763, 2.4880])
2.1449
>>> numpy.average([2.6730, 2.1057, 2.0978, 2.1225, 2.1251, 2.1425, 2.1473, 2.1661, 2.1763, 2.4880])
2.22443
 
2nd test-run:
>>> numpy.median([2.4928, 2.0069, 2.0532, 2.0563, 2.0589, 1.9995, 2.0237, 2.0425, 2.4942, 2.0665])
2.05475
>>> numpy.average([2.4928, 2.0069, 2.0532, 2.0563, 2.0589, 1.9995, 2.0237, 2.0425, 2.4942, 2.0665])
2.1294500000000003
 
3rd test-run:
>>> numpy.median([2.5956, 2.0658, 2.0476, 2.2993, 2.3288, 2.1178, 2. 2.0501, 2.0530])
2.0865
>>> numpy.average([2.5956, 2.0658, 2.0476, 2.2993, 2.3288, 2.1178, 2.0865, 2.0501, 2.0530])
2.1827222222222225
 
Summary of 3 test-runs for TempTable (median and average calculated for each
test-run, with each test-run having 10 samples, which is 30 samples in
total):
  median [1.90755, 1.9086, 1.92265]
  avg    [2.0367500000000005, 1.9556999999999998, 1.99065]
 
Summary of 3 test-runs for MEMORY (median and average calculated for each
test-run, with each test-run having 10 samples, which is 30 samples in
total):
  median [2.1449, 2.05475, 2.0865]
  avg    [2.22443, 2.1294500000000003, 2.1827222222222225]
 
Percent change (goes in favor of TempTable):
  median [+12.44%, +7.65%, +8.52%]
  avg    [+9.21%, +8.88%, +9.64%]
 
Conclusively, these figures seem to imply that TempTable in fact has
consistently better performance (both by average and median measures) than
MEMORY SE for given use-case which is why I will be free enough to reject
this as not a bug.