Bug #105331 Range check causing high memory consumption in memory/sql/THD::main_mem_root
Submitted: 26 Oct 2021 16:36 Modified: 22 Dec 2021 18:38
Reporter: Sergey Kuzmichev Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:8.0.27 OS:Any
Assigned to: CPU Architecture:Any

[26 Oct 2021 16:36] Sergey Kuzmichev
Description:
A relatively simple query on small tables consistently causes memory/sql/THD::main_mem_root to grow to high values. I can consistently push it to ~4.5G. The issue happens when the optimizer picks a range check for the query plan. When multiple queries like that execute, the MySQL server can get terminated by the OOM-killer. If a range check is picked due to plan instability, the issue can happen unexpectedly due to queries that normally run fine.

This behavior is present since 8.0.16 (likely described in release notes for Bug #28805105, Bug #28857990), and the plan for the "bad" query looks as follows:

mysql> EXPLAIN SELECT COUNT(*) FROM tt0 WHERE EXISTS (SELECT 1 FROM tt3 WHERE tt3.a<tt0.a OR tt3.b<tt0.a OR tt3.c<tt0.a OR tt3.d<tt0.a);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra                                                           |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
|  1 | SIMPLE      | tt0   | NULL       | index | PRIMARY       | PRIMARY | 514     | NULL | 999125 |   100.00 | Using index                                                     |
|  1 | SIMPLE      | tt3   | NULL       | ALL   | a,b,c,d       | NULL    | NULL    | NULL | 980460 |    80.24 | Range checked for each record (index map: 0xF); FirstMatch(tt0) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
2 rows in set, 5 warnings (0.00 sec)

The query completes in around 1.5 minutes on my test data, and before 8.0.16 it takes hours (I didn't wait to see how many hours precisely). Clearly, the feature works, but the vast memory usage is concerning.

valgrind massif shows the following picture:

99.96% (3,435,932,625B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->91.99% (3,162,130,292B) 0x1F2B15F: my_malloc(unsigned int, unsigned long, int) (in /opt/mysql/8.0.27/bin/mysqld)
| ->91.49% (3,144,749,640B) 0x1F26F55: MEM_ROOT::AllocBlock(unsigned long, unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | ->91.44% (3,143,105,528B) 0x1F26FDF: MEM_ROOT::ForceNewBlock(unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | ->91.44% (3,143,105,528B) 0x1F27064: MEM_ROOT::AllocSlow(unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | ->86.58% (2,976,064,448B) 0xE9FEA4: sql_memdup(void const*, unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCB9871: QUICK_RANGE::QUICK_RANGE(unsigned char const*, unsigned int, unsigned long, unsigned char const*, unsigned int, unsigned long, unsigned int, ha_rkey_function) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |   ->86.58% (2,976,064,448B) 0xCBE0EA: get_quick_keys(MEM_ROOT*, KEY const*, KEY_PART*, SEL_ARG*, unsigned char*, unsigned char*, unsigned int, unsigned char*, unsigned char*, unsigned int, unsigned int*, unsigned int, unsigned int*, Mem_root_array<QUICK_RANGE*>*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |     ->86.58% (2,976,064,448B) 0xCBE7B2: get_ranges_from_tree(MEM_ROOT*, TABLE*, KEY_PART*, unsigned int, SEL_ROOT*, unsigned int, unsigned int*, Mem_root_array<QUICK_RANGE*>*) [clone .part.0] [clone .isra.0] (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |       ->86.58% (2,976,064,448B) 0xCBFC89: get_key_scans_params(THD*, RANGE_OPT_PARAM*, SEL_TREE*, bool, bool, enum_order, bool, Cost_estimate const*, Bitmap<64u>*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |         ->86.58% (2,976,064,448B) 0xCBB3B6: test_quick_select(THD*, MEM_ROOT*, MEM_ROOT*, Bitmap<64u>, unsigned long, unsigned long, unsigned long long, bool, enum_order, TABLE*, bool, Item*, Bitmap<64u>*, QUICK_SELECT_I**, bool, Query_block*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |           ->86.58% (2,976,064,448B) 0xD4EC1D: DynamicRangeIterator::Init() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |             ->86.58% (2,976,064,448B) 0x1262D34: NestedLoopIterator::Read() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |               ->86.58% (2,976,064,448B) 0x1263F39: AggregateIterator::Read() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                 ->86.58% (2,976,064,448B) 0xE5BF5D: Query_expression::ExecuteIteratorQuery(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                   ->86.58% (2,976,064,448B) 0xE5C29E: Query_expression::execute(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                     ->86.58% (2,976,064,448B) 0xDE79FD: Sql_cmd_dml::execute(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                       ->86.58% (2,976,064,448B) 0xD91DD7: mysql_execute_command(THD*, bool) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                         ->86.58% (2,976,064,448B) 0xD9554D: dispatch_sql_command(THD*, Parser_state*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                           ->86.58% (2,976,064,448B) 0xD967E3: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                             ->86.58% (2,976,064,448B) 0xD98893: do_command(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                               ->86.58% (2,976,064,448B) 0xEC6CAF: handle_connection (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                                 ->86.58% (2,976,064,448B) 0x23D440F: pfs_spawn_thread (in /opt/mysql/8.0.27/bin/mysqld)
| | | | |                                   ->86.58% (2,976,064,448B) 0x4F77EA4: start_thread (pthread_create.c:307)
| | | | |                                     ->86.58% (2,976,064,448B) 0x6A259FC: clone (in /usr/lib64/libc-2.17.so)

How to repeat:
1. Deploy MySQL 8.0.27. I used dbdeployer, but can be also reproduced using docker.
2. Create tables:

CREATE TABLE tt0 (a varchar(128) PRIMARY KEY);
CREATE TABLE tt3 (a varchar(128),
                b varchar(128),
                c varchar(128),
                d varchar(128),
                key a (a),
                key b (b),
                key c (c),
                key d (d)
            );

3. Populate tables with ~1 million rows each. I used mysql_random_data_load:

mysql_random_data_load test tt0 1000000
mysql_random_data_load test tt3 1000000

Memory usage and execution times will vary due to the random nature of the data. I can provide a ~46M compressed dump that I used to reproduce for Valgrind and check the issue on different versions. That dataset causes the query to finish in ~1 minute and 15 seconds, making for a nice quick test.

4. Check table sizes

mysql> SELECT table_name, format_bytes(data_length+index_length) FROM information_schema.tables WHERE table_schema = 'test';
+------------+----------------------------------------+
| TABLE_NAME | format_bytes(data_length+index_length) |
+------------+----------------------------------------+
| tt0        | 30.56 MiB                              |
| tt3        | 494.11 MiB                             |
+------------+----------------------------------------+
2 rows in set (0.00 sec)

5. Run the query and observe memory usage of the mysqld process as well as memory instruments in performance schema.

Query: SELECT COUNT(*) FROM tt0 WHERE EXISTS (SELECT 1 FROM tt3 WHERE tt3.a<tt0.a OR tt3.b<tt0.a OR tt3.c<tt0.a OR tt3.d<tt0.a);

Memory footprint before query terminates:

memory/sql/THD::main_mem_root   42      4.39 GiB        107.01 MiB      6567    4.39 GiB        700.88 KiB
memory/sql/THD::main_mem_root   42      4.39 GiB        107.01 MiB      6568    4.39 GiB        700.78 KiB
memory/sql/THD::main_mem_root   37      592.03 MiB      16.00 MiB       6569    4.39 GiB        700.67 KiB
memory/sql/THD::main_mem_root   11      219.09 KiB      19.92 KiB       6570    4.39 GiB        700.56 KiB

And from the OS side:

10:14:17 AM 955600057     12149  12881.00      0.00 5762188 4672124   2.36  mysqld
10:14:18 AM 955600057     12149  12812.00      0.00 5762188 4722752   2.39  mysqld
10:14:19 AM 955600057     12149  12565.00      0.00 5762188 4771988   2.41  mysqld
10:14:20 AM 955600057     12149  12760.00      0.00 5762188 4822060   2.44  mysqld
10:14:21 AM 955600057     12149   9466.00      0.00 5762188 2561612   1.29  mysqld
10:14:22 AM 955600057     12149    146.00      0.00 5762188  468312   0.24  mysqld
10:14:23 AM 955600057     12149     66.00      0.00 5762188  468312   0.24  mysqld
10:14:24 AM 955600057     12149     67.00      0.00 5762188  468312   0.24  mysqld

Suggested fix:
Do not use so much memory or provide a way to put an upper limit on the memory consumed by the operation.
[26 Oct 2021 21:05] Justin Swanhart
As a workaround, the following query might perform as well (or better) and use less memory:
select count(*) from tt0 
WHERE 1 IN
(SELECT count(*) cnt FROM tt3 WHERE tt3.a<tt0.a 
 UNION ALL
 SELECT count(*) cnt FROM tt3 WHERE tt3.b<tt0.a 
UNION ALL
 SELECT count(*) cnt FROM tt3 WHERE tt3.c<tt0.a 
UNION ALL
 SELECT count(*) cnt FROM tt3 WHERE tt3.d<tt0.a);
[26 Oct 2021 21:17] Justin Swanhart
this should be even faster:
select count(*) from tt0 
WHERE EXISTS
(SELECT 1 FROM tt3 WHERE tt3.a<tt0.a limit 1 
 UNION ALL
 SELECT 1 FROM tt3 WHERE tt3.b<tt0.a limit 1
UNION ALL
 SELECT 1 FROM tt3 WHERE tt3.c<tt0.a limit 1 
UNION ALL
 SELECT 1 FROM tt3 WHERE tt3.d<tt0.a limit 1);

Hope that helps.
[28 Oct 2021 10:03] MySQL Verification Team
Hello Sergey,

Thank you for the bug report and test case.
Verified as described.

regards,
Umesh
[29 Oct 2021 7:45] Steinar Gunderson
Posted by developer:
 
So this is the culprit:

        min_key = static_cast<uchar *>(sql_memdup(min_key_arg, min_length_arg + 1));
        max_key = static_cast<uchar *>(sql_memdup(max_key_arg, max_length_arg + 1));

It should be allocated on the return_mem_root (or perhaps even the temp_mem_root), not *THR_MALLOC.
[22 Dec 2021 18:38] Jon Stephens
Documented fix as follows in the MySQL 8.0.29 changelog:

    The QUICK_RANGE constructor used the main thread memory root for
    allocating keys, rather than the same memory root as used for
    allocating the QUICK_RANGE object itself. This resulted in
    unexpectedly high memory usage when executing multiple queries
    using a range check for the query plan.

Closed.