Description:
A relatively simple query on small tables consistently causes memory/sql/THD::main_mem_root to grow to high values. I can consistently push it to ~4.5G. The issue happens when the optimizer picks a range check for the query plan. When multiple queries like that execute, the MySQL server can get terminated by the OOM-killer. If a range check is picked due to plan instability, the issue can happen unexpectedly due to queries that normally run fine.
This behavior is present since 8.0.16 (likely described in release notes for Bug #28805105, Bug #28857990), and the plan for the "bad" query looks as follows:
mysql> EXPLAIN SELECT COUNT(*) FROM tt0 WHERE EXISTS (SELECT 1 FROM tt3 WHERE tt3.a<tt0.a OR tt3.b<tt0.a OR tt3.c<tt0.a OR tt3.d<tt0.a);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
| 1 | SIMPLE | tt0 | NULL | index | PRIMARY | PRIMARY | 514 | NULL | 999125 | 100.00 | Using index |
| 1 | SIMPLE | tt3 | NULL | ALL | a,b,c,d | NULL | NULL | NULL | 980460 | 80.24 | Range checked for each record (index map: 0xF); FirstMatch(tt0) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------------------------------------------------------+
2 rows in set, 5 warnings (0.00 sec)
The query completes in around 1.5 minutes on my test data, and before 8.0.16 it takes hours (I didn't wait to see how many hours precisely). Clearly, the feature works, but the vast memory usage is concerning.
valgrind massif shows the following picture:
99.96% (3,435,932,625B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->91.99% (3,162,130,292B) 0x1F2B15F: my_malloc(unsigned int, unsigned long, int) (in /opt/mysql/8.0.27/bin/mysqld)
| ->91.49% (3,144,749,640B) 0x1F26F55: MEM_ROOT::AllocBlock(unsigned long, unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | ->91.44% (3,143,105,528B) 0x1F26FDF: MEM_ROOT::ForceNewBlock(unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | ->91.44% (3,143,105,528B) 0x1F27064: MEM_ROOT::AllocSlow(unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | ->86.58% (2,976,064,448B) 0xE9FEA4: sql_memdup(void const*, unsigned long) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCB9871: QUICK_RANGE::QUICK_RANGE(unsigned char const*, unsigned int, unsigned long, unsigned char const*, unsigned int, unsigned long, unsigned int, ha_rkey_function) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCBE0EA: get_quick_keys(MEM_ROOT*, KEY const*, KEY_PART*, SEL_ARG*, unsigned char*, unsigned char*, unsigned int, unsigned char*, unsigned char*, unsigned int, unsigned int*, unsigned int, unsigned int*, Mem_root_array<QUICK_RANGE*>*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCBE7B2: get_ranges_from_tree(MEM_ROOT*, TABLE*, KEY_PART*, unsigned int, SEL_ROOT*, unsigned int, unsigned int*, Mem_root_array<QUICK_RANGE*>*) [clone .part.0] [clone .isra.0] (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCBFC89: get_key_scans_params(THD*, RANGE_OPT_PARAM*, SEL_TREE*, bool, bool, enum_order, bool, Cost_estimate const*, Bitmap<64u>*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xCBB3B6: test_quick_select(THD*, MEM_ROOT*, MEM_ROOT*, Bitmap<64u>, unsigned long, unsigned long, unsigned long long, bool, enum_order, TABLE*, bool, Item*, Bitmap<64u>*, QUICK_SELECT_I**, bool, Query_block*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xD4EC1D: DynamicRangeIterator::Init() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0x1262D34: NestedLoopIterator::Read() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0x1263F39: AggregateIterator::Read() (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xE5BF5D: Query_expression::ExecuteIteratorQuery(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xE5C29E: Query_expression::execute(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xDE79FD: Sql_cmd_dml::execute(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xD91DD7: mysql_execute_command(THD*, bool) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xD9554D: dispatch_sql_command(THD*, Parser_state*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xD967E3: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xD98893: do_command(THD*) (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0xEC6CAF: handle_connection (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0x23D440F: pfs_spawn_thread (in /opt/mysql/8.0.27/bin/mysqld)
| | | | | ->86.58% (2,976,064,448B) 0x4F77EA4: start_thread (pthread_create.c:307)
| | | | | ->86.58% (2,976,064,448B) 0x6A259FC: clone (in /usr/lib64/libc-2.17.so)
How to repeat:
1. Deploy MySQL 8.0.27. I used dbdeployer, but can be also reproduced using docker.
2. Create tables:
CREATE TABLE tt0 (a varchar(128) PRIMARY KEY);
CREATE TABLE tt3 (a varchar(128),
b varchar(128),
c varchar(128),
d varchar(128),
key a (a),
key b (b),
key c (c),
key d (d)
);
3. Populate tables with ~1 million rows each. I used mysql_random_data_load:
mysql_random_data_load test tt0 1000000
mysql_random_data_load test tt3 1000000
Memory usage and execution times will vary due to the random nature of the data. I can provide a ~46M compressed dump that I used to reproduce for Valgrind and check the issue on different versions. That dataset causes the query to finish in ~1 minute and 15 seconds, making for a nice quick test.
4. Check table sizes
mysql> SELECT table_name, format_bytes(data_length+index_length) FROM information_schema.tables WHERE table_schema = 'test';
+------------+----------------------------------------+
| TABLE_NAME | format_bytes(data_length+index_length) |
+------------+----------------------------------------+
| tt0 | 30.56 MiB |
| tt3 | 494.11 MiB |
+------------+----------------------------------------+
2 rows in set (0.00 sec)
5. Run the query and observe memory usage of the mysqld process as well as memory instruments in performance schema.
Query: SELECT COUNT(*) FROM tt0 WHERE EXISTS (SELECT 1 FROM tt3 WHERE tt3.a<tt0.a OR tt3.b<tt0.a OR tt3.c<tt0.a OR tt3.d<tt0.a);
Memory footprint before query terminates:
memory/sql/THD::main_mem_root 42 4.39 GiB 107.01 MiB 6567 4.39 GiB 700.88 KiB
memory/sql/THD::main_mem_root 42 4.39 GiB 107.01 MiB 6568 4.39 GiB 700.78 KiB
memory/sql/THD::main_mem_root 37 592.03 MiB 16.00 MiB 6569 4.39 GiB 700.67 KiB
memory/sql/THD::main_mem_root 11 219.09 KiB 19.92 KiB 6570 4.39 GiB 700.56 KiB
And from the OS side:
10:14:17 AM 955600057 12149 12881.00 0.00 5762188 4672124 2.36 mysqld
10:14:18 AM 955600057 12149 12812.00 0.00 5762188 4722752 2.39 mysqld
10:14:19 AM 955600057 12149 12565.00 0.00 5762188 4771988 2.41 mysqld
10:14:20 AM 955600057 12149 12760.00 0.00 5762188 4822060 2.44 mysqld
10:14:21 AM 955600057 12149 9466.00 0.00 5762188 2561612 1.29 mysqld
10:14:22 AM 955600057 12149 146.00 0.00 5762188 468312 0.24 mysqld
10:14:23 AM 955600057 12149 66.00 0.00 5762188 468312 0.24 mysqld
10:14:24 AM 955600057 12149 67.00 0.00 5762188 468312 0.24 mysqld
Suggested fix:
Do not use so much memory or provide a way to put an upper limit on the memory consumed by the operation.