Bug #102037 CPU overhead from inlists much larger in 8.0.22
Submitted: 20 Dec 2020 22:23 Modified: 29 Jun 2022 19:46
Reporter: Mark Callaghan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[20 Dec 2020 22:23] Mark Callaghan
Description:
This query is much slower starting in 8.0.22 and the regression is a function of the number of entries in the inlist. I found this while running sysbench. When the inlist has 1000 entries then MySQL 8.0.21 is ~4X faster than 8.0.22 when the database is cached.

The problem query looks like this, "id" is the PK column and c is not indexed.
  select c from sbtest where id in (...)

With a differential flamegraph I see the overhead is mostly in Item_cmpfunc_in::val_int and the new code there is from WL#9384

https://github.com/mysql/mysql-server/commit/67c3c70e4895874d43434f1df556f9f30d781b48
https://github.com/mysql/mysql-server/blame/8.0/sql/item_cmpfunc.cc#L5127

This is a summary of sysbench QPS for tests that use 1 and 2 threads. The "random-points" test uses a large inlist with 100 and 1000 entries (range=100, range=1000). The regression is larger for range=1000. There is no regression for point queries (fetch one row by PK) as seen in the QPS for "point-query".

point-query
1       2       threads/version
16755   35328   8.0.21
17358   36508   8.0.22

random-points, range=100
1       2       threads/version
2661    5310    8.0.21
2136    4292    8.0.22

random-points, range=1000
1       2       threads/version
323     624     8.0.21
88      174     8.0.22

How to repeat:
These assume: user/password is root/pw, host is 127.0.0.1, sysbench Lua files are from https://github.com/mdcallag/sysbench/tree/1.0/src/lua and locally copied to $sb

# Load sysbench table
sysbench --db-driver=mysql --mysql-storage-engine=innodb --range-size=100 --table-size=10000000 --tables=1 --events=0 --time=90 $sb/oltp_point_select.lua prepare --mysql-user=root --mysql-password=pw --mysql-host=127.0.0.1 --mysql-db=test

# Analyze table
mysql -uroot -ppw test -e 'analyze table sbtest1'

# Run with 100 entries in the inlist
sysbench --db-driver=mysql --range-size=100 --table-size=10000000 --tables=1 --threads=1 --events=0 --warmup-time=5 --time=90 $sb/oltp_inlist_select.lua run --mysql-user=root --mysql-password=pw --mysql-host=127.0.0.1 --mysql-db=test --rand-type=uniform --random-points=100 --skip-trx

# Run with 1000 entries in the inlist
sysbench --db-driver=mysql --range-size=1000 --table-size=10000000 --tables=1 --threads=1 --events=0 --warmup-time=5 --time=90 /home/mdcallag/d/sysbench/share/sysbench/oltp_inlist_select.lua run --mysql-user=root --mysql-password=pw --mysql-host=127.0.0.1 --mysql-db=test --rand-type=uniform --random-points=1000 --skip-trx
[20 Dec 2020 22:26] Mark Callaghan
DIfferential flamegraph, red means 8.0.22 has more overhead

Attachment: diff.rpeqdef.svg (image/svg+xml, text), 106.56 KiB.

[20 Dec 2020 22:45] Mark Callaghan
my.cnf used for the tests

[mysqld]
sql_mode=no_engine_substitution,no_unsigned_subtraction
default_tmp_storage_engine=MyISAM
tmpdir=/data/m/my
#query_cache_size=0
#query_cache_type=0
# slow_query_log_file=/data/m/my/slow.log
long_query_time=5
# skip_log_bin
log_bin=/data/m/my/binlogs/bl
max_binlog_size=128M
server_id=77
binlog_format=ROW
datadir=/data/m/my/data

innodb_file_per_table=1
innodb_data_home_dir=/data/m/my/data
innodb_data_file_path=ibdata1:512M:autoextend
innodb_log_group_home_dir=/data/m/my/txlogs
slow_query_log=1

performance_schema=1
ssl=0
default_authentication_plugin=mysql_native_password
character_set_server=latin1
collation_server=latin1_swedish_ci

innodb_fast_shutdown=1
innodb_doublewrite=1
innodb_flush_log_at_trx_commit=2
sync_binlog=0

innodb_dedicated_server=ON
innodb_buffer_pool_size=10G
innodb_log_files_in_group=40
innodb_max_dirty_pages_pct_lwm=80
innodb_max_dirty_pages_pct=90
innodb_idle_flush_pct=1
innodb_io_capacity=1000
innodb_io_capacity_max=2000

[mysqld_safe]
malloc-lib=/usr/lib/x86_64-linux-gnu/libjemalloc.so
[21 Dec 2020 14:07] MySQL Verification Team
Hi Mr. Callaghan,

Thank you for your performance improvement request.

I have done your test with 100 entries in the inlist and managed to reproduce the slowdown between 8.0.21 and 8.0.22.

Verified as reported.

Thank you for your contribution.
[21 Dec 2020 14:29] MySQL Verification Team
Correct category.
[12 Jan 2021 19:39] Mark Callaghan
Forgot that I filed something for this a few years ago:
https://bugs.mysql.com/bug.php?id=91139
[13 Jan 2021 12:06] Øystein Grøvlen
I do not think this issue is the same as https://bugs.mysql.com/bug.php?id=91139
As the flame graph shows, this new issue is about extra overhead during query execution.
Bug#91139 is about how query optimization takes longer time for long IN-lists since 2 index dives will be perform per value (unless number of values in IN-list exceeds eq_range_index_dive_limit)
[13 Jan 2021 13:19] MySQL Verification Team
Hi Mr. Callaghan,

Thank you for your comment.

However, upon detailed analysis, I have to agree with our ex-colleague, Øystein Grøvlen.
[13 Jan 2021 15:01] Mark Callaghan
Yes, he is right. But perhaps if there are fewer regressions for in-lists in the future then I will be less likely to get confused.
[29 Jun 2022 19:46] Jon Stephens
Documented fix as follows in the MySQL 8.0.31 changelog:

    Queries using WHERE column IN (list) took an increasingly excess
    amount of CPU time as the list of values grew in length.

Closed.
[30 Jun 2022 11:55] MySQL Verification Team
Jon,

Thank you very much .....