Bug #115216 range_optimizer_max_mem_size calculation on with large IN queries
Submitted: 4 Jun 16:46 Modified: 4 Jun 17:18
Reporter: Tony Fenleish Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.34 OS:Any
Assigned to: CPU Architecture:Any

[4 Jun 16:46] Tony Fenleish
Description:
After upgrading from 5.7 to 8.0.34 and see a similar issue we found that queries with large IN clause ran a full table scan.  Table has more than 140 million rows, so a full table scan causes catastrophic CPU utilization.

MySQL 5.7
range_optimizer_max_mem_size = 8388608
my_large_table collation: general_ci

EXPLAIN SELECT * FROM my_large_table WHERE id IN (1, 2, 3, ... 10000);
-- type: range
-- key: PRIMARY
-- extra: Using where

MySQL 8.0.34
range_optimizer_max_mem_size = 8388608
my_large_table: utf8mb3_general_ci

EXPLAIN SELECT * FROM my_large_table WHERE id IN (1, 2, 3, ... 10000);
-- type: ALL
-- key: NULL
-- Using where; Using parallel query (30 columns, 0 filters, 1 exprs; 0 extra)

Updating `range_optimizer_max_mem_size=0` on MySQL 8.0.34 of course fixes this.

According to the documentatation, 10,000 items in the IN clause should result in 

230 bytes * 10000 (IN CLAUSE ids) = 2,300,000.  This is 6,088,608 bytes under the range_optimizer_max_mem_size value.

This ticket indicates that the bug exists in MySQL documentation.

https://bugs.mysql.com/bug.php?id=115062

However, something changed from 5.7 to 8.  I'm thinking it's not a bug in the documentation, but rather a bug in the way MySQL 8 estimate the amount of memory needed to process a range expression.

I found others with the same issue:
https://dba.stackexchange.com/questions/325361/performance-issues-after-upgrading-from-mys... https://dba.stackexchange.com/a/294729

How to repeat:
- Create a table with more than 10,000 rows
- Query the table using an IN clause with 10,000 ids

Note: for our table, if I decrease the number of IDs in the IN clause to 6,418 then range is used, but 6,419 IDs causes a full table scan.
[4 Jun 16:48] Tony Fenleish
upgrading severity
[4 Jun 17:18] MySQL Verification Team
Hi Mr. Renieish,

Thank you for your bug report.

However, we can not further process your report.

First of all, this is a forum for reports with the repeatable test cases. Each of those test cases consists of a set of SQL statements that always leads to the problem in question.

Next, we can not verify performance degradation reports, like yours. First of all, There are so many new features developed between 5.7 and 8.0, with literally thousands of patches pushed, so that it is not possible to find the one that lead to the great dip in the performance.

Hence, you can compare, for example, 8.0.30 with 8.0.37 or any two previous releases.

Next, you have to  specify what is the performance degradation that you measured between two 8.0 releases.

I hope that we were clear enough.

We are waiting on your full feedback.