Bug #116258 The memory consumption of the range optimizer is incorrect about 230 bytes
Submitted: 28 Sep 15:19
Reporter: chong zhang Email Updates:
Status: Open Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:5.7,8.0 OS:Any
Assigned to: CPU Architecture:Any

[28 Sep 15:19] chong zhang
Description:
The official documentation does not accurately describe the memory consumption of the range optimizer, see https://dev.mysql.com/doc/refman/8.0/en/range-optimization.html.

There is such a description,

To estimate the amount of memory needed to process a rangeexpression, use these guidelines: 

•For a simple query such as the following, where there isone candidate key for the range access method, eachpredicate combined with or uses approximately 230 bytes:

SELECT COUNT(*) FROM t WHERE a=1 OR a=2 OR a=3 OR .. . a=N; 

I found that the number of fields, length, and index count are all factors that affect the memory usage of the scope optimizer.

How to repeat:
set sql_mode='';
set session range_optimizer_max_mem_size=default;

drop table if exists t;
 create table t (
  `id` int unsigned not null auto_increment,
  primary key (`id`),
  key(id)
) engine=innodb ;

truncate table t;
set @id:=0;
replace into t(id) values
(@id:=@id+1),(@id:=@id+1),(@id:=@id+1),(@id:=@id+1),(@id:=@id+1);
replace into t(id) select @id:=@id+1 from t,t a,t b,t c,t d,t e,t f,t g limit
40000;

mysql>  select count(id) from t where id=1 or id=2 or id=3 or id=4 or id=5 or id=6 or id=7 or id=8 or id=9 or id=10 or  id=11 or id=12;
+-----------+
| count(id) |
+-----------+
|        12 |
+-----------+
1 row in set (0.00 sec)

     select high_number_of_bytes_used AS range_memory  from
       performance_schema.memory_summary_by_thread_by_event_name
 a,performance_schema.threads b
       where b.processlist_id=connection_id() and a.thread_id=b.thread_id and
 event_name='memory/sql/test_quick_select';	

+--------------+
| range_memory |
+--------------+
|        10256 |
+--------------+
1 row in set (0.01 sec)

If it is correct for a predicate to use approximately 230 bytes of memory, then 4104 bytes of memory should satisfy ranges 18, while actual ranges 12 occupy 10256 bytes of memory.

Suggested fix:
SQL often uses the following writing style,
1、 SELECT COUNT(*) FROM t WHERE a=1 OR a=2 OR a=3 OR .. . a=N; 
2、 SELECT COUNT(*) FROM t WHERE a in (1,2,3,..,n);

When SQL with similar features uses the range optimizer beyond the value of range_optimizer_max_mem_stize, The optimizer will abandon range scanning and consider other execution plans, resulting in selecting the wrong execution plan.

The first point,
I hope the community can provide a correct evaluation method for the memory usage of the scope optimizer. The explicit calculation formula for the number of ins and the memory range of the optimizer used.

The second point,
range_optimizer_max_mem_stize defaults to 8M. If it is specified that the number of ins does not exceed 100, will the memory usage of the range optimizer never exceed 8M.

Looking forward to your reply!