Bug #115623 Is it necessary to access leaves when generating Ranges in parallel scan(count*)
Submitted: 17 Jul 2024 8:58 Modified: 17 Jul 2024 12:21
Reporter: Ruyi Zhang (OCA) Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S5 (Performance)
Version:8.0.38 OS:Any
Assigned to: CPU Architecture:Any
Tags: count (*)

[17 Jul 2024 8:58] Ruyi Zhang
Description:
I found that when MySQL executes count(*), the SQL thread will perform Ranges splitting on BTREE and access the leaf page corresponding to each Range. I'm thinking that if the leaf pages are not in the buffer-pool, then the split thread will take a lot of time to load the leaf pages. In this scenario, if the tree has only 2 levels, parallel scanning will not be significantly faster than serial scanning. So what I want to know is whether it is necessary to access leaf pages during splitting ranges. Non-leaf nodes should already contain the complete key value. Distributing the task of accessing leaf nodes to parallel scan threads may save more IO time. Thanks.

How to repeat:
No need to reproduce
[17 Jul 2024 9:51] MySQL Verification Team
Hi Mr. Zhang,

Thank you for your bug report.

We do not see which storage engine you are discussing here, but we suppose that it is about the InnoDB SE.

Since InnoDB SE is MVCC SE, then SELECT COUNT(*) has to literally count all rows  that is visible in that session.

That is why you can get the approximate count number in our Information Schema.

Not a bug.
[17 Jul 2024 10:23] Ruyi Zhang
Thank you very much. What I described is indeed InnoDB. More specifically, count(*) is implemented through the function row_mysql_parallel_select_count_star. As mentioned above, when BTREE is split into Ranges, the split thread may spend more time loading leaf pages to buffer-pool, if access to leaf pages is given to the worker thread, this part of the IO time may be saved.
[17 Jul 2024 10:55] MySQL Verification Team
Hi Mr. Zhang,

According to MVCC standards, each concurrent session sees it's own values of the number of rows, regardless if there is a condition or not.

Also, InnoDB has a clustered primary key. If a primary key is not specified, then an invisible primary key is generated. Hence, primary key is stored with the entire record.

Count of the rows is done by the primary key entries. Hence, there you do not have separate pages for the leaf pages or node pages.

Due to that organisation of the InnoDB, it is much faster to count the rows sequentially then to jump through the BTREE.

Counting on some condition, that can be met by some index is a totally different affair.

We recommend that when you need a total count of the rows in some InnoDB table, you use Information Schema.

Not a bug.
[17 Jul 2024 11:29] Ruyi Zhang
Thanks, yes it is necessary for InnoDB to access all leaves when executing count(*). But what I want to confirm is: the Parallel_reader::Scan_ctx::partition function splits BTREE (clustered) into multiple Ranges, and each Range will scan its corresponding leaf pages in parallel through different threads (innodb_parallel_read_threads). Since Parallel_reader::Scan_ctx::partition can obtain Range information by accessing non-leaf nodes, is it still necessary to access leaf pages at the same time?
[17 Jul 2024 12:15] MySQL Verification Team
Hi,

With primary index, leaf pages do not exist. Only node pages exist.

Leaf pages are the records themselves, kept in the data pages of the InnoDB.

Also, it is a single operation, not performed in parallel, due to MVCC restrictions.
[17 Jul 2024 12:21] Ruyi Zhang
Thanks,
 In fact, 8.0.38 already supports the feature of parallel execution of COUNT(*)
https://dev.mysql.com/worklog/task/?id=11720