Bug #112137 parallel read is slow when innodb_use_native_aio = off
Submitted: 22 Aug 2023 13:48 Modified: 14 Mar 21:47
Reporter: Ke Yu (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.33, 8.0.34 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[22 Aug 2023 13:48] Ke Yu
Description:
When innodb_use_native_aio = off and innodb_parallel_read_threads is greater than innodb_read_io_threads, parallel read is slow.

How to repeat:
step 1. start mysqld and use sysbench to prepare a table with size = 60000000.
rm -rf 110722/
bin/mysqld --initialize-insecure --basedir=$PWD --datadir=$PWD/110722 --log-error-verbosity=3
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/110722 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3306 --log-error=$PWD/110722/log.err --mysqlx-port=33330 --mysqlx-socket=/tmp/mysql_x_ushastry.sock --log-error-verbosity=3 --secure-file-priv=/tmp/ --innodb-buffer-pool-size=2105745408 --innodb-buffer-pool-instances=2 --innodb-flush-method=O_DIRECT --innodb-read-io-threads=1 --innodb-use-native-aio=off 2>&1 &
sysbench oltp_read_write --tables=1 --table_size=60000000 --report-interval=1 --mysql-db=sbtest --rand-type=uniform --threads=1 prepare

step 2. restart mysqld to empty buffer pool
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/110722 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3306 --log-error=$PWD/110722/log.err --mysqlx-port=33330 --mysqlx-socket=/tmp/mysql_x_ushastry.sock --log-error-verbosity=3  --secure-file-priv=/tmp/ --innodb-buffer-pool-size=2105745408 --innodb-buffer-pool-instances=2 --innodb-flush-method=O_DIRECT --innodb-read-io-threads=1 --innodb-use-native-aio=off 2>&1 &

step 3. set innodb_parallel_read_threads = 16, execute a select count(*).
mysql -uroot -S /tmp/mysql_ushastry.sock
mysql> set innodb_parallel_read_threads = 16;
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 60000000 |
+----------+
1 row in set (1 min 21.71 sec)

step 4. restart mysqld with --innodb-use-native-aio=on
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/110722 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3306 --log-error=$PWD/110722/log.err --mysqlx-port=33330 --mysqlx-socket=/tmp/mysql_x_ushastry.sock --log-error-verbosity=3  --secure-file-priv=/tmp/ --innodb-buffer-pool-size=2105745408 --innodb-buffer-pool-instances=2 --innodb-flush-method=O_DIRECT --innodb-read-io-threads=1 --innodb-use-native-aio=on 2>&1 &

step 5. set innodb_parallel_read_threads = 16, execute a select count(*).
mysql> set innodb_parallel_read_threads = 16;
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 60000000 |
+----------+
1 row in set (6.4 sec)

As we can see, when innodb-use-native-aio is off, it takes 1min21.71s; When innodb-use-native-aio is on, it takes 6.4s.

If we modify the source code to change the asynchronous read in parallel read to the synchronous read:
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
index b3e498f7f51..330aecf1fae 100644
--- a/storage/innobase/buf/buf0buf.cc
+++ b/storage/innobase/buf/buf0buf.cc
@@ -4019,7 +4019,7 @@ dberr_t Buf_fetch<T>::check_state(buf_block_t *&block) {
 template <typename T>
 void Buf_fetch<T>::read_page() {
   bool success{};
-  auto sync = m_mode != Page_fetch::SCAN;
+  bool sync = true;
 
   if (sync) {
     success = buf_read_page(m_page_id, m_page_size);

and then we start mysqld with innodb-use-native-aio=off:
bin/mysqld --no-defaults --basedir=$PWD --datadir=$PWD/110722 --core-file --socket=/tmp/mysql_ushastry.sock  --port=3306 --log-error=$PWD/110722/log.err --mysqlx-port=33330 --mysqlx-socket=/tmp/mysql_x_ushastry.sock --log-error-verbosity=3  --secure-file-priv=/tmp/ --innodb-buffer-pool-size=2105745408 --innodb-buffer-pool-instances=2 --innodb-flush-method=O_DIRECT --innodb-read-io-threads=1 --innodb-use-native-aio=off 2>&1 &

then do a select count(*):
mysql> set innodb_parallel_read_threads = 16;
mysql> select count(*) from sbtest1;
+----------+
| count(*) |
+----------+
| 60000000 |
+----------+
1 row in set (6.44s sec)

As we can see, after modifying the code, even if innodb-use-native-aio=off, the query speed is the same as innodb-use-native-aio=on.

Suggested fix:
We know that simulated AIO does not perform as well as native AIO, but we can compensate for this by changing the read operations in parallel read worker thread from asynchronous to synchronous.

In a parallel read worker thread, page is asynchronously read from disk, and then wait for the io thread to complete io, the function call stack is as follows:
Parallel_reader::worker
| -> Parallel_reader::Ctx::traverse
| -> | -> Parallel_reader::Ctx::traverse_recs
| -> | -> | -> Parallel_reader::Ctx::move_to_next_node
| -> | -> | -> | -> PCursor::move_to_next_block
| -> | -> | -> | -> | -> PCursor::move_to_user_rec
| -> | -> | -> | -> | -> | -> buf_page_get_gen
| -> | -> | -> | -> | -> | -> | -> Buf_fetch<Buf_fetch_other>::single_page
| -> | -> | -> | -> | -> | -> | -> | -> buf_wait_for_read

In buf_page_get_gen, Page_fetch::SCAN is used to perform asynchronous reads, and buf_wait_for_read waits for the I/O thread to complete asynchronous reads.

Between committing an asynchronous read and waiting for the asynchronous read to complete, the parallel worker thread does no other opreations, just waiting for the asynchronous io to complete. Therefore, the asynchronous io here can be changed to synchronous read.

Since the parallel work thread is also a background thread, it does not have to hand over the read operations to the io thread. In addition, the use of asynchronous read here will cause performance losses, and it is perfectly possible to use the parallel read work thread to perform synchronous read I/O operations.
[24 Aug 2023 9:05] MySQL Verification Team
Hello Ke Yu,

Thank you for the report and feedback.
I'm in the middle of verifying this and get back to you if anything further needed. Thank you.

regards,
Umesh
[24 Aug 2023 10:30] MySQL Verification Team
Hello Ke Yu,

Thank you for the report and feedback.
Verified as described.

regards,
Umesh
[24 Aug 2023 10:32] MySQL Verification Team
8.0.34 test results

Attachment: 112137.results (application/octet-stream, text), 4.60 KiB.

[2 Sep 2023 9:05] Ke Yu
Contributed patch

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: bug112137.diff (application/octet-stream, text), 1.97 KiB.

[8 Nov 2023 16:15] Marcin Babij
Posted by developer:
 
Hello Ke Yu,
Thank you for the report.

`--innodb-read-io-threads=1`
This is not a fair comparison. If you use `--innodb-read-io-threads=16` then it seems it could behave similarly. 6.4s * 16 = 1m42s, which is more than reported 1m20s, so there seems to not be much room for disadvantage.

1. Could you check if the problem persists with `--innodb-read-io-threads=16` please?
2. Is there any reason you'd not set it higher if you don't use Native AIO? If it is set low, the performance will be degraded vs Native AIO in many more cases.
3. Could you share information why you are not using Native AIO? It is highly discouraged, unless it is not possible to be used otherwise.
[9 Nov 2023 2:09] Ke Yu
Hello Marcin Babij,

1. If we use '--innodb-read-io-threads=16', the time cost of Native AIO and simulated AIO 6.4s, 10.16s respectively.

use '--innodb-read-io-threads=4', the time cost of Native AIO and simulated AIO 6.4s, 21.73s respectively.

2. In my application, there are machines with relatively small cpu cores, so the io thread count is low.

3. History reasons.

By the way, the main point I want to make is that "but we can compensate for this by changing the read operations in parallel read worker thread from asynchronous to synchronous."
[9 Nov 2023 12:06] Marcin Babij
Posted by developer:
 
1. Thank you for testing it! Is 10s quite OK for you as workaround?
2. Small CPU core count is not a problem. You can use even higher numbers if this helps your performance. These threads should be sleeping if not used, and they post a blocking IO fast and sleep on it if used. Kernel scheduler should be able to fit them all on a single core just fine. You may want to test if 32 threads will help and how they influence the CPU usage, for example when the server is idle.
3. OK. This seem interesting, may be an input to some future work in this area, please share some reasons why it is used or was used in the first place, if you don't mind.

> "but we can compensate for this by changing the read operations in parallel read worker thread from asynchronous to synchronous."
Yes, I understand, we are evaluating this change too, it would revert some of the work done in WL#12978 InnoDB:Fix imbalance during parallel scan https://dev.mysql.com/worklog/task/?id=12978 . It was not a crucial part of this WL, more like an addition. It seems it already caused performance problems that were fixed in 8.0.23 with:
  A 20µs sleep in the buf_wait_for_read() function introduced with parallel read functionality in MySQL 8.0.17 took 1ms on Windows, causing an unexpected timeout when running certain tests. Also, AIO threads were found to have uneven amounts of waiting operating system IO requests. (Bug #31095274)
[10 Nov 2023 3:17] Ke Yu
Hello Marcin Babij,

1. It is ok!
2. Yes, I understand. However, due to cpu limitations, some instances set this value lower. The native aio will post a blocking IO fast and sleep on, but the simulated AIO wil do the io itself. This consumes cpu resources
3. We will gradually use native aio in the future.
[27 Nov 2023 7:54] MySQL Verification Team
Checked against Marcin's patched build

Attachment: 112137_8.3.0.results (application/octet-stream, text), 5.05 KiB.

[14 Mar 21:47] Philip Olson
Posted by developer:
 
Fixed as of the upcoming MySQL Server 8.0.37 and 8.4.0 releases, and here's the proposed changelog entry from the documentation team:

With innodb_use_native_aio=off and innodb_read_io_threads=1, queries
performing parallel scans, such as SELECT COUNT(*), were slow.

Thank you for the bug report.
[15 Apr 18:53] Philip Olson
Posted by developer:
 
Updated release note:

With innodb_parallel_read_threads set to a value greater than 1, InnoDB
would unnecessarily request asynchronous reads which required more
synchronization during I/O completion and created a bottleneck due to the
limited number of available threads (innodb_read_io_threads) for handling
I/O operations. Now this performs synchronous instead of asynchronous
reads.