Bug #111102 Performance issue in simulated AIO in sysbench oltp_insert auto_inc mode
Submitted: 22 May 2023 9:40 Modified: 22 May 2023 19:32
Reporter: Zongzhi Chen (OCA) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:8.0.* OS:CentOS
Assigned to: CPU Architecture:x86
Tags: aio

[22 May 2023 9:40] Zongzhi Chen
Description:
In sysbench simple table oltp_insert auto_inc=ON mode.
if using innodb_use_native_aio = OFF, the QPS is about 7w+

```
[ 6530s ] thds: 32 tps: 66612.49 qps: 66612.49 (r/w/o: 0.00/66612.49/0.00) lat (ms,95%): 2.35 err/s: 0.00 reconn/s: 0.00
[ 6531s ] thds: 32 tps: 66773.02 qps: 66773.02 (r/w/o: 0.00/66773.02/0.00) lat (ms,95%): 2.30 err/s: 0.00 reconn/s: 0.00
[ 6532s ] thds: 32 tps: 67010.57 qps: 67010.57 (r/w/o: 0.00/67010.57/0.00) lat (ms,95%): 2.30 err/s: 0.00 reconn/s: 0.00
[ 6533s ] thds: 32 tps: 67959.63 qps: 67959.63 (r/w/o: 0.00/67959.63/0.00) lat (ms,95%): 2.26 err/s: 0.00 reconn/s: 0.00
```

if using innodb_use_native_aio = ON, the QPS is about 11w+

this is the sysbench command

```
sysbench oltp_insert --mysql-host=xxx.xxx.xxx.x --mysql-port=2255 --mysql-password=xxx --mysql-user=xxx --tables=1 --table_size=2000000000 --threads=32 --max-time=30000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all --auto_inc=on run
```

using pt-pmp to find the the bottleneck, we can find that the btr_cur_prefetch_siblings() in btr_cur_optimistic_insert() call the wake_simulated_handler_thread() function.  in wake_simulated_handler_thread() function it need to acquire the global s_writes mutex and iterator all the slot in that segment and wake up os_aio_segment_wait_events[global_segment], this global s_writes mutex become the bottleneck.

below is the pt-pmp result:

 1 AIO::wake_simulated_handler_thread(os0file.cc:6801),AIO::wake_simulated_handler_thread(os0file.cc:6781),os_aio_simulated_wake_handler_threads(os0file.cc:6830),btr_cur_prefetch_siblings(btr0cur.cc:2660),btr_cur_optimistic_insert(btr0cur.cc:2768),row_ins_clust_index_entry_low(row0ins.cc:2514),row_ins_clust_index_entry(row0ins.cc:3102),row_ins_index_entry(row0ins.cc:3295),row_ins_index_entry_step(row0ins.cc:3295),row_ins(row0ins.cc:3295),row_ins_step(row0ins.cc:3295),row_insert_for_mysql_using_ins_graph(row0mysql.cc:1580),row_insert_for_mysql(row0mysql.cc:1706),ha_innobase::write_row(ha_innodb.cc:8581),handler::ha_write_row(handler.cc:7740),write_record(sql_insert.cc:1946),Sql_cmd_insert_values::execute_inner(sql_insert.cc:614),Sql_cmd_dml::execute(sql_select.cc:704),mysql_execute_command(sql_parse.cc:3450),mysql_parse(sql_parse.cc:5257),dispatch_command(sql_parse.cc:1765),do_command(sql_parse.cc:1273),handle_connection(connection_handler_per_thread.cc:302),pfs_spawn_thread(pfs.cc:2854),start_thread(libpthread.so.0),clone(libc.so.6)

below is the AIO::wake_simulated_handler_thread() {

  acquire();

  const Slot *slot = at(offset);

  for (ulint i = 0; i < n; ++i, ++slot) {
    if (slot->is_reserved) {
      /* Found an i/o request */

      release();

      os_event_set(os_aio_segment_wait_events[global_segment]);

      return;
    }
  }

  release();
}

How to repeat:
running the sysbench and setting the innodb_use_native_aio to ON/OFF

```
sysbench oltp_insert --mysql-host=xxx.xxx.xxx.x --mysql-port=2255 --mysql-password=xxx --mysql-user=xxx --tables=1 --table_size=2000000000 --threads=32 --max-time=30000 --report-interval=1 --rand-type=uniform --mysql-ignore-errors=all --auto_inc=on run
```

Suggested fix:
There is two way to fix this issue:

1. don't call btr_cur_prefetch_siblings() in btr_cur_optimistic_insert then we can get about 11w+ qps again.

2. another way is to fix the simulated AIO module.

we have fix this is using multi queue and divide the single s_writes->mutex into multi mutex.
[22 May 2023 12:28] MySQL Verification Team
Hi Mr. zongzhi,

Thank you very much for your bug report.

However, we need one clarification. When you talk about differences in the performances, you mention the differences of 7w+ and 11w+. Can you explain what do you mean by that.

Also, when you advise against usage of the simulated AIO, what shall we do on the systems who do not have native AIO, or when their native AIO is slower then the simulated one ???? How do you propose that we fix this, so that every operating system that we support and every version of those operating systems perform better. Please, do note, that we can change that variable so that some users get better performance and some other get worse performance.

That is a basic reason why we have that option, so that users can tune MySQL for their environment. 

We do not think that a change in that setting is acceptable at all !!!!

Last , but not least, we can not repeat your test case, since you are using tools that are not our tools. The only exception to this is `sysbench`, which we use regularly for testing.

Can't repeat.
[22 May 2023 13:12] Zongzhi Chen
Hello

7w+ and 11w+ means that the qps is approximately 70000, and 11w means that the qps is approximately 110000.

And I don't advise against usage of the simulated AIO. Actually, in our environment, we use simulated AIO instead of native since we deploy multi instance in on machine. In order to avoid the IO contend in libaio, we use simulated AIO.

I just told you guys that there is bottleneck in simulated AIO, I suggest you guys to fix it.

the tool I use is sysbench and pstack, after pstack I use pt-pmp to aggregate the pstack information. pt-pmp is develop by percona team. 
https://docs.percona.com/percona-toolkit/pt-pmp.html

can you reproduce the result again use my sysbench command by switch the innodb_use_native_aio option.
[22 May 2023 15:41] MySQL Verification Team
Hi Mr. zongzhi,

We do use both sysbench and pstack, but we do not use other tools that you are mentioning.

Hence, we need a test case without other third-party tools. Next, you are reporting a performance problem on an unknown operating system and unknown release of MySQL, we can not repeat it. On the OS that we are using difference is very, very small .....

Next, we do not see any profiling info, so your report does not help at all.

Last , but not least, version 8.0 is entering the maintenance mode, so it will not receive any performance improvement. Those will all go to 8.1, which is not out yet ....
[22 May 2023 19:32] Zongzhi Chen
The simulated AIO pstack information.

Attachment: simulated_pstack.txt (text/plain), 390.70 KiB.

[22 May 2023 19:32] Zongzhi Chen
I have update de CPU Architecture and the OS..

I don't use the third-party tool, I only use sysbench. And use pstack to get the profiling information.

And another tool I use is a tool called "pt-pmp" to aggregate the pstack information, if you don't use if before you can ignore the information.

I have attach the simulated AIO pstack information.
[23 May 2023 12:01] MySQL Verification Team
Hi,

We already wrote that, in our environment, performances are much closer.

Also, if we analyse the stacks in each of the threads, it is easy to conclude that most of the threads are in the operations that do require either waiting on condition or on the I/O.

Hence, a patch from you would be much more beneficial then the sysbench results, since (as we wrote) on our system those to options differ by 2 - 5 %.

We hope that you are capable of providing the patch that would enhance performance on your setup and that would not decrease performance on the systems where simulated AIO is faster then the native one.

We shall patiently wait for your input.