Description:
With innodb_use_native_aio=OFF, innodb_read_io_threads=64, innodb_write_io_threads=64, innodb_page_cleaners=32 and innodb_purge_threads=64, perf output shows severe lock contention.
How to repeat:
Further investigation shows that the purge thread calls wake_simulated_handler_thread, and the flushing thread calls both AIO::reserve_slot and wake_simulated_handler_thread.
/** Wakes up simulated aio i/o-handler threads if they have something to do. */
void
os_aio_simulated_wake_handler_threads()
{
    if (srv_use_native_aio) {
        /* We do not use simulated aio: do nothing */
        return;
    }
    os_aio_recommend_sleep_for_read_threads = false;
    for (ulint i = 0; i < os_aio_n_segments; i++) {
        AIO::wake_simulated_handler_thread(i);
    }
}
In the function above, the value of os_aio_n_segments is 129 with this configuration.
With 64 purge threads, each round in which every purge thread runs this loop results in 64 * 129 = 8256 calls to wake_simulated_handler_thread.
Every call to wake_simulated_handler_thread acquires the global AIO lock, so this many calls cause severe lock contention.
wake_simulated_handler_thread is also called when MySQL flushes dirty pages. With 32 page cleaner threads, that adds 32 * 129 = 4128 calls per round.
This contention is what we observed in perf.
Following this analysis requires some familiarity with the purge and flushing code paths.
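The call counts above can be reproduced with a small model. This is only a sketch: global_lock_acquisitions is a hypothetical helper, not an InnoDB function, and it assumes each caller thread runs the wake-up loop exactly once per round.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model, not InnoDB code: counts how many times the global AIO
// lock would be taken if every caller thread runs the wake-up loop once.
std::size_t global_lock_acquisitions(std::size_t caller_threads,
                                     std::size_t n_segments)
{
    assert(n_segments > 0);
    std::size_t acquisitions = 0;
    for (std::size_t t = 0; t < caller_threads; ++t) {
        // One os_aio_simulated_wake_handler_threads() call per caller thread.
        for (std::size_t i = 0; i < n_segments; ++i) {
            // Each wake_simulated_handler_thread() call does one acquire().
            ++acquisitions;
        }
    }
    return acquisitions;
}
```

With 129 segments this gives 8256 acquisitions for 64 purge threads and 4128 for 32 page cleaner threads, matching the figures in the report.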
Suggested fix:
/** Wakes up a simulated AIO I/O-handler thread if it has something to do
for a local segment in the AIO array.
@param[in] global_segment The number of the segment in the AIO arrays
@param[in] segment The local segment in the AIO array */
void
AIO::wake_simulated_handler_thread(ulint global_segment, ulint segment)
{
    ut_ad(!srv_use_native_aio);
    ulint n = slots_per_segment();
    ulint offset = segment * n;
    /* Look through n slots after the segment * n'th slot */
    acquire();
    const Slot* slot = at(offset);
    for (ulint i = 0; i < n; ++i, ++slot) {
        if (slot->is_reserved) {
            /* Found an i/o request */
            release();
            os_event_t event;
            event = os_aio_segment_wait_events[global_segment];
            os_event_set(event);
            return;
        }
    }
    release();
}
The current AIO::wake_simulated_handler_thread, shown above, acquires the global AIO lock on every call, which easily becomes a performance bottleneck. This single large lock could be split into up to 129 smaller locks, one per segment, so that waking each write or read I/O handler thread only takes the lock for its own segment.
Or you could try other methods to fix it.
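The per-segment lock idea can be sketched roughly as follows. This is a standalone illustration, not a patch: SegmentedSlots, reserve and has_pending are hypothetical names, and a real change would also have to cover AIO::reserve_slot and every other code path that touches slots under the existing lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical sketch, not InnoDB code: one mutex per segment instead of a
// single mutex for the whole slot array.
struct Slot {
    bool is_reserved = false;
};

class SegmentedSlots {
public:
    SegmentedSlots(std::size_t n_segments, std::size_t slots_per_segment)
        : m_slots_per_segment(slots_per_segment),
          m_slots(n_segments * slots_per_segment),
          m_locks(n_segments) {}

    // Marks one slot as holding a pending request, locking only its segment.
    void reserve(std::size_t segment, std::size_t index) {
        assert(segment < m_locks.size() && index < m_slots_per_segment);
        std::lock_guard<std::mutex> guard(m_locks[segment]);
        m_slots[segment * m_slots_per_segment + index].is_reserved = true;
    }

    // Returns true if the segment has a reserved slot, i.e. its handler
    // thread should be woken. Only that segment's mutex is taken, so
    // callers scanning different segments do not block each other.
    bool has_pending(std::size_t segment) {
        assert(segment < m_locks.size());
        std::lock_guard<std::mutex> guard(m_locks[segment]);
        const std::size_t offset = segment * m_slots_per_segment;
        for (std::size_t i = 0; i < m_slots_per_segment; ++i) {
            if (m_slots[offset + i].is_reserved) {
                return true;
            }
        }
        return false;
    }

private:
    std::size_t m_slots_per_segment;
    std::vector<Slot> m_slots;
    std::vector<std::mutex> m_locks;  // one lock per segment
};
```

In this model the caller would set the segment's wait event after has_pending returns true, outside the lock, in the same way the existing function calls release() before os_event_set().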