Description:
In our test, running command:
SELECT * FROM performance_schema.data_lock_waits;
when there are more than 5000 read-only transactions may get stuck for more than 20s.
The performance degradation might be caused because Restartable batch scan in storage/innobase/handler/p_s.cc: Innodb_data_lock_wait_iterator::scan() is not working as excepted:
In p_s.cc data lock instrumentation implements Restartable batch scan. As described in p_s.cc:@subsection restartable_batch_scan, Innodb is expected to report all the records in the range to the performance schema.
However, this batch scan fails to work fast when it comes to read-only transactions since the trx_id of read-only transaction is trx_id_t{reinterpret_cast<uintptr_t>(trx)} | (max_trx_id + 1)) according to function trx_get_id_for_print in trx0trx.ic
The difference between the memory addresses of two transactions be always bigger than 256. So when scan traversing the entire trx_list once, only one transaction is reported. Making the scan degenerate to be as slow as described in p_s.cc:@subsection no_single_row_scan and casuing stuck.
Performance of storage/innobase/handler/p_s.cc: Innodb_data_lock_iterator::scan has the same issue.
How to repeat:
A simple way to repeat is to start more than 5000 read-only transactions and run SELECT * FROM performance_schema.data_lock_waits;
In detail, we encountered this problem in production environment, but the steps below should be sufficient to repeat:
Prepare a database testdb and a table t.
Start Many read-only transactions. The following Golang code could be used to start 10000 read-only transactions:
for i := 0; i < 10000; i++ {
Db[i], err = sql.Open("mysql", "root:@tcp(localhost:3306)/testdb")
if err != nil {
fmt.Println("Open mysql error:", err)
break
}
Db[i].Exec("start transaction read only;")
Db[i].Exec("select * from t;")
}
Some read-write transactions can also be started at will to create lock waits, this does not affect the result of being stuck.
Run SELECT * FROM performance_schema.data_lock_waits;
As a result, it got stuck for about 20s in our test.
And while it is stuck, Innodb_data_lock_iterator::scan takes the lock trx_sys_mutex , preventing all other transactions from creating or destroying.
Suggested fix:
A simple fix might be adding the following code to p_s.cc:228, the begining of function trx_id_in_range(trx_id_t trx_id) :
static const trx_id_t max_trx_id = (1ULL << (DATA_TRX_ID_LEN * CHAR_BIT)) - 1;
if (trx_id > max_trx_id) {
trx_id = trx_id >> 10;
}
The above code checks whether the trx_id is generated from the memory address of the transaction, and if so, shifts the trx_id right by 10 bits.
This operation makes the trx_id of read-only transactions more compact, so that batch scan can work fast for read-only transactions, while still consuming bounded memory and without affecting other code.
In our test following the steps described in How to repeat part, the above code can reduce the time of SELECT * FROM performance_schema.data_lock_waits; from 20s to 0.1s.