Description:
On the table performance_schema.data_lock_waits there are indexes REQUESTING_ENGINE_TRANSACTION_ID and BLOCKING_ENGINE_TRANSACTION_ID. However, these two index are "fake index". When scanning using these two indexes, a full table scan is actually performed, which may result in slow SQL.
How to repeat:
The mtr test:
--let $i = 0
CREATE TABLE t1 (c1 INT);
INSERT INTO t1 VALUES(1);
SET GLOBAL max_connections = 100000;
while ($i < 10000)
{
--connect(con$i, localhost, root,,)
SELECT * FROM t1;
if ($i < 100)
{
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT * FROM t1;
}
--inc $i
}
SELECT count(*) FROM information_schema.innodb_trx;
SELECT count(*) FROM performance_schema.data_lock_waits;
SELECT now();
--echo =============================================================
--echo
/* hdm internal mark */ select
t.trx_id AS trxId,
t.trx_mysql_thread_id AS trxMySqlThreadId,
t.trx_state AS trxStatus,
left(t.trx_query, 5000) AS trxQuery,
t.trx_rows_locked AS trxRowsLocked,
t.trx_rows_modified AS trxRowsModified,
timestampdiff(SECOND, `t`.`trx_wait_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) AS trxWaitLockTime,
timestampdiff(SECOND, `t`.`trx_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) AS trxDurationTime,
wb.REQUESTING_ENGINE_TRANSACTION_ID AS blockedTrx,
wb.REQUESTING_ENGINE_LOCK_ID AS blockedLock,
w.BLOCKING_ENGINE_TRANSACTION_ID AS blockedByTrx,
w.BLOCKING_ENGINE_LOCK_ID AS blockedByLock,
p.id AS sessionId,
p.user AS loginUser,
p.host AS host,
p.db AS db,
p.command AS command,
p.time AS sqlExecuteTime,
p.state AS sessionState,
left(p.info, 5000) AS sqlText
from information_schema.innodb_trx t
left join performance_schema.data_lock_waits w
on t.trx_id=w.REQUESTING_ENGINE_TRANSACTION_ID
left join performance_schema.data_lock_waits wb
on t.trx_id= wb.BLOCKING_ENGINE_TRANSACTION_ID
left join `information_schema`.processlist p
on t.trx_mysql_thread_id = p.id
where timestampdiff(SECOND, `t`.`trx_wait_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) > 100
or wb.REQUESTING_ENGINE_TRANSACTION_ID IS NOT NULL;
--echo
--echo ======================================================================
select now();
This test created 100,000 connections and started 100 transactions. Then, query the lock waiting information through a SQL which joins information_schema.innodb_trx t, performance_schema.data_lock_waits and information_schema`.processlist.
The execution of this sql takes about 3 seconds.
However, if we ignore the indexes REQUESTING_ENGINE_TRANSACTION_ID and BLOCKING_ENGINE_TRANSACTION_ID, rewrite the sql as:
select
t.trx_id AS trxId,
t.trx_mysql_thread_id AS trxMySqlThreadId,
t.trx_state AS trxStatus,
left(t.trx_query, 5000) AS trxQuery,
t.trx_rows_locked AS trxRowsLocked,
t.trx_rows_modified AS trxRowsModified,
timestampdiff(SECOND, `t`.`trx_wait_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) AS trxWaitLockTime,
timestampdiff(SECOND, `t`.`trx_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) AS trxDurationTime,
wb.REQUESTING_ENGINE_TRANSACTION_ID AS blockedTrx,
wb.REQUESTING_ENGINE_LOCK_ID AS blockedLock,
w.BLOCKING_ENGINE_TRANSACTION_ID AS blockedByTrx,
w.BLOCKING_ENGINE_LOCK_ID AS blockedByLock,
p.id AS sessionId,
p.user AS loginUser,
p.host AS host,
p.db AS db,
p.command AS command,
p.time AS sqlExecuteTime,
p.state AS sessionState,
left(p.info, 5000) AS sqlText
from information_schema.innodb_trx t
left join performance_schema.data_lock_waits w ignore index(REQUESTING_ENGINE_TRANSACTION_ID)
on t.trx_id=w.REQUESTING_ENGINE_TRANSACTION_ID
left join performance_schema.data_lock_waits wb ignore index(BLOCKING_ENGINE_TRANSACTION_ID)
on t.trx_id= wb.BLOCKING_ENGINE_TRANSACTION_ID
left join `information_schema`.processlist p
on t.trx_mysql_thread_id = p.id
where timestampdiff(SECOND, `t`.`trx_wait_started`, convert_tz(now(), @@time_zone, 'SYSTEM')) > 100
or wb.REQUESTING_ENGINE_TRANSACTION_ID IS NOT NULL;
This SQL is executed within 1 second.
Suggested fix:
The index REQUESTING_ENGINE_TRANSACTION_ID and BLOCKING_ENGINE_TRANSACTION_ID causes that join_type of performance_schema.data_lock_waits is ref. However, the index is "fake". So, the table performance_schema.data_lock_waits will be scanned multiple times. When igonring the index, the join_type of performance_schema.data_lock_waits is HashJoin and will be much faster.
Suggest to remove the index REQUESTING_ENGINE_TRANSACTION_ID and BLOCKING_ENGINE_TRANSACTION_ID on table performance_schema.data_lock_waits.