Bug #106610 Concurrent INSERT IGNORE in InnoDB table may trigger the database to stall
Submitted: 1 Mar 2022 10:32 Modified: 4 Apr 2022 11:34
Reporter: Dirk LANGE Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Locking Severity:S2 (Serious)
Version:5.7.33 OS:Linux
Assigned to: CPU Architecture:x86
Tags: auto_increment, concurrent, innodb, insert, unique key

[1 Mar 2022 10:32] Dirk LANGE
Description:
Concurrent INSERT IGNORE into an InnoDB Table may trigger a database state, where overall statement execution stalls.
InnoDB-related statements suffer from drastically increased runtime.

Prerequisite:
- innodb_autoinc_lock_mode = 2
- InnoDB target table contains
  a primary key column using AUTO_INCREMENT, and
  another column, adding a separate unique key
- concurrent INSERT IGNORE

Observation:
All statements involving InnoDB tables appear to execute if not being locked otherwise (if no other lock condition applies), but extremely slow, virtually not progressing.
Affected are
- the triggering INSERT IGNORE themselves
- all other independently running statements on tables of type InnoDB

One of the triggering INSERT IGNORE will start executing, the other concurrent one(s) are in State „Sending data“ or „Update“.
The one running executes already extremely slow (ref. strace on the mysqld thread).

In case of exceeding innodb_lock_wait_timeout, the waiting ones will get aborted.
Even when the last waiting/locked INSERT IGNORE got aborted, the executing one keeps processing extremely slow.

The situation recovers, when the executing INSERT IGNORE finished.
In case of still waiting INSERT IGNORE, they now execute as fast as normal, if not another race condition occurs.

The runtime of the executing INSERT IGNORE appears to scale with the number of records to be processed.

In contrast, letting run the „triggering“ INSERT IGNORE sequentially will not lead to the effect described.
They would execute in fast manner.

The host machine load plunges to 1-2 regarding the respective MySQL instance.

Consequence:
In regular database production mode suffering from the effect described, the number of open connections will increase over time leading to
- reaching the maximum number of allowed database connections
- accumulating a vast number of active connections executing statements
- virtually no response by the database in terms of delivering results

The effect described happens with INSERT-like Statements (INSERT IGNORE … SELECT …) as well as “Simple INSERT”- (single-row, multi-row).
autocommit = 0|1 does not show a change in behavior.

How to repeat:
-- you will need two sessions;
-- the test uses a small number of records to show the effect;
-- the more records, the longer executing, the more collateral stalled third party statements potentially piling up

-- prepare target and source tables

CREATE TABLE `ainc` (
`sk` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`a` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`sk`),
UNIQUE INDEX `Index_2` (`a`)
)
ENGINE=InnoDB;

CREATE TABLE `refdata1` (
`a` INT(10) UNSIGNED NOT NULL
)
ENGINE=InnoDB;

CREATE TABLE `refdata2` (
`a` INT(10) UNSIGNED NOT NULL
)
ENGINE=InnoDB;

-- fill a sufficient number of records into the tables intended to be read,
-- here I just used mysql.time_zone as means to prepare about 1800 rows;
-- the more you fill in, the likely the test later will hit the effect

INSERT INTO refdata1
SELECT substring_index(1000000000*RAND(), '.', 1)
FROM mysql.time_zone;

INSERT INTO refdata2
SELECT substring_index(1000000000*RAND(), '.', 1)
FROM mysql.time_zone;

-- execute the test

-- in session 1
-- lock the target to provoke race condition

LOCK TABLES ainc READ;

-- in session 2

INSERT IGNORE ainc SELECT NULL AS sk, a FROM refdata2;

-- in session 1
-- important for the test: run both statements below in a single batch

UNLOCK TABLES; INSERT IGNORE ainc SELECT NULL AS sk, a FROM refdata1;

Result:

(0): --> single runs for reference, regular execution

   mysql> INSERT IGNORE ainc SELECT NULL AS sk, a FROM refdata2;
   Query OK, 1823 rows affected (0.02 sec)
   Records: 1823  Duplicates: 0  Warnings: 0

   mysql> INSERT IGNORE ainc SELECT NULL AS sk, a FROM refdata2;
   Query OK, 0 rows affected, 1823 warnings (0.04 sec)
   Records: 1823  Duplicates: 1823  Warnings: 1823

(A): --> in scope

   drastically increased runtime compared to regular execution

   mysql> UNLOCK TABLES; INSERT IGNORE ainc SELECT NULL AS sk, a FROM test.refdata1;
   Query OK, 0 rows affected (0.00 sec)

   Query OK, 0 rows affected, 1823 warnings (26.81 sec)
   Records: 1823  Duplicates: 1823  Warnings: 1823

   mysql> INSERT IGNORE ainc SELECT NULL AS sk, a FROM test.refdata2;
   Query OK, 0 rows affected, 1823 warnings (27.18 sec)
   Records: 1823  Duplicates: 1823  Warnings: 1823

(B): --> not in scope

   in some rare iterations the race condition may not be triggered, especially when the target table is initially empty

(C): --> not in scope

   in some iterations the test will regularly come back for one session with “ERROR 1213”
   and for the other session with the expected execution time

   mysql> UNLOCK TABLES; INSERT IGNORE ainc SELECT NULL AS sk, a FROM test.refdata1;
   Query OK, 0 rows affected (0.00 sec)

   ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

   mysql> INSERT IGNORE ainc SELECT NULL AS sk, a FROM test.refdata2;
   Query OK, 0 rows affected, 1823 warnings (0.40 sec)
   Records: 1823  Duplicates: 1823  Warnings: 1823
[1 Mar 2022 13:34] MySQL Verification Team
Hi,

Thanks for the report

kind regards
[4 Apr 2022 11:34] Dirk LANGE
Hi,

are there findings already,
have you been able to verify,
do you need further information?

Thanks