Bug #104054 Unnecessary next-key lock for partition table
Submitted: 18 Jun 2021 7:47 Modified: 24 Jun 2021 11:43
Reporter: Qingping Zhu Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Partitions Severity:S3 (Non-critical)
Version:8.0 OS:CentOS (centos 7.0)
Assigned to: CPU Architecture:x86
Tags: lock, partition table

[18 Jun 2021 7:47] Qingping Zhu
Description:
For Bug #29508068, MySQL 8.0.18 already fixed the unnecessary next key lock issue, but for partition, this issue still not fixed.
Below is the test for partition table on MySQL 8.0.25.
CREATE TABLE t1_p(c1 INT NOT NULL PRIMARY KEY, c2 INT) PARTITION BY RANGE COLUMNS (c1) (
   PARTITION p0 VALUES LESS THAN (10),
   PARTITION p1 VALUES LESS THAN (50),
   PARTITION p2 VALUES LESS THAN (MAXVALUE)
);
INSERT INTO t1_p VALUES(0, 0);
INSERT INTO t1_p VALUES(1, 1);
INSERT INTO t1_p VALUES(2, 2);
INSERT INTO t1_p VALUES(3, 3);
INSERT INTO t1_p VALUES(20, 20);
INSERT INTO t1_p VALUES(30, 30);

Time session1                                         session2
T1   BEGIN;
     SELECT * FROM t1_p WHERE c1<=1 FOR UPDATE;    
T2                                                    BEGIN;
                                                      DELETE FROM t1_p WHERE c1=2;
                                                      ERROR HY000: Lock wait timeout exceeded; try restarting transaction

How to repeat:
CREATE TABLE t1_p(c1 INT NOT NULL PRIMARY KEY, c2 INT) PARTITION BY RANGE COLUMNS (c1) (
   PARTITION p0 VALUES LESS THAN (10),
   PARTITION p1 VALUES LESS THAN (50),
   PARTITION p2 VALUES LESS THAN (MAXVALUE)
);
INSERT INTO t1_p VALUES(0, 0);
INSERT INTO t1_p VALUES(1, 1);
INSERT INTO t1_p VALUES(2, 2);
INSERT INTO t1_p VALUES(3, 3);
INSERT INTO t1_p VALUES(20, 20);
INSERT INTO t1_p VALUES(30, 30);

SET autocommit=OFF;
connect (con1,localhost,root);
--echo Connection: con1
connect (con2,localhost,root);
--echo Connection: con2

connection con1;
--echo Connection: con1
BEGIN;
SELECT * FROM t1_p WHERE c1<=1 FOR UPDATE;

connection con2;
--echo Connection: con2
BEGIN;
--error 1205
DELETE FROM t1_p WHERE c1=2;

connection con1;
ROLLBACK;

disconnect con1;
disconnect con2;

Suggested fix:
SHA-1: 4442ebadcad672b32d6680300e6379667afb7dc8
* Bug #29508068     UNNECESSARY NEXT-KEY LOCK TAKEN

This commit should adapt partition table.
[18 Jun 2021 7:54] Qingping Zhu
MTR test case

Attachment: unnecessary_lock_for_partition_table.test (application/octet-stream, text), 781 bytes.

[18 Jun 2021 12:30] MySQL Verification Team
Hi Mr. Zhu,

Thank you for your bug report.

However, this is not a bug.

The reason for this not being a bug is very simple. You have very few rows, so the optimiser correctly estimated that it is much faster to scan the table, than to perform the index lookup. It takes at least 20 to 100 rows (depending on the structure), to make index lookup more efficient than scanning all rows.

You are also mentioning our internal bug report in this public forum. How have you got that number, when it is not accessible to the public ???

If you think that the mentioned internal bug is a solution for your report, than your report is a duplicate and not the original bug report.

Not a bug.
[19 Jun 2021 1:33] Qingping Zhu
Thanks for your response, I think your analysis maybe inaccurate, so I put more information here.
We have two tables(one normal table, one partition table) with same dataset.
For the same SELECT query, optimizer both choose RANGE scan on primary key, but partition table add an extra X Record Lock on Record "c1=2" compare with non-partition table, which will block other session modify this record. Below is the results on MySQL 8.0.25:
mysql> select version();
+--------------+
| version()    |
+--------------+
| 8.0.25-debug |
+--------------+
1 row in set (0.01 sec)

mysql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `c1` int NOT NULL,
  `c2` int DEFAULT NULL,
  PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> SHOW CREATE TABLE t1_p\G
*************************** 1. row ***************************
       Table: t1_p
Create Table: CREATE TABLE `t1_p` (
  `c1` int NOT NULL,
  `c2` int DEFAULT NULL,
  PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50500 PARTITION BY RANGE  COLUMNS(c1)
(PARTITION p0 VALUES LESS THAN (10) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (50) ENGINE = InnoDB,
 PARTITION p2 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */
1 row in set (0.00 sec)

mysql> SELECT * FROM t1;
+----+------+
| c1 | c2   |
+----+------+
|  0 |    0 |
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
| 20 |   20 |
| 30 |   30 |
+----+------+
6 rows in set (0.01 sec)

mysql> SELECT * FROM t1_p;
+----+------+
| c1 | c2   |
+----+------+
|  0 |    0 |
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
| 20 |   20 |
| 30 |   30 |
+----+------+
6 rows in set (0.00 sec)

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1 WHERE c1<=1 FOR UPDATE;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT * FROM t1_p WHERE c1<=1 FOR UPDATE;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1_p  | p0         | range | PRIMARY       | PRIMARY | 4       | NULL |    2 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> SELECT * FROM t1 WHERE c1<=1 FOR UPDATE;
+----+------+
| c1 | c2   |
+----+------+
|  0 |    0 |
|  1 |    1 |
+----+------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM t1_p WHERE c1<=1 FOR UPDATE;
+----+------+
| c1 | c2   |
+----+------+
|  0 |    0 |
|  1 |    1 |
+----+------+
2 rows in set (0.00 sec)

mysql> SELECT index_name,lock_type,lock_mode,lock_status,lock_data FROM performance_schema.data_locks WHERE object_name = 't1';
+------------+-----------+-----------+-------------+-----------+
| index_name | lock_type | lock_mode | lock_status | lock_data |
+------------+-----------+-----------+-------------+-----------+
| NULL       | TABLE     | IX        | GRANTED     | NULL      |
| PRIMARY    | RECORD    | X         | GRANTED     | 0         |
| PRIMARY    | RECORD    | X         | GRANTED     | 1         |
+------------+-----------+-----------+-------------+-----------+
3 rows in set (0.00 sec)

mysql> SELECT index_name,lock_type,lock_mode,lock_status,lock_data FROM performance_schema.data_locks WHERE object_name = 't1_p';
+------------+-----------+-----------+-------------+-----------+
| index_name | lock_type | lock_mode | lock_status | lock_data |
+------------+-----------+-----------+-------------+-----------+
| NULL       | TABLE     | IX        | GRANTED     | NULL      |
| PRIMARY    | RECORD    | X         | GRANTED     | 0         |
| PRIMARY    | RECORD    | X         | GRANTED     | 1         |
| PRIMARY    | RECORD    | X         | GRANTED     | 2         |
+------------+-----------+-----------+-------------+-----------+
4 rows in set (0.00 sec)

The internal bug number that I get from MySQL 8.0.18 Release Notes. It is accessible to the public. Based on the description, I guess maybe it only fixed InnoDB non-partition table, InnoDB partition table still has this issue.
https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-18.html
InnoDB: An unnecessary next key lock was taken when performing a SELECT...FOR [SHARE|UPDATE] query with a WHERE condition that specifies a range, causing one too many rows to be locked. The most common occurrences of this issue have been addressed so that only rows and gaps that intersect the searched range are locked. (Bug #29508068)
[24 Jun 2021 11:43] Qingping Zhu
Hi, any  progress for this discussion.
[29 Nov 2022 17:01] Alexey Kopytov
The problem in the code appears to be that
ha_innopart::read_range_first() and ha_innopart::read_range_next() that
are called for partitioned tables behave differently to their
ha_innobase counterparts that are used for plain non-partitioned
tables.

ha_innobase::read_range_first/next() both set
m_prebuilt.m_is_reading_range to true:

int ha_innobase::read_range_first(const key_range *start_key,
                                  const key_range *end_key, bool eq_range_arg,
                                  bool sorted) {
  auto guard = m_prebuilt->get_is_reading_range_guard();
  return handler::read_range_first(start_key, end_key, eq_range_arg, sorted);
}

int ha_innobase::read_range_next() {
  auto guard = m_prebuilt->get_is_reading_range_guard();
  return (handler::read_range_next());
}

m_is_reading_range is later used in row_compare_row_to_range() to avoid
processing and locking unnecessary rows that are outside of the range.

That code was introduced with the fix for Bug #29508068 UNNECESSARY
NEXT-KEY LOCK TAKEN.

Based on that, it looks like the fix was indeed incomplete and extending
it to partitioned tables should be fairly straightforward.
[30 Nov 2022 13:32] MySQL Verification Team
Hi , our dear colleague Alexey,

We are afraid that it is not possible to do what you have suggested. Namely, different partition types behave totally differently when it comes to the treatment of the end-of-the-range concept.

Also, the end-of-the-range in the query itself has nothing to do with the end-of-the range of the particular partition. Hence, the extra lock.

Can this be remedies ??? Yes, it can, but it would require men-months of code re-organisation which would involve not only rewriting source code for the partitions , but it would also impact other modules in the InnoDB source code.

This would also require writing  separate interfaces for each of the partition types.

That is a reason why this report can not be treated as a bug, nor as a feature request.