Bug #109015 BETWEEN takes more locks then needed
Submitted: 7 Nov 2022 10:55 Modified: 8 Nov 2022 13:49
Reporter: Maksim Novikov Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version: OS:Linux
Assigned to: CPU Architecture:Any
Tags: locking mysql

[7 Nov 2022 10:55] Maksim Novikov
Description:
The problem is: why would innoDB try to get the id=6 lock is sess2 (Test 2)? It's more then needed.

(Test 2) If we take a look on locks set by these statements in performance_schema, we see that:
sess1 took locks on id=6,7,8,9,10 (that's good)
sess2 took locks on id=1,2,3,4,5 and trying to get the lock on id=6 (why?), which is blocked by sess1.

If I replace BETWEEN with "IN" or "=" operators, it will work fine. The problem appears only when using BETWEEN operator.
Tested on mysql 5.7, mysql 8.

I have not found explanation on this behaviour in docs. If it's there, put the link in comments, please.

How to repeat:
-- create simple InnoDB table and insert some test data in it
CREATE TABLE my_table (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT
 , data TINYINT DEFAULT NULL
 , PRIMARY KEY (id)
)
ENGINE = INNODB;

INSERT INTO my_table(id, data) VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1);

# Note please, all tests have trx_isolation=READ COMMITTED.

# Test 1, everything is okey here
-- sess1:
DELETE FROM my_table WHERE id BETWEEN 1 AND 5; -- ok
-- sess2:
DELETE FROM my_table WHERE id BETWEEN 6 AND 10; -- ok

# Test 2, here we have some troubles
-- sess1:
DELETE FROM my_table WHERE id BETWEEN 6 AND 10; -- ok
-- sess2:
DELETE FROM my_table WHERE id BETWEEN 1 AND 5; -- waiting for the lock on id=6
[7 Nov 2022 12:05] Maksim Novikov
Here is screenshot from performance_schema

Attachment: photo_2022-11-05_12-14-50.jpg (image/jpeg, text), 40.75 KiB.

[7 Nov 2022 15:35] MySQL Verification Team
Hi Mr. Novikov,

Thank you for your bug report.

What you have written looks like recipe for the deadlock.

Anyway, we would like to see the output from your innodb status.

Waiting on your feedback.
[7 Nov 2022 15:55] Maksim Novikov
I cut lots of unnecessary data from it, but you can clearly see that there is no deadlock here. It's just normal lock

Attachment: show_engine_innodb_status.txt (text/plain), 1.92 KiB.

[8 Nov 2022 12:37] MySQL Verification Team
Hi Mr. Novikov,

The InnoDB Status that you have sent us is not from the situation that you reporting.

We need a full output from the InnoDB Status that shows unnecessary locks !!!
[8 Nov 2022 13:21] Maksim Novikov
Okay, here is full output of innodb status

Attachment: innodb_status.txt (text/plain), 4.79 KiB.

[8 Nov 2022 13:35] Maksim Novikov
I set up local mysql server. I run two transactions (see Test 2 case) and you can find what is happening on server in innodb_status.txt file.
[8 Nov 2022 13:36] MySQL Verification Team
Hi Mr. Novikov,

Yes, that is the output that we wanted to see .......

However, this is expected behaviour with transactional engines with pessimistic locking.

A record with PK value of 6 is also locked, since, at the storage engine level, next record is locked as well, because an intersecting DML might arrive that could span part of the range. We do not document all the possible scenarios , since there are so many.

Thank you for your report.

Not a bug.
[8 Nov 2022 13:49] Maksim Novikov
So, if we lock like this, we dont have the lock conflict:
-- sess1:
DELETE FROM my_table WHERE id BETWEEN 1 AND 5; -- ok
-- sess2:
DELETE FROM my_table WHERE id BETWEEN 6 AND 10; -- ok

But, should we change the order of the statements, we DO have the lock conflict.
Seems inconsistent, according to your answer, we should have the lock conflict in both situations.
[8 Nov 2022 13:57] MySQL Verification Team
Hi,

This is expected behaviour, due to the fact that it is always about the next record and it is very much time-dependent and here we are speaking about microseconds.

These are details that are explained in the post-graduate textbooks, not in the Reference Manuals.
[1 Dec 2022 9:38] Alexey Kopytov
I think it's a problem in the server code after all. Which has always
been there, and is likely reported as another bug, but I didn't bother
to search for duplicates.

I guess the chances for this bug report to be re-verified are slim,
I'm posting the results of my analysis mostly as a note to self, so
they are not lost and hopefully become useful for someone else.

It is better to start explanations with this short function (the
source code is from MySQL 8.0.31).

int handler::read_range_next() {
  DBUG_TRACE;

  int result;
  if (eq_range) {
    /* We trust that index_next_same always gives a row in range */
    result =
        ha_index_next_same(table->record[0], end_range->key, end_range->length);
  } else {
    result = ha_index_next(table->record[0]);
    if (result) return result;

    if (compare_key(end_range) > 0) {
      /*
        The last read row does not fall in the range. So request
        storage engine to release row lock if possible.
      */
      unlock_row();
      result = HA_ERR_END_OF_FILE;
    }
  }
  return result;
}

This function is what the server uses to execute range scans for
conditions like "WHERE id BETWEEN 1 AND 5". That is, it starts a scan
with handler::read_range_first(), specifying the range endpoints, and
then iteratively calls read_range_next() until HAS_ERR_END_OF_FILE is
returned.

As seen from the code above, for equality ranges ("id=N") we trust the
storage engine to always return rows from the range. For plain ranges
we know that the storage engine may return rows outside of the range,
which is true for InnoDB, so we do an extra check if the returned row
is outside of the range, in which case we *unlock* the row and signal
the end of the range.

Which immediately answers a few questions from the bug report. We are
looking at the following 2 queries here:

(Q1) DELETE FROM my_table WHERE id BETWEEN 1 AND 5;
(Q2) DELETE FROM my_table WHERE id BETWEEN 6 AND 10; 

1. Why is no extra lock on row with id=6 taken, if Q1 is executed
before Q2 in a parallel connection? The answer is that the lock on row
with id=6 is actually taken by Q1, but immediately released. Which
goes unnoticed if locking that row succeeds, i.e. if it's not already
locked by another transaction. If it already is locked by Q2 in a
parallel connection, Q1 has to wait and our "temporary" lock on row id
6 becomes visible in performance_schema.data_locks, for example.

2. Why is there no lock on row id 6, if BETWEEN is replaced with "="
or IN() operators? Because in that case the server is scanning
multiple equality ranges, in which case InnoDB returns (and locks)
only rows belonging to the specified range(s), and we don't have to
unlock excessively locked rows outside of the range.

In MySQL 5.7, this behavior applies to all isolation levels.

In MySQL 8.0, fix for internal Bug #29508068 "UNNECESSARY NEXT-KEY LOCK
TAKEN" tried to improve that for certain cases, particularly by
introducing prebuilt->m_stop_tuple. So InnoDB, when executing a range
scan, now checks rows against range boundaries before they are locked
and passed to upper layers. Unfortunately, that improvement does not
cover READ COMMITTED, only higher isolation levels (see checks around
set_also_gap_locks in row_compare_row_to_range()). Which explains why
this bug report is specific to READ COMMITTED. At the same time,
extending that improvement to cover READ COMMITTED as well looks
feasible.