Bug #115511 Inplace ALTER TABLE might fail with duplicate key error if concurrent insertions
Submitted: 4 Jul 11:46 Modified: 4 Jul 13:12
Reporter: Dmitry Lenev (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[4 Jul 11:46] Dmitry Lenev
Description:
ALTER TABLE which rebuilds an InnoDB table using the INPLACE algorithm might occasionally fail with an unwarranted duplicate primary key error if there are concurrent insertions into the table, even though these insertions do not cause any PK conflict.

This issue is different from the one described in bug #98600 "Optimize table fails with duplicate entry on UNIQUE KEY": unlike in the #98600 case, the concurrent INSERTs causing the problem use non-conflicting PK values and thus do not cause PK violations even temporarily!

Users have observed this problem in the wild, and I can repeat it non-deterministically by running a concurrent workload involving INSERTs and ALTER TABLEs.

However, it is much easier to reproduce and analyze using a deterministic test case based on the DEBUG and DEBUG_SYNC facilities, so I will provide such a test case instead of a non-deterministic / highly concurrent one.

How to repeat:
Introduce a new debug sync point by applying the following patch (based on MySQL 8.0.38) and building a debug version of the server:

diff --git a/storage/innobase/ddl/ddl0par-scan.cc b/storage/innobase/ddl/ddl0par-scan.cc
index f7f948a0f74..f334a7d2f5c 100644
--- a/storage/innobase/ddl/ddl0par-scan.cc
+++ b/storage/innobase/ddl/ddl0par-scan.cc
@@ -237,6 +237,7 @@ dberr_t Parallel_cursor::scan(Builders &builders) noexcept {
             thread_ctx->get_state() != Parallel_reader::State::THREAD) {
           thread_ctx->savepoint();
           latches_released = true;
+          DEBUG_SYNC_C("ddl_bulk_inserter_latches_released");
         }
         return DB_SUCCESS;
       });

Run the following test under the MTR framework against the newly built debug server:

CREATE TABLE t1 (pk CHAR(5) PRIMARY KEY);
INSERT INTO t1 VALUES ('aaaaa'), ('bbbbb'), ('ccccc'), ('ddddd'), ('eeeee');

--connect (con1, localhost, root,,)
# We use the following debug flag to simulate overflow of key buffer in InnoDB
# online ALTER TABLE implementation.
SET DEBUG='+d,ddl_buf_add_two';
SET DEBUG_SYNC='ddl_bulk_inserter_latches_released SIGNAL latches_released WAIT_FOR go';
--send ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE

--connection default
SET DEBUG_SYNC='now WAIT_FOR latches_released';
INSERT INTO t1 VALUES ('ccaaa');
SET DEBUG_SYNC='now SIGNAL go';

--connection con1
# Instead of succeeding, ALTER TABLE fails with the
# "ERROR 1062 (23000): Duplicate entry 'ccccc' for key 't1.PRIMARY'" error
--reap

SET DEBUG='-d,ddl_buf_add_two';

--connection default
DROP TABLE t1;

Suggested fix:
It seems that the problem was introduced by WL#14283 - Parallel create index implementation in 8.0.

AFAICS it is related to the way the parallel read code used by online ALTER TABLE saves/restores the cursor position while scanning the old version of the table when it needs to flush changes to the new version of the table:

void PCursor::savepoint() noexcept {
  /* Store the cursor position on the previous user record on the page. */
  m_pcur->move_to_prev_on_page();

  m_pcur->store_position(m_mtr);

  m_mtr->commit();
}

void PCursor::resume() noexcept {
  m_mtr->start();

  m_mtr->set_log_mode(MTR_LOG_NO_REDO);

  /* Restore position on the record, or its predecessor if the record
  was purged meanwhile. */

  restore_position();

  if (!m_pcur->is_after_last_on_page()) {
    /* Move to the successor of the saved record. */
    m_pcur->move_to_next_on_page();
  }
}

For example, what happens in the above test case is:
1) ALTER TABLE tries to save the cursor position while processing the 'ccccc' record and saves the position of the previous record on the page (i.e. 'bbbbb'),
2) then, after processing inserts into the new table version, it restores the cursor to point to 'bbbbb',
3) and then moves it to the next record, intending to make it point to 'ccccc'; however, thanks to the concurrent insert, it ends up pointing to the 'ccaaa' record,
4) which means that the code will process 'ccccc' once again, causing the duplicate key error.
[4 Jul 13:12] MySQL Verification Team
Hi Mr. Lenev,

Thank you very much for your bug report.

We have already built a debug version of mysql-8.0.38 (with memory checking included). Hence, we just introduced the said DEBUG_SYNC_C sync point and ran your test case.

Indeed, we got:

"ERROR 1062 (23000): Duplicate entry 'ccccc' for key 't1.PRIMARY'" error

This is now a verified bug for 8.0 and higher supported versions.

We thank you very much for your patch contribution.

Verified.
[4 Jul 14:16] MySQL Verification Team
Hi Dmitry,

This is the full output from the test run:

------------------------------------------------------------------

==============================================================================
                  TEST NAME                       RESULT  TIME (ms) COMMENT
------------------------------------------------------------------------------
[ 50%] main.dmitry_115511                        [ fail ]
        Test ended at 2024-07-04 17:14:14

CURRENT_TEST: main.dmitry_115511
mysqltest: At line 20: Query 'reap' failed.
ERROR 1062 (23000): Duplicate entry 'ccccc' for key 't1.PRIMARY'
safe_process[33185]: Child process: 33186, exit: 1

Mysqltest client output from logfile
----------- MYSQLTEST OUTPUT START -----------

CREATE TABLE t1 (pk CHAR(5) PRIMARY KEY);
INSERT INTO t1 VALUES ('aaaaa'), ('bbbbb'), ('ccccc'), ('ddddd'), ('eeeee');
SET DEBUG='+d,ddl_buf_add_two';
SET DEBUG_SYNC='ddl_bulk_inserter_latches_released SIGNAL latches_released WAIT_FOR go';
ALTER TABLE t1 ENGINE=InnoDB, ALGORITHM=INPLACE;
SET DEBUG_SYNC='now WAIT_FOR latches_released';
INSERT INTO t1 VALUES ('ccaaa');
SET DEBUG_SYNC='now SIGNAL go';

------------ MYSQLTEST OUTPUT END -----------

-----------------------------------------------------------------