Bug #119759 InnoDB allows reads on non-durable data, which can be lost after a crash (visibility-before-durability window)
Submitted: 22 Jan 18:55
Modified: 22 Jan 21:14
Reporter: Divyam Jaiswal
Status: Open
Category: MySQL Server: InnoDB storage engine
Severity: S2 (Serious)
Version: 8.0.42
OS: Any
Assigned to:
CPU Architecture: Any
Tags: data loss, innodb, innodb_flush_log_at_trx_commit

[22 Jan 18:55] Divyam Jaiswal
Description:
In MySQL InnoDB (tested on 8.0.42) with innodb_flush_log_at_trx_commit = 1 and no binary log, the following sequence can occur for a row inserted in a transaction:

1. The row is visible to other sessions after COMMIT returns
2. The server is then killed before the redo log flush completes
3. After restart and crash recovery, the row is lost
4. A concurrent reader had already seen the row before the crash

Because the row was visible before the crash but lost afterwards, this can look like a dirty read followed by a rollback. Strictly speaking, it is caused by visibility preceding durability rather than by a classic isolation violation.
This scenario reliably reproduces with a high-concurrency workload.

Expected Behavior

With innodb_flush_log_at_trx_commit = 1, a committed transaction should be durable and survive a server crash after COMMIT returns.
A statement such as:

    INSERT INTO t_ids (id) VALUES (X);
    COMMIT;
    SELECT id FROM t_ids WHERE id = X;

should persist the row X after a crash following the COMMIT.

Observed Behavior

Under a specific timing window, when the server is killed immediately after the COMMIT but before the redo log flush completes, the row:

* is visible to other sessions before the crash
* is missing after restart
* is rolled back by InnoDB crash recovery

This corresponds to a scenario where:

* Transaction T1 commits in memory and becomes visible
* Reader R sees the row
* kill -9 mysqld happens before redo flush
* After restart, the row is absent

Thus the server appears to have lost a committed transaction even though innodb_flush_log_at_trx_commit = 1.

Why This Happens

InnoDB’s commit path does:

1. Mark transaction as committed in memory
2. Release locks
3. Make row visible
4. Then flush redo log to disk

If a crash occurs during step 4, before the redo flush completes, crash recovery does not find the commit in the redo log and rolls back the transaction. The row is therefore absent after restart even though it was visible before the crash.
This matches the behavior described in other contexts where innodb_flush_log_at_trx_commit != 1 can lose transactions, but here it happens even with =1 due to the internal ordering of visibility vs. durability.
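
The ordering described above can be illustrated with a small toy model. This is only a sketch of the claimed sequence, not InnoDB's actual code: the class ToyEngine and all of its method and attribute names are hypothetical, and "disk" is just a Python set that survives the simulated crash.

```python
# Toy model of the commit ordering described in this report.
# All names here are hypothetical; this is not InnoDB's implementation.
class ToyEngine:
    def __init__(self):
        self.durable_commits = set()  # commit records that reached "disk"
        self.log_buffer = []          # commit records still only in memory
        self.visible = set()          # rows other sessions can read

    def commit(self, row_id, crash_before_flush=False):
        self.log_buffer.append(row_id)  # step 1: mark committed in memory
        self.visible.add(row_id)        # steps 2-3: release locks, row becomes visible
        if crash_before_flush:
            return                      # kill -9 lands here, step 4 never runs
        self.flush()                    # step 4: redo log reaches disk

    def flush(self):
        self.durable_commits.update(self.log_buffer)
        self.log_buffer.clear()

    def read(self, row_id):
        return row_id in self.visible

    def crash_and_recover(self):
        # In-memory state is lost; only durable commit records survive.
        self.log_buffer.clear()
        self.visible = set(self.durable_commits)


engine = ToyEngine()
engine.commit(1004)                           # flushed normally
engine.commit(1005, crash_before_flush=True)  # visible, but never flushed
seen_before_crash = engine.read(1005)
engine.crash_and_recover()
seen_after_restart = engine.read(1005)
print(seen_before_crash, seen_after_restart)  # True False
```

In this model the reader observes id 1005 before the crash and no longer finds it after recovery, while id 1004, whose commit record was flushed, survives.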

Impact

Even with the strictest durability setting (innodb_flush_log_at_trx_commit=1) and REPEATABLE READ, a crash can lose transactions that other sessions had already seen. This is especially problematic in high-concurrency environments with abrupt server termination.

How to repeat:
Environment

MySQL Server version: 8.0.42
Storage engine: InnoDB
innodb_flush_log_at_trx_commit = 1
sync_binlog = 0
binlog disabled
Isolation level: default (REPEATABLE READ)
Operating system: any GNU/Linux
Tested with simple primary-key insert workloads

Steps to Reproduce

1. Create a table:
    CREATE DATABASE test_crash;
    USE test_crash;
    CREATE TABLE t_ids (
        id BIGINT NOT NULL PRIMARY KEY,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    ) ENGINE = InnoDB;
2. Start multiple concurrent writers inserting increasing IDs in transactions with autocommit OFF.
3. Start a reader thread polling latest IDs.
4. Randomly crash the server:
    sleep 0.1 && kill -9 $(pidof mysqld)
5. After restart, observe that some IDs previously observed by the reader are missing.
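
Step 5 amounts to a set difference between the IDs the reader logged before the crash and the IDs still present after restart. A minimal sketch, assuming both lists have already been collected (the function name find_lost_ids and the sample values are hypothetical and not taken from the actual test harness):

```python
def find_lost_ids(observed_ids, surviving_ids):
    """Return IDs the reader saw before the crash that are absent after restart."""
    return sorted(set(observed_ids) - set(surviving_ids))


# Hypothetical sample data: IDs logged by the reader thread before the crash,
# and the result of SELECT id FROM t_ids after restart.
observed = [1001, 1002, 1003, 1004, 1005]
surviving = [1001, 1002, 1003, 1004]

lost = find_lost_ids(observed, surviving)
print(lost)  # [1005]
```

A non-empty result means at least one row was visible to the reader and then lost.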

Timeline:

T1      @ T0:000 - START TRANSACTION
T1      @ T0:005 - INSERT id=1005 (redo buffer and buffer pool updated, no fsync)
T1      @ T0:010 - COMMIT (InnoDB marks the transaction committed in memory and releases locks)
R       @ T0:011 - Reader sees id=1005
Crash   @ T0:012 - kill -9 before the redo log is flushed to disk
Restart @ T0:050 - crash recovery runs
Recovery: no redo commit for T1, so the transaction is rolled back using undo
R       @ T0:052 - SELECT does not return id=1005

Suggested fix:
Either:

1. Guarantee that visibility to other sessions occurs only after redo flush reaches disk, or
2. Document this window explicitly in MySQL/InnoDB documentation, as innodb_flush_log_at_trx_commit=1 does not in practice eliminate the visibility-before-durability window under crash conditions.
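
Option 1 can be sketched outside of InnoDB as "fsync the commit record first, publish second". The example below is only an illustration of that ordering using a plain append-only file. The class FlushFirstStore, the file name toy_redo.log, and the one-ID-per-line log format are all assumptions made for this sketch and have nothing to do with InnoDB's real redo log.

```python
import os
import tempfile


class FlushFirstStore:
    """Toy store that makes a row visible only after its commit record is fsynced."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.visible = set()  # rows other sessions may read

    def commit(self, row_id):
        # Durability first: append the commit record and force it to disk.
        with open(self.log_path, "a", encoding="ascii") as log:
            log.write(f"{row_id}\n")
            log.flush()
            os.fsync(log.fileno())
        # Visibility second: only now can readers observe the row.
        self.visible.add(row_id)

    @staticmethod
    def recover(log_path):
        # After a crash, only what is in the log file counts as committed.
        with open(log_path, encoding="ascii") as log:
            return {int(line) for line in log if line.strip()}


log_path = os.path.join(tempfile.mkdtemp(), "toy_redo.log")
store = FlushFirstStore(log_path)
store.commit(1005)

recovered = FlushFirstStore.recover(log_path)
print(store.visible <= recovered)  # True
```

With this ordering, every row a reader can see already has a commit record in the log file, so in this toy model recovery cannot drop a row that was visible.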

A related discussion in the MySQL Bug System notes that committing transactions may not be flushed to disk before a crash, leading to data loss under some configurations:
https://bugs.mysql.com/75519 — “Crash server after a transaction is committed, but before redo log is flushed. After recovery, the transaction will have been rolled back.” 
Also, InnoDB crash recovery documentation confirms that only redo logs on disk determine recovery and unflushed transactions are rolled back:
https://dev.mysql.com/doc/en/innodb-recovery.html

[22 Jan 21:14] Divyam Jaiswal
A comment in the MySQL source code mentions that this can happen, but there is no official MySQL documentation covering it.

https://github.com/mysql/mysql-server/blob/mysql-cluster-8.0.42/storage/innobase/trx/trx0t...