MySQL Bugs: #61500: Rare bug in redo invalidation can lead to "Error while reading the REDO log"

Bug #61500	Rare bug in redo invalidation can lead to "Error while reading the REDO log"
Submitted:	13 Jun 2011 11:27	Modified:	14 Jun 2011 14:16
Reporter:	Jonas Oreland
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	6.3.0	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
===========
Redo log positions, in write order (A is a GCI):

A    B    C

Suppose a live node writes redo log from A to C.
It will fsync the redo log
- at each GCI (A)
- at the end of each file
- at the start of each megabyte
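
The sync points listed above can be sketched as a small predicate. This is only an illustration, not the NDB implementation: the function name `must_fsync`, its parameters, and the exact boundary arithmetic are assumptions made for the example.

```python
MB = 1024 * 1024  # one megabyte, in bytes


def must_fsync(offset: int, length: int, file_size: int, is_gci: bool) -> bool:
    """Hypothetical check: should the write at `offset` be followed by fsync?

    Mirrors the three rules from the description:
    a GCI record, the end of a log file, or the start of a megabyte.
    """
    ends_file = offset + length >= file_size   # write reaches end of file
    starts_megabyte = offset % MB == 0         # write begins on a megabyte border
    return is_gci or ends_file or starts_megabyte
```

Any write for which this returns False (such as B and C in the scenario) relies on the OS to decide when, and in which order, the data reaches disk.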

Suppose that there is no end-of-file or megabyte border between
A and C, so nothing forces an fsync between them.

Then (very) rarely, the OS may make C durable on disk
while B never gets written.

This scenario could lead to a starting data node assuming that the end of the redo log is
somewhere between A and B. If the data node then starts, and is stopped again
before having overwritten C, it can encounter an "Error while reading the REDO log" at the next restart.
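
The sequence of events can be reproduced with a toy model. This is a sketch under stated assumptions, not NDB code: the `Record` type, the `run` tag (which node incarnation wrote a record), and the rule that the reader fails on a stale record following a newer one are all hypothetical stand-ins for the real redo log format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    run: int    # hypothetical: node incarnation that wrote the record
    label: str


class RedoLogError(Exception):
    pass


def find_log_end(disk: List[Optional[Record]]) -> int:
    """Scan from the start; the first unwritten slot is taken as end of log."""
    newest_run = 0
    for pos, rec in enumerate(disk):
        if rec is None:
            return pos
        if rec.run < newest_run:
            # A stale record from an older run follows newer data.
            raise RedoLogError("Error while reading the REDO log")
        newest_run = rec.run
    return len(disk)


def simulate() -> Tuple[int, str]:
    disk: List[Optional[Record]] = [None] * 4
    # Run 1: A is fsynced, B is lost, but the OS happened to persist C.
    disk[0] = Record(1, "A")
    disk[2] = Record(1, "C")
    first_end = find_log_end(disk)      # end of log seen between A and B
    # Run 2: node writes one new record, then stops before overwriting C.
    disk[first_end] = Record(2, "B'")
    try:
        find_log_end(disk)
        return first_end, ""
    except RedoLogError as exc:
        return first_end, str(exc)
```

In this model, `simulate()` reports the end of log at slot 1 on the first restart, and the second restart fails because the leftover C sits directly behind the newly written record.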

---

This is quite similar to http://bugs.mysql.com/bug.php?id=56961

How to repeat:
Reproduced (every now and then) by autotest on Solaris.

Suggested fix:
.

Pushed to 6.3.45, 7.0.26, and 7.1.15.

Documented bugfix in the NDB 6.3.45, 7.0.26, and 7.1.15 changelogs, as follows:

        When global checkpoint indexes were written with no intervening
        end-of-file or megabyte border markers, this could sometimes
        lead to a situation in which the end of the redo log was
        mistakenly regarded as being between these GCIs, so that if the
        restart of a data node took place before the start of the next
        redo log was overwritten, the node encountered "Error while
        reading the REDO log".

Closed.