MySQL Bugs: #56961: Incorrect REDO invalidation can lead to subsequent "error reading redo log"

Bug #56961	Incorrect REDO invalidation can lead to subsequent "error reading redo log"
Submitted:	23 Sep 2010 6:13	Modified:	23 Sep 2010 10:18
Reporter:	Jonas Oreland
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	mysql-5.1-telco-6.3	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
When a node is shut down,
  it can be that it has completed and synced GCI X
  but written lots of log records belonging to X+1.

  This is in itself not a problem.
  The problem is if you then start, complete & sync X+1, and
    stop: on the next start, the log records from the original
    start must not be read.

To make sure that this does not happen, the REDO reader will, once it
  has found the last GCI to restore, scan forward and erase log records
  that were not used (and should never be used).

The bug is that this scan went through pages forward from the end of the log,
  stopping as soon as it found a page that was "empty".

However, since the REDO log is divided into several files, it could be that
  there were log records at the beginning of the next file, even if the end
  of the previous file was empty.

These records (at the beginning of the next file) were then never invalidated,
  and could after start/stop/start be reused, leading to a corrupt REDO log.
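
The failure mode described above can be illustrated with a toy model. This is only a sketch, not the NDB code: the log is modelled as a list of files, each file as a list of pages, and each page holds either the GCI of its records or a hypothetical EMPTY marker. The function name and the layout are assumptions made for illustration, and log wrap-around is ignored.

```python
EMPTY = None  # hypothetical marker for a page with no log records


def invalidate_linear(files, start_file, start_page):
    """Erase pages after the restore point, stopping at the first empty page.

    Mirrors the faulty behaviour described above: once an empty page is
    seen the scan ends, so records in a later file are never reached.
    """
    erased = []
    file_no, page_no = start_file, start_page
    while file_no < len(files):
        if page_no == len(files[file_no]):
            file_no, page_no = file_no + 1, 0  # continue in the next file
            continue
        if files[file_no][page_no] is EMPTY:
            break  # the premature stop
        files[file_no][page_no] = EMPTY
        erased.append((file_no, page_no))
        page_no += 1
    return erased


# Two files of four pages each. GCI 10 is the last GCI to restore,
# so every page carrying GCI 11 is stale and should be erased.
log = [
    [10, 10, 11, EMPTY],     # file 0: the end of the file was never written
    [11, 11, EMPTY, EMPTY],  # file 1: stale records at the beginning
]
erased = invalidate_linear(log, start_file=0, start_page=2)
leftover = [(f, p) for f, pages in enumerate(log)
            for p, gci in enumerate(pages) if gci == 11]
print(erased)    # [(0, 2)]
print(leftover)  # [(1, 0), (1, 1)]
```

The two leftover pages in file 1 are the ones that a later restart could pick up again.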

How to repeat:
new testcase

Suggested fix:
The code is rewritten to
1) scan the first page of each file until
   an empty page is found
2) backtrack to the last file whose first page was not empty,
   and scan that linearly to find the last page to invalidate
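
The two steps above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the committed patch: the log is a list of files, each a list of pages holding either a GCI or a hypothetical EMPTY marker; find_invalidation_end and invalidate are made-up names; log wrap-around is ignored.

```python
EMPTY = None  # hypothetical marker for a page with no log records


def find_invalidation_end(files, start_file, start_page):
    """Return (file_no, page_no) of the last page to invalidate, or None."""
    # Step 1: look only at the first page of each following file and
    # stop at the first file whose first page is empty.
    last_file = start_file
    nxt = start_file + 1
    while nxt < len(files) and files[nxt][0] is not EMPTY:
        last_file = nxt
        nxt += 1

    # Step 2: backtrack to the last file that held records and scan it
    # page by page until an empty page (or the end of the file).
    first = start_page if last_file == start_file else 0
    last_page = None
    for page_no in range(first, len(files[last_file])):
        if files[last_file][page_no] is EMPTY:
            break
        last_page = page_no
    return None if last_page is None else (last_file, last_page)


def invalidate(files, start_file, start_page):
    """Erase every page from the restore point up to the computed end."""
    end = find_invalidation_end(files, start_file, start_page)
    erased = []
    if end is None:
        return erased
    end_file, end_page = end
    for file_no in range(start_file, end_file + 1):
        lo = start_page if file_no == start_file else 0
        hi = end_page if file_no == end_file else len(files[file_no]) - 1
        for page_no in range(lo, hi + 1):
            files[file_no][page_no] = EMPTY
            erased.append((file_no, page_no))
    return erased


# GCI 10 is the last GCI to restore; pages with GCI 11 are stale.
log = [
    [10, 10, 11, EMPTY],     # file 0: the end of the file was never written
    [11, 11, EMPTY, EMPTY],  # file 1: stale records at the beginning
]
erased = invalidate(log, start_file=0, start_page=2)
print(erased)  # [(0, 2), (0, 3), (1, 0), (1, 1)]
```

In this sketch the empty page at the end of file 0 is rewritten as empty as well, which is harmless in the model; the point is that the stale pages at the start of file 1 are now reached.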

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118873

3301 Jonas Oreland	2010-09-23 [merge]
      ndb - bug#56961 - fix redo invalidation handling end of file not written

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/118876

3781 Jonas Oreland	2010-09-23 [merge]
      ndb - merge bug#56961 into 70

Pushed into mysql-5.1-telco-7.0 5.1.47-ndb-7.0.19 (revid:jonas@mysql.com-20100923062337-532mlci8mn53188k) (version source revid:jonas@mysql.com-20100923062337-532mlci8mn53188k) (merge vers: 5.1.47-ndb-7.0.19) (pib:21)

Pushed into mysql-5.1-telco-6.3 5.1.47-ndb-6.3.38 (revid:jonas@mysql.com-20100923061724-9lapdqjviz1kmwyx) (version source revid:jonas@mysql.com-20100923061724-9lapdqjviz1kmwyx) (merge vers: 5.1.47-ndb-6.3.38) (pib:21)

Pushed to 6.3.38, 7.0.19, and 7.1.8.

Documented bugfix in the NDB-6.3.38, 7.0.19, and 7.1.8 changelogs, as follows:

        A data node can be shut down having completed and synchronized a
        given GCI x, and having written a great many log records
        belonging to the next GCI x+1, as part of normal operations.
        However, when starting, completing, and synchronizing GCI x+1,
        the log records from the original start must not be read. To
        make sure that this does not happen, the REDO log reader finds
        the last GCI to restore, scans forward from that point, and
        erases any log records that were not (and should never be) used.

        The current issue occurred because this scan stopped as soon
        as it encountered an empty page. This was problematic
        because the REDO log is divided into several files; thus, it
        could be that there were log records in the beginning of the
        next file, even if the end of the previous file was empty. These
        log records were never invalidated; following a start or
        restart, they could be reused, leading to a corrupt REDO log.

Closed.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119379

3288 Martin Skold	2010-09-29 [merge]
      Merge
      removed:
        cluster_change_hist.txt
      modified:
        mysql-test/collections/default.experimental
        mysql-test/suite/ndb/r/ndb_database.result
        mysql-test/suite/ndb/t/ndb_database.test
        sql/ha_ndbcluster.cc
        sql/ha_ndbcluster.h
        sql/ha_ndbcluster_binlog.cc
        sql/handler.cc
        sql/handler.h
        sql/sql_show.cc
        sql/sql_table.cc
        storage/ndb/include/kernel/GlobalSignalNumbers.h
        storage/ndb/include/kernel/signaldata/FsReadWriteReq.hpp
        storage/ndb/include/mgmapi/mgmapi.h
        storage/ndb/include/ndbapi/NdbDictionary.hpp
        storage/ndb/src/kernel/blocks/ERROR_codes.txt
        storage/ndb/src/kernel/blocks/dbdict/Dbdict.cpp
        storage/ndb/src/kernel/blocks/dbdih/DbdihMain.cpp
        storage/ndb/src/kernel/blocks/dblqh/Dblqh.hpp
        storage/ndb/src/kernel/blocks/dblqh/DblqhMain.cpp
        storage/ndb/src/kernel/blocks/dbtup/Dbtup.hpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupIndex.cpp
        storage/ndb/src/kernel/blocks/dbtup/DbtupMeta.cpp
        storage/ndb/src/kernel/blocks/dbtux/Dbtux.hpp
        storage/ndb/src/kernel/blocks/dbtux/DbtuxBuild.cpp
        storage/ndb/src/kernel/blocks/dbtux/DbtuxMaint.cpp
        storage/ndb/src/kernel/blocks/dbtux/DbtuxNode.cpp
        storage/ndb/src/kernel/blocks/dbtux/DbtuxTree.cpp
        storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.cpp
        storage/ndb/src/kernel/blocks/ndbfs/AsyncFile.hpp
        storage/ndb/src/kernel/blocks/ndbfs/Ndbfs.cpp
        storage/ndb/src/kernel/blocks/ndbfs/Ndbfs.hpp
        storage/ndb/src/kernel/blocks/ndbfs/VoidFs.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.cpp
        storage/ndb/src/kernel/blocks/suma/Suma.hpp
        storage/ndb/src/kernel/main.cpp
        storage/ndb/src/ndbapi/DictCache.cpp
        storage/ndb/src/ndbapi/DictCache.hpp
        storage/ndb/src/ndbapi/NdbDictionary.cpp
        storage/ndb/src/ndbapi/NdbDictionaryImpl.cpp
        storage/ndb/src/ndbapi/NdbDictionaryImpl.hpp
        storage/ndb/test/include/NdbRestarter.hpp
        storage/ndb/test/ndbapi/testIndex.cpp
        storage/ndb/test/ndbapi/testRestartGci.cpp
        storage/ndb/test/ndbapi/testSystemRestart.cpp
        storage/ndb/test/run-test/daily-basic-tests.txt
        storage/ndb/test/src/NdbRestarter.cpp