MySQL Bugs: #44015: Abort of insert+delete can lead to committed read scan reading inconsistent data

Bug #44015	Abort of insert+delete can lead to committed read scan reading inconsistent data
Submitted:	1 Apr 2009 17:04	Modified:	15 Apr 2009 2:41
Reporter:	Frazer Clement
Status:	Closed
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S2 (Serious)
Version:	6.2+	OS:	Any
Assigned to:	Jonas Oreland	CPU Architecture:	Any

Description:
Attached patch adds 2 testcases to testNdbApi.

Executing them against 6.2/6.3/6.4 generally results in assertion failures in TUP and LQH indicating some sort of data corruption.

  /* testNdbApi -n WeirdAssertFail
   * Generates phrase "here2" on 6.3 which is 
   * output by DbtupExecQuery::handleReadReq()
   * detecting that the record's tuple checksum
   * is incorrect.
   * Later can generate assertion failure in 
   * prepare_read
   *         ndbassert(src_len >= (dynstart - src_data));
   * resulting in node failure
   */

  /* testNdbApi -n WeirdAssertFail2
   * Results in assertion failure in DbtupCommit::execTUP_DEALLOCREQ()
   *   ndbassert(ptr->m_header_bits & Tuple_header::FREE);
   * Also, sometimes an ndbrequire failure in LQH::execACCKEYREF
   *   if (unlikely(! (tcPtr->seqNoReplica == 0 ||
   *                   errCode != ZTUPLE_ALREADY_EXIST ||
   *                   (tcPtr->operation == ZREAD && 
   *                    (tcPtr->dirtyOp || tcPtr->opSimple)))))
   *   {
   *    ...
   *    ndbrequire(false);
   *
   * Results in node failure
   */

How to repeat:
Run the testcases using standard Hugo tables against a 2-node cluster running 6.2, 6.3 or 6.4.

The testcases experiment with the theme of inserting and deleting the same rows in a single transaction, then aborting the transaction.

The point at which the assertion/ndbrequire failures occur varies. They may not always occur, or may not occur until after some other NDBAPI errors (e.g. Out of Redo log).

Patch to add testcases to testNdbApi

Attachment: 62-weird-assert.patch (text/x-patch), 4.71 KiB.

note: split into 2, update subject on this.

When aborting an insert+delete, the insert and delete are aborted
"separately" (since TC does not know that the operations are on the same row).

The abort of the insert comes first. If a committed read scan
(tup scan or index scan) then examines the row after the insert has been aborted
but before the delete has been aborted, it could in some cases find the row
in an inconsistent state. NOTE: backup+lcp do committed-read tup scans!

This problem was already fixed for the ordered index by also aborting all operations *after* the operation being asked to abort.

The fix for the bug is to generalize that code, and also apply it to the actual data row.
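
To illustrate the ordering problem described above, here is a minimal toy model in C++. It is not the TUP code: `Row`, `abort_single`, `abort_with_followers` and the "pending operation needs allocated storage" invariant are all hypothetical names and assumptions made up for this sketch. Aborting the later operations newest-first is also an assumption. The comment above only states that all operations after the aborted one are aborted too.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these names exist in the NDB sources.
enum class OpType { Insert, Delete };

struct Row {
  bool allocated = false;       // does the row currently have storage?
  std::vector<OpType> pending;  // uncommitted operations, in execution order
};

// Assumed invariant for this sketch: a row that still has pending
// operations must still have its storage.
bool is_consistent(const Row& row) {
  return row.pending.empty() || row.allocated;
}

// Abort exactly one operation. Undoing an insert releases the row storage.
void abort_single(Row& row, std::size_t idx) {
  if (row.pending[idx] == OpType::Insert) {
    row.allocated = false;
  }
  row.pending.erase(row.pending.begin() + static_cast<std::ptrdiff_t>(idx));
}

// Abort the operation at idx together with every operation after it,
// newest first (the order is an assumption of this sketch).
void abort_with_followers(Row& row, std::size_t idx) {
  while (row.pending.size() > idx) {
    abort_single(row, row.pending.size() - 1);
  }
}

// Returns what a committed-read scan would find if it looked at the row
// right after the abort of the insert was processed.
bool row_consistent_after_insert_abort(bool abort_followers) {
  Row row;
  row.allocated = true;
  row.pending = {OpType::Insert, OpType::Delete};
  if (abort_followers) {
    abort_with_followers(row, 0);
  } else {
    abort_single(row, 0);
  }
  return is_consistent(row);
}
```

In this model, `row_consistent_after_insert_abort(false)` returns false, because the delete is still pending on a row whose storage is gone. `row_consistent_after_insert_abort(true)` returns true, because the delete is aborted in the same step, so a scan never sees the intermediate state.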

extra clarification: pk-operations or scans using any kind of lock
are not affected, since they are serialized in ACC

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71291

2897 Jonas Oreland	2009-04-03
      ndb - bug#44015 - fix abort of insert+delete, so that committed read scan can't get inbetween

Pushed into 5.1.32-ndb-6.3.24 (revid:jonas@mysql.com-20090403100824-h0lvd8lr4frk17dc) (version source revid:jonas@mysql.com-20090403100824-h0lvd8lr4frk17dc) (merge vers: 5.1.32-ndb-6.3.24) (pib:6)

Pushed into 5.1.32-ndb-7.0.5 (revid:jonas@mysql.com-20090403125707-ma9xedfo4t8oip3z) (version source revid:jonas@mysql.com-20090403125707-ma9xedfo4t8oip3z) (merge vers: 5.1.32-ndb-7.0.5) (pib:6)

Pushed into 5.1.32-ndb-6.2.18 (revid:jonas@mysql.com-20090403082438-zxbfx8pofugzjlf5) (version source revid:jonas@mysql.com-20090403082438-zxbfx8pofugzjlf5) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71384

2898 Jonas Oreland	2009-04-04
      ndb - bug#44015 - apparently all tux triggers should fire first...

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71387

2898 Jonas Oreland	2009-04-04
      ndb - bug#44015 - apparently all tux triggers should fire first...

Pushed into 5.1.32-ndb-6.2.18 (revid:jonas@mysql.com-20090404205024-foid3jeg2n1xxw1w) (version source revid:jonas@mysql.com-20090404205024-foid3jeg2n1xxw1w) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)

Pushed into 5.1.32-ndb-6.3.24 (revid:jonas@mysql.com-20090404205223-at44d5n9y4uovmzc) (version source revid:jonas@mysql.com-20090404205223-at44d5n9y4uovmzc) (merge vers: 5.1.32-ndb-6.3.24) (pib:6)

Pushed into 5.1.32-ndb-7.0.5 (revid:jonas@mysql.com-20090404205352-va9m5fufgc20ho8h) (version source revid:jonas@mysql.com-20090404205352-va9m5fufgc20ho8h) (merge vers: 5.1.32-ndb-7.0.5) (pib:6)

Documented bugfix in the NDB-6.2.18, 6.3.24, and 7.0.5 changelogs as follows:

        When aborting an operation involving both an insert and a delete, the
        insert and delete were aborted separately. This was because the
        transaction coordinator did not know that the operations affected the
        same row, and, in the case of a committed-read (tuple or index) scan,
        the abort of the insert was performed first, then the row was examined
        after the insert was aborted but before the delete was aborted. In some
        cases, this would leave the row in an inconsistent state. This could
        occur when a local checkpoint was performed during a backup. This issue
        did not affect primary key operations or scans that used locks (these
        are serialized).

        After this fix, for ordered indexes, all operations that follow the
        operation to be aborted are now also aborted.