Bug #44015 Abort of insert+delete can lead to committed read scan reading inconsistent data
Submitted: 1 Apr 2009 17:04 Modified: 15 Apr 2009 2:41
Reporter: Frazer Clement
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine Severity: S2 (Serious)
Version: 6.2+ OS: Any
Assigned to: Jonas Oreland CPU Architecture: Any
Triage: Needs Triage: D1 (Critical)

[1 Apr 2009 17:04] Frazer Clement
Description:
Attached patch adds 2 testcases to testNdbApi.

Executing them against 6.2/6.3/6.4 generally results in assertion failures in TUP and LQH indicating some sort of data corruption.

  /* testNdbApi -n WeirdAssertFail
   * Generates phrase "here2" on 6.3 which is 
   * output by DbtupExecQuery::handleReadReq()
   * detecting that the record's tuple checksum
   * is incorrect.
   * Later can generate assertion failure in 
   * prepare_read
   *         ndbassert(src_len >= (dynstart - src_data));
   * resulting in node failure
   */

  /* testNdbApi -n WeirdAssertFail2
   * Results in assertion failure in DbtupCommit::execTUP_DEALLOCREQ()
   *   ndbassert(ptr->m_header_bits & Tuple_header::FREE);
   * Also, sometimes an ndbrequire failure in LQH::execACCKEYREF
   *   if (unlikely(! (tcPtr->seqNoReplica == 0 ||
   *                   errCode != ZTUPLE_ALREADY_EXIST ||
   *                   (tcPtr->operation == ZREAD && 
   *                    (tcPtr->dirtyOp || tcPtr->opSimple)))))
   *   {
   *    ...
   *    ndbrequire(false);
   *
   * Results in node failure
   */

How to repeat:
Run the testcases using standard Hugo tables against a 2-node cluster running 6.2, 6.3 or 6.4.

The testcases experiment with the theme of inserting and deleting the same rows in a single transaction, then aborting the transaction.

The point of the assertion/ndbrequire failures varies. They may not always occur, or may not occur until after some other NDBAPI errors (e.g. Out of Redo log).
[1 Apr 2009 17:04] Frazer Clement
Patch to add testcases to testNdbApi

Attachment: 62-weird-assert.patch (text/x-patch), 4.71 KiB.

[3 Apr 2009 7:47] Jonas Oreland
note: split into 2, update subject on this.

When aborting an insert+delete, the insert and delete are aborted
"separately" (since TC does not know that the operations are on the same row).

The abort of the insert comes first. If a committed read scan
(tup scan or index scan) then examines the row after the insert has been aborted
but before the delete has been aborted, it could in some cases find the row
in an inconsistent state. NOTE: backup+lcp does committed-read tup scans!

This problem was fixed for the ordered index, by always also aborting all operations *after* the operation being asked to abort.

The fix for the bug is to generalize that code, and also apply it to the actual data row.
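
The ordering problem and the fix described above can be illustrated with a small toy model. This is only a sketch and not the DBTUP code: the names `Row`, `abortSingle`, `abortWithFollowers`, `scanSeesConsistentRow` and `scanConsistentAfterInsertAbort` are hypothetical, and "inconsistent" is modelled simply as a pending operation that still refers to row storage which the aborted insert has already released.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: all names here are hypothetical, not NDB kernel symbols.
enum class OpType { Insert, Delete };

struct Row {
  bool allocated = false;       // row storage exists
  std::vector<OpType> pending;  // uncommitted operations, oldest first
};

void prepareInsert(Row& r) { r.allocated = true; r.pending.push_back(OpType::Insert); }
void prepareDelete(Row& r) { r.pending.push_back(OpType::Delete); }

// A committed-read scan takes no lock, so it can look at the row at any time.
// Assumption: the row is consistent if no pending operation refers to freed storage.
bool scanSeesConsistentRow(const Row& r) {
  return r.pending.empty() || r.allocated;
}

// Old behaviour: abort exactly one operation; undoing an insert frees the row.
void abortSingle(Row& r, std::size_t i) {
  assert(i < r.pending.size());
  if (r.pending[i] == OpType::Insert) r.allocated = false;
  r.pending.erase(r.pending.begin() + static_cast<std::ptrdiff_t>(i));
}

// Fixed behaviour: abort operation i and every operation after it in one
// step (newest first), so no scan can run between the two aborts.
void abortWithFollowers(Row& r, std::size_t i) {
  while (r.pending.size() > i) abortSingle(r, r.pending.size() - 1);
}

// Insert + delete on one row, then abort the insert and let a scan look at
// the row before any separate abort of the delete arrives.
bool scanConsistentAfterInsertAbort(bool useFix) {
  Row r;
  prepareInsert(r);
  prepareDelete(r);
  if (useFix) abortWithFollowers(r, 0);
  else abortSingle(r, 0);
  return scanSeesConsistentRow(r);
}
```

In this model `scanConsistentAfterInsertAbort(false)` returns `false`, because the delete is still pending on a freed row, while `scanConsistentAfterInsertAbort(true)` returns `true`, because both operations are undone before the scan can observe the row.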
[3 Apr 2009 7:48] Jonas Oreland
extra clarification: pk-operations and scans using any kind of lock
are not affected, since they are serialized in ACC
[3 Apr 2009 8:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71291

2897 Jonas Oreland	2009-04-03
      ndb - bug#44015 - fix abort of insert+delete, so that committed read scan can't get inbetween
[3 Apr 2009 19:55] Bugs System
Pushed into 5.1.32-ndb-6.3.24 (revid:jonas@mysql.com-20090403100824-h0lvd8lr4frk17dc) (version source revid:jonas@mysql.com-20090403100824-h0lvd8lr4frk17dc) (merge vers: 5.1.32-ndb-6.3.24) (pib:6)
[3 Apr 2009 19:56] Bugs System
Pushed into 5.1.32-ndb-7.0.5 (revid:jonas@mysql.com-20090403125707-ma9xedfo4t8oip3z) (version source revid:jonas@mysql.com-20090403125707-ma9xedfo4t8oip3z) (merge vers: 5.1.32-ndb-7.0.5) (pib:6)
[3 Apr 2009 19:57] Bugs System
Pushed into 5.1.32-ndb-6.2.18 (revid:jonas@mysql.com-20090403082438-zxbfx8pofugzjlf5) (version source revid:jonas@mysql.com-20090403082438-zxbfx8pofugzjlf5) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)
[4 Apr 2009 13:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71384

2898 Jonas Oreland	2009-04-04
      ndb - bug#44015 - apparently all tux triggers should fire first...
[4 Apr 2009 20:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71387

2898 Jonas Oreland	2009-04-04
      ndb - bug#44015 - apparently all tux triggers should fire first...
[4 Apr 2009 20:55] Bugs System
Pushed into 5.1.32-ndb-6.2.18 (revid:jonas@mysql.com-20090404205024-foid3jeg2n1xxw1w) (version source revid:jonas@mysql.com-20090404205024-foid3jeg2n1xxw1w) (merge vers: 5.1.32-ndb-6.2.18) (pib:6)
[4 Apr 2009 20:56] Bugs System
Pushed into 5.1.32-ndb-6.3.24 (revid:jonas@mysql.com-20090404205223-at44d5n9y4uovmzc) (version source revid:jonas@mysql.com-20090404205223-at44d5n9y4uovmzc) (merge vers: 5.1.32-ndb-6.3.24) (pib:6)
[4 Apr 2009 20:56] Bugs System
Pushed into 5.1.32-ndb-7.0.5 (revid:jonas@mysql.com-20090404205352-va9m5fufgc20ho8h) (version source revid:jonas@mysql.com-20090404205352-va9m5fufgc20ho8h) (merge vers: 5.1.32-ndb-7.0.5) (pib:6)
[15 Apr 2009 2:41] Jon Stephens
Documented bugfix in the NDB-6.2.18, 6.3.24, and 7.0.5 changelogs as follows:

        When aborting an operation involving both an insert and a delete, the
        insert and delete were aborted separately. This was because the
        transaction coordinator did not know that the operations affected the
        same row, and, in the case of a committed-read (tuple or index) scan,
        the abort of the insert was performed first, then the row was examined
        after the insert was aborted but before the delete was aborted. In some
        cases, this would leave the row in an inconsistent state. This could
        occur when a local checkpoint was performed during a backup. This issue
        did not affect primary key operations or scans that used locks (these
        are serialized).

        After this fix, for ordered indexes, all operations that follow the
        operation to be aborted are now also aborted.