Bug #59496 Commit/abort multi op containing INSERT can lead to crash with parallel scan
Submitted: 14 Jan 2011 10:39
Modified: 26 Jan 2011 10:04
Reporter: Jonas Oreland
Status: Closed
Category: MySQL Cluster: Cluster (NDB) storage engine
Severity: S3 (Non-critical)
Version:
OS: Any
Assigned to: Jonas Oreland
CPU Architecture: Any

[14 Jan 2011 10:39] Jonas Oreland
Description:
case 1)
Committing a multi-operation transaction, INS+UPD on the same record.
The commit will first arrive on the INS and then on the UPD.
If a read-committed scan (only tup & tux) arrives in between the INS
  and the UPD, it could read incorrect data and, if the record has
  variable-sized data, maybe crash.

case 2)
Rolling back a multi-operation transaction, DEL+INS on the same record.
The abort will first arrive on the DEL and then on the INS.
If a read-committed scan (only tup & tux) arrives in between the DEL
  and the INS, it could incorrectly assume that the record should not
  be returned (i.e. treat it as an uncommitted insert).
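
Both cases share one shape: the commit or abort reaches the operations on a record one at a time, so a scan can observe the record with only part of the operation list still attached. A minimal toy model in Python (not NDB code; `window_states` and the string op names are made up for illustration) that lists what a scan could observe in that window:

```python
def window_states(ops):
    """Return the op lists a scan could observe while completion
    (commit or abort) is applied to `ops` one operation at a time,
    in order.

    Toy model only: a plain Python list stands in for whatever
    per-row operation bookkeeping the kernel uses.
    """
    states = []
    remaining = list(ops)
    while len(remaining) > 1:
        remaining = remaining[1:]       # completion has reached the first op
        states.append(list(remaining))  # a scan arriving now sees only these
    return states


# case 1: commit of INS+UPD -> a scan can see a lone UPD on a brand-new row
case1 = window_states(["INS", "UPD"])
# case 2: abort of DEL+INS -> a scan can see a lone INS on a pre-existing row
case2 = window_states(["DEL", "INS"])
print(case1, case2)
```

In both windows the one remaining operation says nothing reliable about whether a committed version of the row exists, which is what the scan needs to know.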

How to repeat:
new test program.

Frazer's new ndb_blob_big also sporadically repeats case 1.

Suggested fix:
Use the ALLOC bit rather than the op_type.
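
To illustrate why the ALLOC bit looks like the more robust signal, here is a toy comparison in Python. It is a sketch under assumptions, not the actual patch: `Row`, `skip_by_op_type` and `skip_by_alloc_bit` are hypothetical names, and the model assumes the ALLOC bit stays set for as long as the row exists only because of an insert that is not yet fully committed.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    # Assumption: True while the row exists only because of a not yet
    # fully committed insert; False for rows that existed beforehand.
    alloc: bool
    # Operations still attached to the row when the scan arrives.
    pending_ops: list = field(default_factory=list)


def skip_by_op_type(row):
    """Old-style check (hypothetical): skip iff the pending op is an INSERT."""
    return bool(row.pending_ops) and row.pending_ops[0] == "INS"


def skip_by_alloc_bit(row):
    """Suggested check (hypothetical): skip iff there is no committed version."""
    return row.alloc


# case 1: INS+UPD on a new row, the commit has reached the INS only.
mid_commit = Row(alloc=True, pending_ops=["UPD"])
# case 2: DEL+INS on an existing row, the abort has reached the DEL only.
mid_abort = Row(alloc=False, pending_ops=["INS"])

for name, row in (("case 1", mid_commit), ("case 2", mid_abort)):
    print(name,
          "op_type skips:", skip_by_op_type(row),
          "ALLOC skips:", skip_by_alloc_bit(row))
```

In this model the op_type check gets both windows wrong: it tries to read a committed version that does not exist in case 1, and it hides a row that was there all along in case 2. The ALLOC-based check does not depend on which operation happens to be left on the row.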
[14 Jan 2011 11:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/128719

3376 Jonas Oreland	2011-01-14
      ndb - bug#59496 - check ALLOC-bit instead of op == ZINSERT
[14 Jan 2011 11:04] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.40 (revid:jonas@mysql.com-20110114110220-i3y4um2ojgicyd6r) (version source revid:jonas@mysql.com-20110114110220-i3y4um2ojgicyd6r) (merge vers: 5.1.51-ndb-6.3.40) (pib:24)
[14 Jan 2011 12:08] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.21 (revid:jonas@mysql.com-20110114120551-landuyoy4k1tjyli) (version source revid:jonas@mysql.com-20110114120551-landuyoy4k1tjyli) (merge vers: 5.1.51-ndb-7.0.21) (pib:24)
[15 Jan 2011 7:49] Jonas Oreland
pushed also to 7.1.10
[26 Jan 2011 10:04] Jon Stephens
Documented in the NDB-6.3.40, 7.0.21, and 7.1.10 changelogs, as follows:

        Two related problems could occur with read-committed scans
        made in parallel with transactions combining multiple 
        (concurrent) operations:

        (1) When committing a multiple-operation transaction that
        contained concurrent INSERT and UPDATE operations on the same
        record, the commit arrived first for the INSERT and then for the
        UPDATE. If a read-committed scan arrived between these
        operations, it could thus read incorrect data; in addition, if
        the scan read variable-size data, it could cause the data node
        to fail.

        (2) When rolling back a multiple-operation transaction having
        concurrent DELETE and INSERT operations on the same record, the
        abort arrived first for the DELETE operation, and then for the
        INSERT. If a read-committed scan arrived between the DELETE and
        the INSERT, it could incorrectly assume that the record should
        not be returned (in other words, the scan treated the INSERT as
        though it had not yet been committed).

Closed.