MySQL Bugs: #31775: NdbOperation errors are found on the NdbTransaction object

Bug #31775	NdbOperation errors are found on the NdbTransaction object
Submitted:	23 Oct 2007 5:30	Modified:	22 Oct 2008 17:23
Reporter:	Monty Taylor
Status:	Not a Bug
Category:	MySQL Cluster: NDB API	Severity:	S3 (Non-critical)
Version:	5.1.22	OS:	Any
Assigned to:	Frazer Clement	CPU Architecture:	Any

Description:
If an error occurs on an operation method, like: 

op->setValue("foo",1);

a call to op->getNdbError().message returns "No Error", while tran->getNdbError() returns a valid error.

This makes it almost impossible to write any code that operates on the operation without also having the transaction around. (This is a very common occurrence in helper methods in the NDB/Connectors) 

How to repeat:
  Ndb_cluster_connection *cluster_connection=
    new Ndb_cluster_connection(); 
  cluster_connection->connect(5,3,1);
  cluster_connection->wait_until_ready(30,30);

  Ndb* myNdb = new Ndb( cluster_connection,"TEST_DB_1" ); 
  myNdb->init();
  const NdbDictionary::Dictionary* myDict= myNdb->getDictionary();
  const NdbDictionary::Table *myTable= myDict->getTable("MYTABLENAME");
  
  NdbTransaction *myTransaction= myNdb->startTransaction();
  NdbOperation *myOperation= myTransaction->getNdbOperation(myTable);
      
  myOperation->insertTuple();
              
// HERE - put in a column that doesn't exist.
  myOperation->equal("ATTR1",1);
  printf("%s == No Error\n",myOperation->getNdbError().message);
  printf("%s == Valid Error\n",myTransaction->getNdbError().message);

Suggested fix:
A patch that went in just after 5.1.22 allows a workaround: NdbOperation now has a getTransaction() method, so code operating on an NdbOperation can call it and then call getNdbError() on the returned transaction.

But in general, having the error attached to the thing that caused it would be quite a bit clearer, since that is the prevailing pattern in the rest of the code.
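
A minimal sketch of the workaround described above, for a helper that is handed only the operation. The Stub* types and effectiveError() are hypothetical stand-ins written for this illustration, not the real NDB API classes; only the shape (an error object with a code, and an accessor from operation to transaction) follows the report.

```cpp
#include <string>

// Stand-in types: they only mimic the shape of the NDB API objects named in
// this report and are NOT the real NdbError / NdbTransaction / NdbOperation.
struct StubError {
  int code = 0;                      // 0 means "no error"
  std::string message = "No error";
};

struct StubTransaction {
  StubError error;
  const StubError& getNdbError() const { return error; }
};

struct StubOperation {
  StubError error;
  StubTransaction* owner = nullptr;
  const StubError& getNdbError() const { return error; }
  // Named after the accessor mentioned above; hypothetical in this sketch.
  StubTransaction* getTransaction() const { return owner; }
};

// Helper that receives only the operation: use the operation's own error if
// one is recorded, otherwise fall back to the owning transaction's error.
const StubError& effectiveError(const StubOperation& op) {
  if (op.getNdbError().code != 0) return op.getNdbError();
  if (op.getTransaction() != nullptr) return op.getTransaction()->getNdbError();
  return op.getNdbError();
}
```

With this shape, helper code never needs the transaction passed in separately; it reaches it through the operation only when the operation itself reports no error.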

I always wondered about this, too ...

Note that the example given above does not fail *as described*.

A separate bug, Bug #40242 (NDBAPI : All NDBAPI objects should provide accessors for their 'parents'), has been created to add accessors to the Transaction object from Blob and Scan operations.

For this bug, some further description of how errors can be detected and mapped onto particular operations is given below.

Errors during operation definition
----------------------------------
NDBAPI can generate errors during operation definition, and during operation execution. 
Errors generated during operation definition result in a failure return code from the method called.  The actual error can be determined by examining the relevant NdbOperation object, or the operation's NdbTransaction object.

Errors during operation execution
---------------------------------
Errors occurring during operation execution will result in an abort of the transaction that they are part of unless the AO_IgnoreError abort option is set for the operation.
By default, read operations are run with AO_IgnoreError, and write operations are run with AbortOnError, but this can be overridden by the user.  When an error causes transaction abort, the execute() method call will return a failure return code.  Where an error is ignored, due to AO_IgnoreError being set on the operation, the execute() method will return a success code, and the user must examine all operations for failure using NdbOperation.getNdbError().  For this reason, NdbTransaction.getNdbError() should usually be checked for errors, even if execute() returns success.  If the client application does not keep track of NdbOperation objects in-execution, then NdbTransaction.getNextCompletedOperation() can be used to iterate over them.

In all operation-specific error cases, an execution error with an operation will be marked against both the operation and the transaction object.  Where there are multiple operation errors in a single NdbTransaction.execute() call, due to operation batching and the use of AO_IgnoreError, only the first is marked against the NdbTransaction object.  The other errors will be recorded against their NdbOperation object(s) only.  

Additionally, there are some errors which occur during execution, such as a data node failure, which are marked against the transaction object, but *not* against the underlying operation objects.  This is because these errors apply to the transaction as a whole, not to individual operations within it.

For this reason, applications should use NdbTransaction.getNdbError() as the first way to determine whether an NdbTransaction.execute() call failed.  If the operation batch being executed included operations with AO_IgnoreError abort option set, then it is possible that there were multiple failures, and the completed operations should be checked individually for errors using NdbOperation.getNdbError().
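
The check order described above (transaction error first, then each completed operation) can be sketched as follows. The Mock* types and collectOperationErrors() are hypothetical stand-ins for this illustration; in particular, a std::vector replaces iteration with NdbTransaction.getNextCompletedOperation(), and the error codes used with it are arbitrary.

```cpp
#include <vector>

// Stand-in types, NOT the real NDB API classes.
struct MockError {
  int code = 0;  // 0 means "no error"
};

struct MockOperation {
  MockError error;
  const MockError& getNdbError() const { return error; }
};

struct MockTransaction {
  MockError error;                       // first error recorded, or code 0
  std::vector<MockOperation> completed;  // operations completed by execute()
  const MockError& getNdbError() const { return error; }
};

// Returns the error codes of all completed operations that failed.
// The transaction error is consulted first: if it is clear, no operation
// can have failed, so the per-operation scan is skipped.
std::vector<int> collectOperationErrors(const MockTransaction& txn) {
  std::vector<int> codes;
  if (txn.getNdbError().code == 0) return codes;
  for (const MockOperation& op : txn.completed) {
    if (op.getNdbError().code != 0) codes.push_back(op.getNdbError().code);
  }
  return codes;
}
```

An empty result together with a non-zero transaction error corresponds to the transaction-wide case (for example a data node failure), where no individual operation carries the error.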

Implicit NdbTransaction.execute() calls in scan and Blob methods
----------------------------------------------------------------
Scan operations are executed in the same way as other operations, and also have implicit execute() calls within the NdbScanOperation.nextResult() method.  When NdbScanOperation.nextResult() returns an error (e.g. -1), the transaction object should be checked for an error.  The NdbScanOperation may also contain the error, but will not if the error is not operation-specific.

Some Blob manipulation methods also have implicit internal execute() calls, and so can experience operation execution failures at these points.  The following NdbBlob methods *can* generate implicit execute() calls and so also require that the NdbTransaction object is checked for errors via NdbTransaction.getNdbError() if they return a failure return code:
 - setNull()
 - truncate()
 - readData()
 - writeData()

Summary
-------
In general, when calling one of the following methods : 
  NdbTransaction.execute()
  NdbScanOperation.nextResult()
  NdbBlob.setNull()
  NdbBlob.truncate()
  NdbBlob.readData()
  NdbBlob.writeData()
it is possible for an error to occur during execution, resulting in a failure return code.
If this happens, the NdbTransaction.getNdbError() method should be called to determine the first error that occurred.  Where operations were batched, and there were IgnoreError operations in the batch, there may be multiple operations with errors in the transaction.  These can be found by iterating the completed operations (NdbTransaction.getNextCompletedOperation()) and calling NdbOperation.getNdbError().

Where IgnoreError is set on some operations in an execution batch, the NdbTransaction.execute() method will return success in error cases that do not abort the transaction.  To determine whether there were any ignored errors, the transaction error status can be checked (NdbTransaction.getNdbError()).  If this is set to success then there were no errors.  If it is set to an error code, and operations were batched, then the completed operations should be iterated to find all the operations with ignored errors.
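
The decision described in this summary reduces to two inputs: the execute() return code and the transaction error code. A minimal sketch follows, assuming 0 means success for both values; ExecOutcome and classifyExecute() are hypothetical names used only for this illustration and are not part of NDBAPI.

```cpp
// Hypothetical classification of an execute() call's result.
enum class ExecOutcome { Ok, IgnoredErrors, Aborted };

// execResult:   return code of execute() (0 = success, assumed)
// txnErrorCode: code from the transaction's error object (0 = no error)
ExecOutcome classifyExecute(int execResult, int txnErrorCode) {
  // A failure return code means the transaction was aborted.
  if (execResult != 0) return ExecOutcome::Aborted;
  // Success with an error recorded: one or more errors were ignored, so
  // the completed operations should be iterated to find them all.
  if (txnErrorCode != 0) return ExecOutcome::IgnoredErrors;
  return ExecOutcome::Ok;
}
```

Only the IgnoredErrors case requires walking the completed operations; for Aborted, the transaction error is normally enough unless per-operation detail is needed for reporting.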

Example pseudo-code
-------------------

/* Transaction execute which may have batched operations and a mix
 * of AO_IgnoreError and AbortOnError AbortOptions
 */
  int execResult= NdbTransaction.execute(..);

  /* Check for errors regardless of execResult - errors on AO_IgnoreError operations
   * will not affect the execute() return code
   */
  NdbError err= NdbTransaction.getNdbError(); 

  if (err.code != 0)
  {
    /* Some error on the transaction, could be : 
     *  - Transaction wide error such as data node failure - transaction aborted
     *  - Single operation specific aborting error, such as constraint violation 
     *     - transaction aborted
     *  - Single operation specific ignored error such as no data found 
     *     - transaction ok
     *  - First of many operation specific ignored errors such as no data found 
     *      (when batching) - transaction ok
     *  - First of n operation specific ignored errors such as no data found
     *      (when batching) before an aborting operation error 
     *      - transaction aborted
     */

     if (execResult != 0)
     {
       /* Transaction aborted: switch on the transaction error and, only
        * if necessary for reporting, iterate over the completed operations
        * (if any) for errors.
        */
       ...
     }
     else
     {
       /* Transaction did not abort, must be some ignored error(s).
        * Let's iterate over the operation(s) to see what happened and 
        * handle/report it....
        */
       ...
     }
  }

/* NdbScanOperation nextResult() which returns -1 (failure) */

  int nextrc= NdbScanOperation.nextResult(...);

  /* Handle nextrc == 0, 1, 2 success cases */

  if (nextrc == -1)
  {
    /* First check scan operation object for error */
    NdbError err= NdbScanOperation.getNdbError();

    if (err.code == 0)
      /* No error found on scan operation object, must be a
       * transaction wide error
       */
      err= NdbTransaction.getNdbError();

    /* Handle error .... */
    ...
  }

Changing status to 'Not a Bug' because:
 - Original example given does not behave in the way described
 - Separate Bug (40242) adds inter-NDBAPI object accessor methods mentioned in discussions
 - Comments contain description of how error detection and handling should be done.

Please close the bug if you agree with this status.