Bug #31775 NdbOperation errors are found on the NdbTransaction object
Submitted: 23 Oct 2007 7:30 Modified: 22 Oct 2008 19:23
Reporter: Monty Taylor
Status: Not a Bug
Category:Server: NDBAPI Severity:S3 (Non-critical)
Version:5.1.22 OS:Any
Assigned to: Frazer Clement Target Version:
Triage: Triaged: D3 (Medium)

[23 Oct 2007 7:30] Monty Taylor
Description:
If an error occurs on an operation method, like: 

op->setValue("foo",1);

a call to op->getNdbError().message returns "No Error", while tran->getNdbError() returns
a valid error.

This makes it almost impossible to write any code that operates on the operation without
also having the transaction around. (This is a very common occurrence in helper methods
in the NDB/Connectors) 

How to repeat:
  Ndb_cluster_connection *cluster_connection=
    new Ndb_cluster_connection(); 
  cluster_connection->connect(5,3,1);
  cluster_connection->wait_until_ready(30,30);

  Ndb* myNdb = new Ndb( cluster_connection,"TEST_DB_1" ); 
  myNdb->init();
  const NdbDictionary::Dictionary* myDict= myNdb->getDictionary();
  const NdbDictionary::Table *myTable= myDict->getTable("MYTABLENAME");
  
  NdbTransaction *myTransaction= myNdb->startTransaction();
  NdbOperation *myOperation= myTransaction->getNdbOperation(myTable);
      
  myOperation->insertTuple();
              
// HERE - put in a column that doesn't exist.
  myOperation->equal("ATTR1",1);
  printf("%s == No Error\n",myOperation->getNdbError().message);
  printf("%s == Valid Error\n",myTransaction->getNdbError().message);

Suggested fix:
Well, a patch that went in just after 5.1.22 allows a workaround: NdbOperation now has a
getTransaction() method, so code operating on an NdbOperation can call that and then call
getNdbError() on the result.

But in general, having the error attached to the thing that caused it would be quite a
bit clearer, since that is the prevailing pattern in the rest of the code.
[24 Oct 2007 12:35] Hartmut Holzgraefe
I always wondered about this, too ...
[22 Oct 2008 19:20] Frazer Clement
Note that the example given above does not fail *as described*.

A separate bug, #40242 ("NDBAPI : All NDBAPI objects should provide accessors for their
'parents'"), has been created to add accessors to the Transaction object from Blob and
Scan operations.

For this bug, some further description of how errors can be detected and mapped onto
particular operations is given below.

Errors during operation definition
----------------------------------
NDBAPI can generate errors during operation definition, and during operation execution. 
Errors generated during operation definition result in a failure return code from the
method called.  The actual error can be determined by examining the relevant NdbOperation
object, or the operation's NdbTransaction object.
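
As a minimal sketch of that lookup: after a definition method returns a failure code, check
the operation first and fall back to the transaction.  StubError, StubOperation and
StubTransaction are hypothetical stand-ins that mimic only getNdbError(), so that the
sketch compiles without the NDB API headers; operationOrTransactionError() is a
hypothetical helper name, and treating code 0 as "no error" follows the pseudo-code in
this comment.  Using it with the real NdbOperation and NdbTransaction classes is an
assumption that has not been checked against the headers.

```cpp
#include <cassert>

// Hypothetical stand-ins, NOT the real NDB API types.
struct StubError {
  int code = 0;                      // 0 is treated as "no error"
  const char* message = "No error";
};

struct StubOperation {
  StubError error;
  const StubError& getNdbError() const { return error; }
};

struct StubTransaction {
  StubError error;
  const StubError& getNdbError() const { return error; }
};

// Hypothetical helper: return the operation's error if one is recorded
// there, otherwise the error recorded on the transaction.
template <typename Op, typename Trans>
auto operationOrTransactionError(const Op* op, const Trans* trans)
    -> decltype(op->getNdbError()) {
  assert(op != nullptr && trans != nullptr);
  const auto& opError = op->getNdbError();
  if (opError.code != 0) return opError;
  return trans->getNdbError();
}
```

A caller would invoke the helper only after a definition method has returned a failure
code, and then report the message of whichever error comes back.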

Errors during operation execution
---------------------------------
Errors occurring during operation execution will result in an abort of the transaction
that they are part of unless the AO_IgnoreError abort option is set for the operation. 
By default, read operations are run with AO_IgnoreError, and write operations are run
with AbortOnError, but this can be overridden by the user.  When an error causes
transaction abort, the execute() method call will return a failure return code.  Where an
error is ignored, due to AO_IgnoreError being set on the operation, the execute() method
will return a success code, and the user must examine all operations for failure using
NdbOperation.getNdbError().  For this reason, NdbTransaction.getNdbError() should usually
be checked for errors, even if execute() returns success.  If the client application does
not keep track of NdbOperation objects in-execution, then
NdbTransaction.getNextCompletedOperation() can be used to iterate over them.

In all operation-specific error cases, an execution error with an operation will be
marked against both the operation and the transaction object.  Where there are multiple
operation errors in a single NdbTransaction.execute() call, due to operation batching and
the use of AO_IgnoreError, only the first is marked against the NdbTransaction object. 
The other errors will be recorded against their NdbOperation object(s) only.  

Additionally, there are some errors which occur during execution, such as a data node
failure, which are marked against the transaction object, but *not* against the
underlying operation objects.  This is because these errors apply to the transaction as a
whole, not to individual operations within it.

For this reason, applications should use NdbTransaction.getNdbError() as the first way to
determine whether an NdbTransaction.execute() call failed.  If the operation batch being
executed included operations with AO_IgnoreError abort option set, then it is possible
that there were multiple failures, and the completed operations should be checked
individually for errors using NdbOperation.getNdbError().
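
The per-operation check can be sketched as a loop over the completed operations.  The
stub types below are hypothetical stand-ins so that the sketch compiles without the NDB
API headers, and failedOperationCodes() is a hypothetical helper name.  The sketch
assumes that getNextCompletedOperation() takes the previous operation, with a null
pointer meaning "start from the first", and returns a null pointer at the end of the
list; that convention should be confirmed against the NDB API documentation before
relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, NOT the real NDB API types.
struct StubError { int code = 0; };   // 0 is treated as "no error"

struct StubOperation {
  StubError error;
  const StubError& getNdbError() const { return error; }
};

struct StubTransaction {
  std::vector<StubOperation> completed;

  // Assumed iteration convention: nullptr yields the first completed
  // operation, any other pointer yields its successor, nullptr at the end.
  const StubOperation* getNextCompletedOperation(const StubOperation* prev) const {
    if (prev == nullptr) return completed.empty() ? nullptr : &completed.front();
    const std::size_t next = static_cast<std::size_t>(prev - completed.data()) + 1;
    return next < completed.size() ? &completed[next] : nullptr;
  }
};

// Hypothetical helper: collect the error code of every completed
// operation that has an error recorded against it.
template <typename Trans>
std::vector<int> failedOperationCodes(const Trans* trans) {
  assert(trans != nullptr);
  std::vector<int> codes;
  for (auto* op = trans->getNextCompletedOperation(nullptr); op != nullptr;
       op = trans->getNextCompletedOperation(op)) {
    const int code = op->getNdbError().code;
    if (code != 0) codes.push_back(code);
  }
  return codes;
}
```

An empty result together with a non-zero transaction error would point to a
transaction-wide error, which is recorded on the transaction only.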

Implicit NdbTransaction.execute() calls in scan and Blob methods
----------------------------------------------------------------
Scan operations are executed in the same way as other operations, and also have implicit
execute() calls within the NdbScanOperation.nextResult() method.  When
NdbScanOperation.nextResult() returns an error (e.g. -1), the transaction object should
be checked for an error.  The NdbScanOperation object will also hold the error, but only
if the error is operation-specific.

Some Blob manipulation methods also have implicit internal execute() calls, and so can
experience operation execution failures at these points.  The following NdbBlob methods
*can* generate implicit execute() calls and so also require that the NdbTransaction
object is checked for errors via NdbTransaction.getNdbError() if they return a failure
return code:
 - setNull()
 - truncate()
 - readData()
 - writeData()

Summary
-------
In general, when calling one of the following methods : 
  NdbTransaction.execute()
  NdbScanOperation.nextResult()
  NdbBlob.setNull()
  NdbBlob.truncate()
  NdbBlob.readData()
  NdbBlob.writeData()
it is possible for an error to occur during execution, resulting in a failure return
code.  If this happens, the NdbTransaction.getNdbError() method should be called to
determine the first error that occurred.  Where operations were batched, and there were
AO_IgnoreError operations in the batch, there may be multiple operations with errors in
the transaction.
These can be found by iterating the completed operations
(NdbTransaction.getNextCompletedOperation()) and calling NdbOperation.getNdbError().

Where AO_IgnoreError is set on some operations in an execution batch, the
NdbTransaction.execute() method will return success when the only errors are ones that do
not abort the transaction.  To determine whether there were any ignored errors, the
transaction error status
can be checked (NdbTransaction.getNdbError()).  If this is set to success then there were
no errors.  If it is set to an error code, and operations were batched, then the completed
operations should be iterated to find all the operations with ignored errors.

Example pseudo-code
-------------------

/* Transaction execute which may have batched operations and a mix
 * of AO_IgnoreError and AbortOnError AbortOptions
 */
  int execResult= NdbTransaction.execute(..);

  /* Check for errors regardless of execResult - errors on AO_IgnoreError operations
   * will not affect the execute() return code
   */
  NdbError err= NdbTransaction.getNdbError(); 

  if (err.code != 0)
  {
    /* Some error on the transaction, could be : 
     *  - Transaction wide error such as data node failure - transaction aborted
     *  - Single operation specific aborting error, such as constraint violation 
     *     - transaction aborted
     *  - Single operation specific ignored error such as no data found 
     *     - transaction ok
     *  - First of many operation specific ignored errors such as no data found 
     *      (when batching) - transaction ok
     *  - First of n operation specific ignored errors such as no data found
     *      (when batching) before an aborting operation error 
     *      - transaction aborted
     */

     if (execResult != 0)
     {
       /* Transaction aborted, let's case on the transaction error, and only 
        * if necessary for reporting, iterate over the completed operations 
        * (if any) for errors.
        */
       ...
     }
     else
     {
       /* Transaction did not abort, must be some ignored error(s).
        * Let's iterate over the operation(s) to see what happened and 
        * handle/report it....
        */
       ...
     }
  }
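
The branching above can be condensed into a small pure function, shown here as a sketch
only.  ExecOutcome and classifyExecute() are hypothetical names that are not part of
NDBAPI.  The case of a failure return code with no error on the transaction is not
covered by the pseudo-code; the sketch assumes it should be treated as an abort.

```cpp
// Hypothetical classification of one execute() call, derived from its
// return code and the code of the transaction's error.
enum class ExecOutcome { NoError, IgnoredErrors, Aborted };

constexpr ExecOutcome classifyExecute(int execResult, int transErrorCode) {
  return execResult != 0      ? ExecOutcome::Aborted        // execute() reported failure
         : transErrorCode != 0 ? ExecOutcome::IgnoredErrors  // success, but errors were ignored
                               : ExecOutcome::NoError;       // success and nothing recorded
}
```

For IgnoredErrors, the completed operations would then be iterated to find each failed
operation; for Aborted, the transaction error is the first thing to examine.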

/* NdbScanOperation nextResult() which returns -1 (failure) */

  int nextrc= NdbScanOperation.nextResult(...);

  /* Handle nextrc == 0, 1, 2 success cases */

  if (nextrc == -1)
  {
    /* First check scan operation object for error */
    NdbError err= NdbScanOperation.getNdbError();

    if (err.code == 0)
      /* No error found on scan operation object, must be a
       * transaction wide error
       */
      err= NdbTransaction.getNdbError();

    /* Handle error .... */
    ...
  }
[22 Oct 2008 19:23] Frazer Clement
Changing status to 'Not A Bug' as :
 - Original example given does not behave in the way described
 - Separate Bug (40242) adds inter-NDBAPI object accessor methods mentioned in
discussions
 - Comments contain description of how error detection and handling should be done.

Please close the bug if you agree with this status.