Bug #19537 NDB API: NdbOperation::write_attr does not update 32-bit fields correctly
Submitted: 4 May 2006 14:50 Modified: 22 May 2006 9:45
Reporter: Anatoly Pidruchny (Candidate Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:5.0.21-max OS:HP/UX (HP-UX 11.11)
Assigned to: Pekka Nousiainen CPU Architecture:Any

[4 May 2006 14:50] Anatoly Pidruchny
Description:
The problem manifests itself only on big-endian platforms. As documented in the Ndb.hpp header file, the Interpreted programs are executed in a register-based virtual machine. The virtual machine has eight 64 bit registers numbered 0-7. The write_attr operation writes a register value into an attribute. If the attribute is a 32-bit integer then the first 32 bit of the 64-bit value are taken from the register. On little-endian platforms this is OK, but on big-endian platforms the 4 most significant bytes are taken and this is wrong.

How to repeat:
Below is a small program that illustrates the problem. At the end, the program prints "LOCK_FLAG=0x1020304". This is wrong. The correct result should be "LOCK_FLAG=0x5060708".

/*
  Create the following table in the test database:

    use test;
    CREATE TABLE call_context (
        CONTEXT_ID INT NOT NULL,
        VERSION    INT DEFAULT 3000000 NOT NULL,
        LOCK_FLAG  INT UNSIGNED NOT NULL,
        LOCK_TIME  INT UNSIGNED NOT NULL,
        LOCK_TIME_USEC INT UNSIGNED NOT NULL,
        CONTEXT_DATA VARBINARY(4000) NOT NULL,
        PRIMARY KEY (CONTEXT_ID)
    ) ENGINE=NDBCLUSTER;
    insert into call_context values (1, 3000000, 11, unix_timestamp(now()), 777, '\0\0\0\0');

*/

#include <NdbApi.hpp>
#include <stdio.h>
#include <iostream.h>
#include <time.h>
#include <vector>

#define PRINT_ERROR(code,msg) \
    cout << "Error in " << __FILE__ << ", line: " << __LINE__ \
         << ", code: " << code \
         << ", msg: " << msg << "." << endl
#define APIERROR(error) { \
    PRINT_ERROR(error.code,error.message); \
    exit(-1); }

static void run_application(Ndb_cluster_connection &cluster_connection);

int main(int argc, char* argv[])
{
    // ndb_init must be called first
    ndb_init();
    printf("NDB library initialized\n");
    // connect to mysql server and cluster and run application
    {
        // Object representing the cluster
        Ndb_cluster_connection cluster_connection("nodeid=9,localhost:1186");

        // Connect to cluster management server (ndb_mgmd)
        if (cluster_connection.connect(4 /* retries               */,
                                       5 /* delay between retries */,
                                       1 /* verbose               */))
        {
            cout << "Cluster management server was not ready within 30 secs.\n";
            exit(1);
        }
        printf("Connected to the management server\n");

        // Optionally connect and wait for the storage nodes (ndbd's)
        if (cluster_connection.wait_until_ready(30, 0) < 0)
        {
            cout << "Cluster was not ready within 30 secs.\n";
            exit(1);
        }
        printf("Cluster is ready\n");
        run_application(cluster_connection);
    }
    ndb_end(0);
    return 0;
}

static void do_readWithLock(Ndb &myNdb);

static void run_application(Ndb_cluster_connection &cluster_connection)
{
    // Connect to database via NdbApi

    // Object representing the database
    Ndb myNdb( &cluster_connection, "test" );
    if (myNdb.init())
    {
        APIERROR(myNdb.getNdbError());
    }
    printf("Connected to the \"test\" database\n");
    do_readWithLock(myNdb);
}

static void do_readWithLock(Ndb &myNdb)
{
    const NdbDictionary::Dictionary* myDict = myNdb.getDictionary();
    const NdbDictionary::Table *myTable = myDict->getTable("call_context");

    if (myTable == NULL) 
        APIERROR(myDict->getNdbError());

    {
        // Indexes
        NdbDictionary::Dictionary::List lst;
        int res = myDict->listIndexes(lst, "call_context");
        if (res < 0)
            APIERROR(myDict->getNdbError());
        cout << "Number of indexes=" << lst.count << endl;
        for (int i = 0; i < lst.count; i++)
        {
            cout << "Index " << i << " name=<" << lst.elements[i].name << ">" << endl;
        }
    }

    vector<char> contextData(myTable->getColumn("CONTEXT_DATA")->getLength() + 2);

    NdbTransaction *myTransaction = myNdb.startTransaction();
    if (myTransaction == NULL) APIERROR(myNdb.getNdbError());

    NdbOperation *myOperation = myTransaction->getNdbOperation(myTable);
    if (myOperation == NULL) APIERROR(myTransaction->getNdbError());

    if (myOperation->interpretedUpdateTuple()) APIERROR(myTransaction->getNdbError());
    if (myOperation->equal("CONTEXT_ID", 1)) APIERROR(myTransaction->getNdbError());

    if (
        myOperation->load_const_u64(1, (Uint64)0x0102030405060708) == -1 || // Reg1 = 0x0102030405060708
        myOperation->write_attr("LOCK_FLAG", 1) == -1 // LOCK_FLAG = Reg1
        ) APIERROR(myOperation->getNdbError());

    NdbRecAttr *lockFlagAttr = myOperation->getValue("LOCK_FLAG", NULL);
    if (lockFlagAttr == NULL) APIERROR(myTransaction->getNdbError());

    if (myTransaction->execute( NdbTransaction::Commit ) == -1)
        APIERROR(myTransaction->getNdbError());

    cout << "LOCK_FLAG=0x" << hex << (unsigned)lockFlagAttr->u_32_value() << dec << endl;

    myNdb.closeTransaction(myTransaction);
}

Suggested fix:
File mysql-5.0.21\ndb\src\kernel\blocks\dbtup\DbtupExecQuery.cpp, from line 1552:

          TdataForUpdate[1] = TregMemBuffer[theRegister + 2];
          TdataForUpdate[2] = TregMemBuffer[theRegister + 3];

The suggested fix is:

          if (TattrNoOfWords == 1) {
            TdataForUpdate[1] = (Uint32)* (Int64*)(TregMemBuffer + theRegister + 2);
          } else {
            TdataForUpdate[1] = TregMemBuffer[theRegister + 2];
            TdataForUpdate[2] = TregMemBuffer[theRegister + 3];
          }

Short explanation of the fix:
The array TregMemBuffer is the memory area for the 64-bit registers 0-7. Four 32-bit words are used for each register. The first word is for the type of the value (0 - NULL, 0x50 - 32-bit value, 0x60 - 64-bit value); the second word is not really used and exists there just as a padding to avoid the possible SIG_BUS signals when working with the 64-bit value that follows; the third and fourth words contain the 64-bit value. Variable theRegister contains the virtual register number multiplied by 4 (i.e. 12 for register 3).

The two lines of code (1552 and 1553) copy the 64-bit value from the register area into the TdataForUpdate array which is just a data structure needed for the updateAttributes function. As you can see, two 32-bit words are always copied into the TdataForUpdate array, even if the attribute is a 32-bit integer. But when the attribute is a single 32-bit integer word, then only one word of the value is used and the second is ignored. On little-endian platforms it happens that the correct word is used, but on big-endian the wrong word is used. The suggested fix is to check if the attribute to be updated is a 32-bit value, and if so, then cast the 64-bit value of the register into the 32-bit value and put this single value into the TdataForUpdate array.

The fix was tested in our HP-UX 11.11 environment and worked fine.
[15 May 2006 16:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6389
[22 May 2006 9:45] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 4.1.20/5.0.22/5.1.11 changelogs; closed.