Bug #20296 NDB API: interpretedUpdateTuple with interpret_exit_nok causes data node crash
Submitted: 6 Jun 2006 15:42
Modified: 6 Sep 2006 12:25
Reporter: Anatoly Pidruchny (Candidate Quality Contributor)
Status: Closed
Category: MySQL Cluster: NDB API
Severity: S3 (Non-critical)
Version: 5.0.21-max
OS: Any (all)
Assigned to: Jonas Oreland
CPU Architecture: Any

[6 Jun 2006 15:42] Anatoly Pidruchny
Description:
The test program calls NdbOperation::interpretedUpdateTuple and then uses the interpret_exit_nok instruction. It looks like when the ndbd node executes the interpret_exit_nok instruction, it corrupts the memory for the record in some way. The next operation on the same record crashes the node.

The interpret_exit_nok instruction is needed to fail the operation from within an interpreted program. I was using interpretedUpdateTuple to check the record before the update (and fail the operation if a certain condition is not satisfied) in a single database operation.
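
To make the intended semantics concrete, here is a minimal toy model of "check the record, then update or fail, in one operation". It does not use the NDB API at all: Row, ExitCode and checkedUpdate are hypothetical names invented for this sketch, and the std::map only stands in for a table.

```cpp
#include <cstdint>
#include <map>

// Toy stand-in for a stored row; not an NDB type.
struct Row {
    std::uint32_t attr1;
    std::uint32_t attr2;
};

// Outcome of the check, loosely mirroring interpret_exit_ok / interpret_exit_nok.
enum class ExitCode { Ok, Nok };

// Hypothetical single-step "check, then update" operation.
// The update is applied only when the check passes; otherwise the
// operation fails and the row is left untouched.
ExitCode checkedUpdate(std::map<std::uint32_t, Row>& table,
                       std::uint32_t key,
                       std::uint32_t expectedAttr2,
                       std::uint32_t newAttr2) {
    auto it = table.find(key);
    if (it == table.end()) {
        return ExitCode::Nok;  // no such row
    }
    if (it->second.attr2 != expectedAttr2) {
        return ExitCode::Nok;  // condition not satisfied: fail, change nothing
    }
    it->second.attr2 = newAttr2;
    return ExitCode::Ok;       // condition satisfied: update applied
}
```

The point of the sketch is only the contract: a failed check must leave the record exactly as it was, so that later operations on the same record behave normally.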

How to repeat:
Please see the attached program interpret_exit_nok_crash.cpp. The program was tested on two platforms:
1. Linux ct01sr01 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:29:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux.
2. HP-UX sm27cp01 B.11.11 U 9000/800 105901527 unlimited-user license

In both cases the ndbd nodes crash. I am also attaching the config.ini and all the trace and log files from the test Cluster on the Linux platform, as well as the info generated by mysqlbug.
[6 Jun 2006 15:43] Anatoly Pidruchny
The test program that causes ndbd node crashes.

Attachment: interpret_exit_nok_crash.cpp (application/octet-stream, text), 5.29 KiB.

[6 Jun 2006 15:44] Anatoly Pidruchny
Config file for the cluster.

Attachment: config.ini (application/octet-stream, text), 555 bytes.

[6 Jun 2006 15:44] Anatoly Pidruchny
mysqlbug info

Attachment: mysqlbug_out (application/octet-stream, text), 3.47 KiB.

[6 Jun 2006 15:45] Anatoly Pidruchny
PID file of the mgmd node.

Attachment: ndb_1.pid (application/octet-stream, text), 5 bytes.

[6 Jun 2006 15:45] Anatoly Pidruchny
Cluster log file

Attachment: ndb_1_cluster.log (application/octet-stream, text), 3.20 KiB.

[6 Jun 2006 15:46] Anatoly Pidruchny
Out file of the mgmd node

Attachment: ndb_1_out.log (application/octet-stream, text), 72 bytes.

[6 Jun 2006 15:46] Anatoly Pidruchny
PID file of the ndbd node.

Attachment: ndb_2.pid (application/octet-stream, text), 5 bytes.

[6 Jun 2006 15:46] Anatoly Pidruchny
Error log of the ndbd node

Attachment: ndb_2_error.log (application/octet-stream, text), 568 bytes.

[6 Jun 2006 15:46] Anatoly Pidruchny
Out file of the ndbd node.

Attachment: ndb_2_out.log (application/octet-stream, text), 784 bytes.

[6 Jun 2006 15:48] Anatoly Pidruchny
Zipped trace file of the ndbd node

Attachment: ndb_2_trace.log.1.gz (application/x-gzip, text), 49.80 KiB.

[26 Jul 2006 12:29] Hartmut Holzgraefe
test project

Attachment: bug20296-1.0.tar.bz2 (application/x-tar, text), 192.84 KiB.

[26 Jul 2006 12:30] Hartmut Holzgraefe
attached test project fails with:

./bug20296 
Error in bug20296.cpp, line: 156, code: 4010, msg: Node failure caused abort of transaction.
[26 Jul 2006 12:33] Hartmut Holzgraefe
logs from the test project's failed run

Attachment: bug20296-logs.tar.gz (application/x-gzip, text), 94.12 KiB.

[3 Aug 2006 14:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10011

ChangeSet@1.2235, 2006-08-03 16:25:47+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#20296
     Make sure that tupkeyErrorLab is run if interpretedUpdate(fail), so that entry is not inserted into index.
        Yielding crash on following dml on tuple
[3 Aug 2006 14:31] Jonas Oreland
Hi

Thx for yet another excellent bug report.

I committed a patch that I think will solve the problem.
FYI: the problem was that an extra entry was added to the ordered index
  even if the update was aborted.
  The following dml on the record then crashed because the index was inconsistent.
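
  The failure mode described here can be illustrated with a small toy model. This is not NDB kernel code: the Table class, its members and the `accept` flag are all invented for this sketch, and it only assumes that the new index entry is added before the outcome of the update is known.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <utility>

// Toy model, not NDB code: a row store plus one ordered index on (value, key).
class Table {
public:
    void insert(std::uint32_t key, std::uint32_t value) {
        rows_[key] = value;
        index_.emplace(value, key);
    }

    // Tries to set the row to newValue. `accept` plays the role of the
    // interpreted program's exit code (false is like interpret_exit_nok).
    bool update(std::uint32_t key, std::uint32_t newValue, bool accept) {
        auto row = rows_.find(key);
        if (row == rows_.end()) return false;
        // In this model the new index entry is added before the outcome is known.
        auto added = index_.emplace(newValue, key);
        if (!accept) {
            // Error path: remove the entry again. Skipping this step leaves
            // an extra index entry behind after an aborted update.
            if (added.second) index_.erase(added.first);
            return false;
        }
        if (row->second != newValue) {
            index_.erase(std::make_pair(row->second, key));  // drop the old entry
        }
        row->second = newValue;
        return true;
    }

    // The index is consistent when it holds exactly one entry per row.
    bool consistent() const {
        if (index_.size() != rows_.size()) return false;
        for (const auto& kv : rows_) {
            if (index_.count(std::make_pair(kv.second, kv.first)) == 0) return false;
        }
        return true;
    }

private:
    std::map<std::uint32_t, std::uint32_t> rows_;
    std::set<std::pair<std::uint32_t, std::uint32_t>> index_;
};
```

  If the cleanup in the error path is left out, consistent() returns false after the first aborted update, which is the kind of state a later operation on the same record would trip over.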

  So the program should probably have worked if you created the table
  like this instead 
  CREATE TABLE mytablename (
      ATTR1 INT UNSIGNED NOT NULL, 
      ATTR2 INT UNSIGNED NOT NULL, 
      primary key using hash(attr1)
  );

  Note I have not tested this yet (but will tomorrow)
    (and excuse me if syntax for "primary key using hash" 
     is incorrect, i'm no sql expert)

  I'll also run your test program against 5.1 tomorrow...as DbtupExecQuery has
    changed _a lot_ in 5.1 relative 5.0

/Jonas
[3 Aug 2006 17:58] Anatoly Pidruchny
Hi, Jonas,

thanks for fixing this. I am not sure why the ordered index is changed at all during the update operation, since the index is on the primary key column and the primary key does not change during an update. But it is OK.

I am moving my development and testing to 5.1. I tried to compile and run this test program against 5.1.11 and it looks OK. The data nodes did not crash.

When I compile programs against 5.1.11, I am getting the following warning:

Warning 552: "/usr/local/mysql/include/ndb/ndbapi/NdbDictionary.hpp", line 790 # Ambiguous overloaded function declaration; default arguments make this function indistinguishable from
    previous one. Clashing function "const char *NdbDictionary::Table::getTablespace() const" was previously declared at ["/usr/local/mysql/include/ndb/ndbapi/NdbDictionary.hpp", line
    789].
        bool getTablespace(Uint32 *id= 0, Uint32 *version= 0) const;
             ^^^^^^^^^^^^^                                          

It is just a warning, nothing else. I just want to bring it to your attention. Please fix it if possible and if it is not fixed already. There were no such warnings when compiling against 5.0.x. It looks like the function const char * NdbDictionary::Table::getTablespace() const is declared, but not defined anywhere. So the declaration can be safely removed from the NdbDictionary.hpp file.
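
A minimal sketch of why the compiler complains, using a simplified stand-in for the class. Only the two getTablespace declarations are taken from the warning text; the Uint32 typedef, the function body, the returned values and the lookupTablespaceId helper are assumptions made for illustration.

```cpp
#include <cstdint>

typedef std::uint32_t Uint32;  // stand-in for the NDB typedef

// Simplified stand-in for NdbDictionary::Table.
class Table {
public:
    // Declared but never defined, as described above.
    const char* getTablespace() const;

    // With both arguments defaulted, a call with no arguments matches this
    // overload just as well as the one above.
    bool getTablespace(Uint32* id = 0, Uint32* version = 0) const {
        if (id) *id = 7;            // illustrative values only
        if (version) *version = 1;
        return true;
    }
};

// Passing at least one argument selects the second overload unambiguously.
inline bool lookupTablespaceId(const Table& t, Uint32& id) {
    return t.getTablespace(&id);
}

// By contrast, a call written as t.getTablespace() would be rejected as
// ambiguous, because default arguments are not considered when ranking
// the two candidates.
```

Declaring both overloads is legal C++, which is why the message is only a warning; the problem would surface as a hard error at the first call that passes no arguments.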

Thanks again,

Anatoly.
[4 Aug 2006 6:42] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10032

ChangeSet@1.2538, 2006-08-04 08:41:32+02:00, jonas@perch.ndb.mysql.com +1 -0
  ndb - bug#20296 (recommit in 4.1)
     Make sure that tupkeyErrorLab is run if interpretedUpdate(fail), so that entry is not inserted into index.
       Yielding crash on following dml on tuple
[1 Sep 2006 7:57] Jonas Oreland
pushed to 5.1.12
[1 Sep 2006 19:26] Jonas Oreland
pushed to 5.0.25
[6 Sep 2006 7:11] Jonas Oreland
pushed into 4.1.22
[6 Sep 2006 12:25] Jon Stephens
Note: Interpreted programs are not part of the public API - they're for internal use only.