Bug #24087 varbinary as distribution key to Ndb.startTransaction gives libndbclient sigsegv
Submitted: 8 Nov 2006 13:04 Modified: 4 Dec 2006 9:38
Reporter: Jim Dowling
Status: Not a Bug
Category: MySQL Cluster: NDB API Severity: S2 (Serious)
Version: 5.1.12 OS: Linux (Linux)
Assigned to: Hartmut Holzgraefe CPU Architecture: Any
Tags: cluster, distribution key, ndb api, startTransaction

[8 Nov 2006 13:04] Jim Dowling
Description:
This is a libndbclient issue.
When we pass a hint to Ndb.startTransaction(const NdbDictionary::Table*, const char*, Uint32) for a table partitioned by a varbinary primary key, we get problems: sometimes a SIGSEGV, and sometimes this message on stdout:
"TransporterFacade::getIsNodeSendable: Illegal node type: 1 of node: 30376"

These errors produce no failure messages in the cluster logs.

When we replace:
Ndb.startTransaction(table_obj, string_key, key_len)
with
Ndb.startTransaction()
the problem goes away.
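
Since the hinted and unhinted calls behave so differently here, one defensive option is to check the hint arguments before using them and fall back to the plain call otherwise. The following is only a minimal sketch of that idea: it does not use the NDB API directly, the report does not say which argument was at fault, and kMaxHintBytes, hintLooksUsable and startWithOptionalHint are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical upper bound chosen for illustration only; NOT an NDB constant.
constexpr std::uint32_t kMaxHintBytes = 4096;

// Returns true when the (table, key, keyLen) triple looks safe to pass on
// as a distribution-key hint: no null pointers, and a plausible length.
inline bool hintLooksUsable(const void* table, const char* key,
                            std::uint32_t keyLen) {
    return table != nullptr && key != nullptr &&
           keyLen > 0 && keyLen <= kMaxHintBytes;
}

// Calls `hinted(table, key, keyLen)` when the hint passes the checks above,
// otherwise calls `plain()`. In a real wrapper, `hinted` would forward to the
// three-argument startTransaction and `plain` to the no-argument one
// (assumption: both return the same type).
template <typename Hinted, typename Plain>
auto startWithOptionalHint(const void* table, const char* key,
                           std::uint32_t keyLen, Hinted hinted, Plain plain)
    -> decltype(plain()) {
    if (hintLooksUsable(table, key, keyLen)) {
        return hinted(table, key, keyLen);
    }
    return plain();
}
```

This cannot detect a dangling table pointer or a key length that does not match the buffer; it only rules out the obviously invalid cases before the call reaches libndbclient.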

The most common error is:

SIGSEGV (0xb) at pc=0x6bada4b0, pid=4408, tid=3086907072

Example stack traces:
Stack: [0xbfe00000,0xc0000000),  sp=0xbfff95c4,  free space=2021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libndbclient.so.0+0x534b0]  _ZN14NdbTransaction4initEv+0x20
C  [libndbclient.so.0+0x48f35]  _ZN3Ndb21startTransactionLocalEjj+0x65
C  [libndbclient.so.0+0x49017]  _ZN3Ndb16startTransactionEPKN13NdbDictionary5TableEPKcj+0x77
C  [libndbj.so+0xca33]  Java_com_nortel_ahp_base_db_mysql_ndbapi_Ndb_startTransactionStr+0x193
j  com.nortel.ahp.base.db.mysql.ndbapi.Ndb.startTransactionStr(JLjava/lang/String;Ljava/lang/String;)J+0

C  [libndbclient.so.0+0x489b0]  _ZN3Ndb26getConnectedNdbTransactionEj+0x10
C  [libndbclient.so.0+0x48c4c]  _ZN3Ndb9doConnectEj+0xdc
C  [libndbclient.so.0+0x48f1c]  _ZN3Ndb21startTransactionLocalEjj+0x4c
C  [libndbclient.so.0+0x49017]  _ZN3Ndb16startTransactionEPKN13NdbDictionary5TableEPKcj+0x77
C  [libndbj.so+0xca33]  Java_com_nortel_ahp_base_db_mysql_ndbapi_Ndb_startTransactionStr+0x193
j  com.nortel.ahp.base.db.mysql.ndbapi.Ndb.startTransactionStr(JLjava/lang/String;Ljava/lang/String;)J+0
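
The native frames above show mangled C++ symbol names. As a reading aid, here is a small sketch that decodes them with abi::__cxa_demangle, assuming a GCC or Clang toolchain (Itanium C++ ABI). The helper name demangleFrame is hypothetical and not part of libndbclient.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Demangles one symbol as it appears in the crash report, for example
// "_ZN14NdbTransaction4initEv+0x20". The trailing "+0x..." offset is
// stripped first. Returns the input unchanged if demangling fails.
std::string demangleFrame(const std::string& frame) {
    const std::string symbol = frame.substr(0, frame.find('+'));
    int status = 0;
    char* out = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        std::free(out);  // free(nullptr) is a no-op
        return frame;
    }
    std::string result(out);
    std::free(out);      // the buffer was allocated with malloc
    return result;
}
```

Applied to the frames above, `_ZN14NdbTransaction4initEv` decodes to `NdbTransaction::init()` and `_ZN3Ndb16startTransactionEPKN13NdbDictionary5TableEPKcj` to `Ndb::startTransaction(NdbDictionary::Table const*, char const*, unsigned int)`, so both traces pass through the hinted startTransaction overload and then Ndb::startTransactionLocal.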

We are running a test cluster with 2 NDBDs and 2 MGMDs:
config.ini:

[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=80M  # Reduced to total 100M per replica
IndexMemory=20M
NoOfFragmentLogFiles=25
TimeBetweenLocalCheckpoints=6
MaxNoOfConcurrentOperations=12500
TransactionInactiveTimeout=30000        # 30 seconds of inactivity = rollback

[NDB_MGMD]
Hostname=localhost
nodeid=62
portnumber=23131
DataDir=/var/lib/mysql-cluster/dbmgmd1

[NDB_MGMD]
Hostname=localhost
nodeid=63
portnumber=23132
DataDir=/var/lib/mysql-cluster/dbmgmd2

[NDBD]
HostName=localhost
datadir=/var/lib/mysql-cluster/dbdata1
nodeid=1

[NDBD]
HostName=localhost
datadir=/var/lib/mysql-cluster/dbdata2
nodeid=2

# Auto-enumerated API node slots,
# Counting down from 61
#
[MYSQLD]
nodeid=61
[MYSQLD]
nodeid=60
[MYSQLD]
nodeid=59
[MYSQLD]
nodeid=58
[MYSQLD]
nodeid=57
[MYSQLD]
nodeid=56
[MYSQLD]
nodeid=55
[MYSQLD]
nodeid=54
[MYSQLD]
nodeid=53
[MYSQLD]
nodeid=52
[MYSQLD]
nodeid=51
[MYSQLD]
nodeid=50
[MYSQLD]
nodeid=49
[MYSQLD]
nodeid=48
[MYSQLD]
nodeid=47
[MYSQLD]
nodeid=46

How to repeat:
I tried to reproduce this in a sample program by passing a string to:
Ndb.startTransaction(table_obj, string_key, key_len)
but it worked fine.
So I don't have a reproduction handy.
[4 Dec 2006 9:38] Jonas Oreland
The referring issue has been closed because the problem was in their application.
I suggest we close this as well.
[5 Dec 2006 15:21] Jim Dowling
When not debugging this error using gdb, we get a SIGSEGV.
I'm attaching a jvm crash report file with some trace information on this.