Bug #74431  Problems with setPartitionKey
Submitted: 17 Oct 2014 14:02  Modified: 10 Sep 2019 10:28
Reporter: salman niazi
Status: Not a Bug
Category: MySQL Cluster: Cluster/J  Severity: S3 (Non-critical)
Version: 7.3.5  OS: Linux
Assigned to: MySQL Verification Team  CPU Architecture: Any
Tags: Clusterj, partitionKey

[17 Oct 2014 14:02] salman niazi
Description:
We are using NDB to store a filesystem tree, and we use ClusterJ to access the database. Each transaction in our system starts by reading the root inode row. We have observed that all transactions start on the same NDB data node, the one that contains the root inode data, and this is becoming a performance bottleneck for us. We have tried using setPartitionKey with a random hint to start the transactions on different NDB data nodes, but it does not seem to work.

Attached is a very simple program that reads and writes the same row in an infinite loop. The partition key is set to a random number. The cluster log below shows that the transactions are not being distributed across the data nodes:

2014-10-17 15:35:20 [MgmtSrvr] INFO     -- Node 4: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:21 [MgmtSrvr] INFO     -- Node 5: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:21 [MgmtSrvr] INFO     -- Node 3: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:21 [MgmtSrvr] INFO     -- Node 1: Trans. Count = 3447, Commit Count = 3447, Read Count = 3447, Simple Read Count = 3447, Write Count = 3446, AttrInfo Count = 10339, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 3447 Local Write Count = 3446
2014-10-17 15:35:21 [MgmtSrvr] INFO     -- Node 6: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:21 [MgmtSrvr] INFO     -- Node 2: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:25 [MgmtSrvr] INFO     -- Node 4: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:26 [MgmtSrvr] INFO     -- Node 5: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:26 [MgmtSrvr] INFO     -- Node 3: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:26 [MgmtSrvr] INFO     -- Node 1: Trans. Count = 3586, Commit Count = 3587, Read Count = 3586, Simple Read Count = 3586, Write Count = 3587, AttrInfo Count = 10760, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 3586 Local Write Count = 3587
2014-10-17 15:35:26 [MgmtSrvr] INFO     -- Node 6: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:26 [MgmtSrvr] INFO     -- Node 2: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:30 [MgmtSrvr] INFO     -- Node 4: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:31 [MgmtSrvr] INFO     -- Node 5: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:31 [MgmtSrvr] INFO     -- Node 3: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:31 [MgmtSrvr] INFO     -- Node 1: Trans. Count = 3530, Commit Count = 3530, Read Count = 3530, Simple Read Count = 3530, Write Count = 3530, AttrInfo Count = 10590, Concurrent Operations = 1, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 3530 Local Write Count = 3530
2014-10-17 15:35:31 [MgmtSrvr] INFO     -- Node 6: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:31 [MgmtSrvr] INFO     -- Node 2: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:35 [MgmtSrvr] INFO     -- Node 4: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:36 [MgmtSrvr] INFO     -- Node 5: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:36 [MgmtSrvr] INFO     -- Node 3: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:36 [MgmtSrvr] INFO     -- Node 1: Trans. Count = 3884, Commit Count = 3884, Read Count = 3884, Simple Read Count = 3884, Write Count = 3884, AttrInfo Count = 11652, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 3884 Local Write Count = 3884
2014-10-17 15:35:36 [MgmtSrvr] INFO     -- Node 6: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0
2014-10-17 15:35:36 [MgmtSrvr] INFO     -- Node 2: Trans. Count = 0, Commit Count = 0, Read Count = 0, Simple Read Count = 0, Write Count = 0, AttrInfo Count = 0, Concurrent Operations = 0, Abort Count = 0 Scans = 0 Range scans = 0, Local Read Count = 0 Local Write Count = 0

Could you please tell us why the transactions are not starting on different NDB data nodes?

/salman

How to repeat:
import com.mysql.clusterj.ClusterJHelper;
import com.mysql.clusterj.Session;
import com.mysql.clusterj.SessionFactory;
import com.mysql.clusterj.annotation.Column;
import com.mysql.clusterj.annotation.PersistenceCapable;
import com.mysql.clusterj.annotation.PrimaryKey;
import java.util.Properties;
import java.util.Random;

/**
 *
 * @author salman
 */
public class TestPartitionKey {

    public static void main(String argv[]) {
        new TestPartitionKey().testPartitionKey();
    }

    public void testPartitionKey() {
        Random rand = new Random(System.currentTimeMillis());
        Properties props = new Properties();
        SessionFactory factory;

        props.setProperty("com.mysql.clusterj.connectstring", "XXX");
        props.setProperty("com.mysql.clusterj.database", "XXX");
        props.setProperty("com.mysql.clusterj.connect.retries", "4");
        props.setProperty("com.mysql.clusterj.connection.pool.size", "1");
        props.setProperty("com.mysql.clusterj.connect.delay", "5");
        props.setProperty("com.mysql.clusterj.connect.verbose", "1");
        props.setProperty("com.mysql.clusterj.connect.timeout.before", "30");
        props.setProperty("com.mysql.clusterj.connect.timeout.after", "20");
        props.setProperty("com.mysql.clusterj.max.transactions", "1024");

        factory = ClusterJHelper.getSessionFactory(props);

        while (true) {
            Session session = factory.getSession();
            
            // Hint for ClusterJ: start the next transaction on the node that
            // owns this (random) partition key.
            session.setPartitionKey(TestDTO.class, new Integer(rand.nextInt()));
            
            session.currentTransaction().begin();

            Integer id = 0; // always the same row; rand.nextInt() deliberately disabled
            TestDTO result = session.find(TestDTO.class, id);
            if (result == null) {
                result = session.newInstance(TestDTO.class);
                result.setId(id);
                result.setCounter(0);
            } else {
                result.setCounter(result.getCounter()+1);
            }
                        
            session.savePersistent(result);
            session.currentTransaction().commit();
            session.close();

            System.out.println("Counter is: " + result.getCounter());
        }
    }

    /*
     CREATE TABLE `test` (
     `id` int(11) NOT NULL,
     `counter` int(11) NOT NULL,
     PRIMARY KEY (`id`)
     ) PARTITION BY KEY (id);
     */
    @PersistenceCapable(table = "test")
    public interface TestDTO {

        @PrimaryKey
        @Column(name = "id")
        int getId();

        void setId(int id);

        @Column(name = "counter")
        int getCounter();

        void setCounter(int counter);
    }
}
[10 Sep 2019 10:28] MySQL Verification Team
Hi,

This is not a bug. You are reading the same data every time, and that data lives on a single data node, so it is expected that it will always be served by the same node. The partition key determines where the data is stored, but reading the same piece of data will always go to the same node.
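
To illustrate (a hypothetical sketch, not part of the original report, reusing the imports, the TestDTO interface, and the SessionFactory setup from the program above; the bound of 1000 is arbitrary): the load only spreads across data nodes when the transactions touch different rows, because PARTITION BY KEY (id) hashes each id to a fixed partition:

Random rand = new Random();
while (true) {
    Session session = factory.getSession();

    // Touch a different row each iteration instead of always id = 0;
    // distinct ids hash to distinct partitions (and hence nodes).
    Integer id = rand.nextInt(1000);

    // Hint with the same key the transaction will actually access.
    session.setPartitionKey(TestDTO.class, id);

    session.currentTransaction().begin();
    TestDTO row = session.find(TestDTO.class, id);
    if (row == null) {
        row = session.newInstance(TestDTO.class);
        row.setId(id);
        row.setCounter(0);
    } else {
        row.setCounter(row.getCounter() + 1);
    }
    session.savePersistent(row);
    session.currentTransaction().commit();
    session.close();
}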

When an API node comes up (ClusterJ is an API node), it connects to ALL data nodes, but it starts each transaction by choosing a transaction coordinator. The transaction coordinator is chosen by data locality (your case), network locality, or load balancing. Since you are always pulling the same data, data locality means you will always get the same node as the transaction coordinator.
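
As a contrast (again a hypothetical fragment following the explanation above, not from the original report):

// In the attached program the hint names a random key, but the read is
// always row id = 0, so the work still lands on the node that owns row 0:
session.setPartitionKey(TestDTO.class, rand.nextInt());
TestDTO sameRow = session.find(TestDTO.class, 0);

// For the hint to be useful it should name the key the transaction is about
// to access, so the coordinator is the node that owns that row:
Integer id = rand.nextInt(1000);
session.setPartitionKey(TestDTO.class, id);
TestDTO hintedRow = session.find(TestDTO.class, id);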

You can discuss a solution to this issue further with our support team.

kind regards