Description:
When running applications that create and release large numbers of
new instances through the session.newInstance and session.release
interface, scalability is hampered dramatically and considerable
overhead is added to the application.
In a simple benchmark that runs one batch of primary key
lookups at a time, each batch containing 380 lookups, the
microbenchmark using standard ClusterJ scales only to 3 threads,
at which point it handles a bit more than 1000 batches per second.
Profiling with the BPF tool showed that more than a quarter of the
CPU time is spent in the call to session.newInstance. ClusterJ uses
Java reflection to handle dynamic class objects.
A very simple patch of the microbenchmark showed that this overhead
and scalability bottleneck could quite easily be removed by putting
the objects into a cache inside the thread instead of calling
session.release. This simple change made the microbenchmark scale to
12 threads and reach 3700 batches per second.
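The microbenchmark patch itself is not included in this report. A minimal
sketch of the idea, a per-thread pool that hands back a previously used
object instead of creating a new one, could look as follows. The class
ThreadLocalInstanceCache, its method names and the Supplier standing in
for session.newInstance are all hypothetical and not part of ClusterJ.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical per-thread object pool; illustrates the benchmark patch, not ClusterJ code. */
final class ThreadLocalInstanceCache<T> {
    // Stands in for session.newInstance(cls) in a real application (assumption).
    private final Supplier<T> factory;
    private final int maxPerThread;
    // Each thread gets its own deque, so no locking is needed.
    private final ThreadLocal<ArrayDeque<T>> pool =
            ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalInstanceCache(Supplier<T> factory, int maxPerThread) {
        this.factory = factory;
        this.maxPerThread = maxPerThread;
    }

    /** Reuse a pooled instance if this thread has one, otherwise create a new one. */
    T acquire() {
        T cached = pool.get().pollFirst();
        return cached != null ? cached : factory.get();
    }

    /**
     * Put the instance back into this thread's pool.
     * Returns false when the pool is full; the caller should then
     * release the instance the normal way (session.release).
     */
    boolean recycle(T instance) {
        ArrayDeque<T> q = pool.get();
        if (q.size() >= maxPerThread) {
            return false;
        }
        q.addFirst(instance);
        return true;
    }
}
```

In the benchmark loop, acquire() would replace the call to
session.newInstance and recycle() would replace the call to
session.release.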
The microbenchmark was executed on an AMD workstation using a cluster
with 2 data nodes with 2 LDM threads in each and 1 ClusterJ application.
This feature moves this cache of objects into the Session object.
The Session object is by design single-threaded, so no extra mutex
protection is required; using a Session object from multiple
threads at the same time is an application bug.
One cache must be maintained per Class. A new configuration
parameter, com.mysql.clusterj.max.cached.instances, was added to
set the maximum number of cached objects per session object.
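As an illustration, the parameter could be set together with the other
ClusterJ properties before the session factory is created. The property
name is taken from the description above; the value 1024 and the class
name CacheConfigExample are arbitrary choices for this sketch.

```java
import java.util.Properties;

/** Hypothetical helper that builds the properties for a ClusterJ session factory. */
final class CacheConfigExample {
    static Properties clusterjProperties() {
        Properties props = new Properties();
        // Maximum number of cached objects per session object.
        // 1024 is an arbitrary example value, not a recommendation.
        props.setProperty("com.mysql.clusterj.max.cached.instances", "1024");
        // The remaining ClusterJ properties (connect string, database, ...)
        // would be added here before passing props to
        // ClusterJHelper.getSessionFactory(props).
        return props;
    }
}
```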
A global linked list tracks the age of the objects in the cache,
independent of their Class. Each put into the cache increases the
age by 1. If the cache is full when an object is to be stored, the
object is cached only if the oldest cached object is old enough: if
the age of the oldest object is higher than
4 * com.mysql.clusterj.max.cached.instances, the oldest object is
replaced by the new one; otherwise the new object is simply released.
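The replacement rule can be sketched as below. This is not the code of
the patch: the class SessionInstanceCache and its methods are
hypothetical, age is assumed to mean the number of puts since the object
was stored, and handing out the most recently cached object of a Class
first is a choice made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

/** Hypothetical per-session instance cache with age-based replacement. */
final class SessionInstanceCache {
    /** A cached object plus the value of the put counter when it was stored. */
    private static final class Entry {
        final Class<?> cls;
        final Object instance;
        final long stamp;
        Entry(Class<?> cls, Object instance, long stamp) {
            this.cls = cls;
            this.instance = instance;
            this.stamp = stamp;
        }
    }

    private final int maxCached;   // com.mysql.clusterj.max.cached.instances
    private long putCounter = 0;   // every put ages all cached objects by 1
    private final Map<Class<?>, ArrayDeque<Entry>> perClass = new HashMap<>();
    private final LinkedList<Entry> ageList = new LinkedList<>(); // oldest first

    SessionInstanceCache(int maxCached) {
        this.maxCached = maxCached;
    }

    /** Returns true if the object was cached, false if the caller should release it. */
    boolean put(Class<?> cls, Object instance) {
        putCounter++;
        if (ageList.size() >= maxCached) {
            Entry oldest = ageList.peekFirst();
            // Replace only if the oldest object is older than 4 * max.
            if (oldest == null || putCounter - oldest.stamp <= 4L * maxCached) {
                return false;
            }
            ageList.removeFirst();
            perClass.get(oldest.cls).remove(oldest);
        }
        Entry e = new Entry(cls, instance, putCounter);
        perClass.computeIfAbsent(cls, k -> new ArrayDeque<>()).addLast(e);
        ageList.addLast(e);
        return true;
    }

    /** Returns a cached object of the given Class, or null if none is cached. */
    Object get(Class<?> cls) {
        ArrayDeque<Entry> q = perClass.get(cls);
        Entry e = (q == null) ? null : q.pollLast();
        if (e == null) {
            return null;
        }
        ageList.remove(e); // linear scan; acceptable for a sketch
        return e.instance;
    }
}
```

Because the Session object is single-threaded, the sketch uses plain
collections without any synchronization.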
Benchmark runs make it clear that hot objects must be fully cached
to get the performance advantage.
The application controls caching explicitly: it calls
session.releaseCache(Class<?>) to place an object in the cache,
while session.release always releases the object.
In addition the application can call session.dropCacheInstance(Class<?>)
to drop all cached objects of a certain Class. It can also call
session.dropCacheInstance() to drop all cached objects.
If the application adds and drops session objects at a rapid rate,
it can also cache the session objects themselves by calling
session.closeCache(). To clear the cache before the session object
is placed into the cache, use session.closeCache(true).
How to repeat:
See description
Suggested fix:
See description