Bug #102510 ClusterJ Feature Request
Submitted: 6 Feb 2021 18:41 Modified: 3 Jul 1:31
Reporter: Mikael Ronström
Status: Closed
Category:MySQL Cluster: Cluster/J Severity:S4 (Feature request)
Version:8.0.23 OS:Ubuntu
Assigned to: CPU Architecture:Any

[6 Feb 2021 18:41] Mikael Ronström
Description:
    When running applications that create and release large numbers of
    new instances through the session.newInstance and session.release
    interfaces, scalability is hampered dramatically and considerable
    overhead is imposed on the application.
    
    In a simple benchmark that runs one batch of primary key
    lookups at a time, each batch containing 380 lookups, the
    microbenchmark using standard ClusterJ scales only to 3
    threads, at which point it handles a bit more than 1000 batches per second.
    
    Profiling with the BPF tool showed that more than a quarter of the
    CPU time is spent in the call to session.newInstance. ClusterJ uses
    Java reflection to handle dynamic class objects.
    
    A very simple patch of the microbenchmark showed that this overhead
    and scalability hog could quite easily be removed by putting the
    objects into a cache inside the thread instead of calling the
    session.release. This simple change made the microbenchmark scale to
    12 threads and reach 3700 batches per second.
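
    A minimal sketch of the kind of per-thread cache described above.
    This is not the actual benchmark patch: the class name ThreadLocalPool
    and its methods are hypothetical, and the sketch assumes each thread
    works with its own Session, so that a recycled instance is only ever
    reused by the thread that created it. The factory would typically wrap
    session.newInstance.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical per-thread instance pool; not part of ClusterJ. */
final class ThreadLocalPool<T> {
    private final Supplier<T> factory; // e.g. wraps session.newInstance(...)
    private final int maxSize;         // upper bound on pooled objects per thread

    // Each thread sees its own deque, so no locking is needed.
    private final ThreadLocal<ArrayDeque<T>> free =
            ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalPool(Supplier<T> factory, int maxSize) {
        this.factory = factory;
        this.maxSize = maxSize;
    }

    /** Returns a pooled instance if one exists, otherwise creates a new one. */
    T acquire() {
        T cached = free.get().pollFirst();
        return cached != null ? cached : factory.get();
    }

    /**
     * Tries to keep the instance for later reuse by the current thread.
     * Returns false when the pool is full; the caller should then fall
     * back to releasing the object (session.release in the ClusterJ case).
     */
    boolean recycle(T instance) {
        ArrayDeque<T> queue = free.get();
        if (queue.size() >= maxSize) {
            return false;
        }
        queue.addFirst(instance);
        return true;
    }
}
```

    The caller remains responsible for resetting any state on a recycled
    object before reusing it.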
    
    The microbenchmark was executed on an AMD workstation using a cluster
    with 2 data nodes with 2 LDM threads in each and 1 ClusterJ application.
    
    This feature moves this cache of objects into the Session object.
    The session object is by design single-threaded, so no extra mutex
    protections are required as using a Session object from multiple
    threads at the same time is an application bug.
    
    It is required to maintain one cache per Class. A new configuration
    parameter com.mysql.clusterj.max.cached.instances was added where
    one can set the maximum number of cached objects per session object.
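
    As an illustration, the parameter could be set alongside the other
    ClusterJ connection properties. The helper class CacheConfig below is
    hypothetical; only the property name comes from this report. The
    sketch assumes the property is read from the same property set that
    is passed to ClusterJHelper.getSessionFactory.

```java
import java.util.Properties;

/** Hypothetical helper; only the property name is taken from this report. */
final class CacheConfig {
    static final String MAX_CACHED_INSTANCES =
            "com.mysql.clusterj.max.cached.instances";

    /** Returns a copy of the given properties with the cache size set. */
    static Properties withMaxCachedInstances(Properties base, int max) {
        Properties props = new Properties();
        props.putAll(base); // keep connect string, database name, and so on
        props.setProperty(MAX_CACHED_INSTANCES, Integer.toString(max));
        return props;
    }
}
```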
    
    We maintain a global linked list that tracks the age of the objects in
    the cache, independent of their Class. Each put into the cache increases
    the age by 1. If the cache is full and we need to store an object in it,
    we do so only if the oldest object has gone unused for long enough: if
    the age of the oldest object is higher than
    4 * com.mysql.clusterj.max.cached.instances, the oldest object is
    replaced; otherwise we simply release the object instead of caching it.
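
    The policy above can be sketched roughly as follows. This is not the
    ClusterJ implementation: the class AgingInstanceCache and its methods
    are hypothetical, and the sketch assumes that "age" means the number
    of put calls since the object was placed in the cache. Like the
    Session object it is single-threaded and uses no locking.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

/** Hypothetical sketch of a per-session instance cache with aging. */
final class AgingInstanceCache {
    private static final class Entry {
        final Class<?> type;
        final Object instance;
        final long stamp; // value of the put counter when cached

        Entry(Class<?> type, Object instance, long stamp) {
            this.type = type;
            this.instance = instance;
            this.stamp = stamp;
        }
    }

    private final int maxCached;
    private long puts; // global age counter, +1 per put
    private final Map<Class<?>, ArrayDeque<Entry>> perClass = new HashMap<>();
    // Insertion-ordered, so the first element is always the oldest entry.
    private final LinkedHashSet<Entry> byAge = new LinkedHashSet<>();

    AgingInstanceCache(int maxCached) {
        this.maxCached = maxCached;
    }

    /** Returns a cached instance of the given class, or null if none. */
    <T> T take(Class<T> type) {
        ArrayDeque<Entry> queue = perClass.get(type);
        if (queue == null || queue.isEmpty()) {
            return null;
        }
        Entry newest = queue.pollLast();
        byAge.remove(newest);
        return type.cast(newest.instance);
    }

    /**
     * Tries to cache the instance. Returns the object the caller should
     * release: null if nothing, the evicted oldest object if it was
     * replaced, or the instance itself if it was not cached.
     */
    <T> Object put(Class<T> type, T instance) {
        puts++;
        Object toRelease = null;
        if (byAge.size() >= maxCached) {
            if (byAge.isEmpty()) {
                return instance; // caching disabled (maxCached == 0)
            }
            Entry oldest = byAge.iterator().next();
            if (puts - oldest.stamp <= 4L * maxCached) {
                return instance; // oldest is not old enough to replace
            }
            byAge.remove(oldest);
            // The globally oldest entry is also the oldest of its class.
            perClass.get(oldest.type).removeFirst();
            toRelease = oldest.instance;
        }
        Entry entry = new Entry(type, instance, puts);
        perClass.computeIfAbsent(type, k -> new ArrayDeque<>()).addLast(entry);
        byAge.add(entry);
        return toRelease;
    }
}
```

    The class is passed explicitly because ClusterJ instances are dynamic
    objects whose runtime class need not equal the domain interface.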
    
    Benchmark runs make it clear that hot objects must be fully cached
    to get the performance advantage.
    
    The application controls the caching: calling releaseCache(Class<?>)
    caches the object, whereas the release call always releases the object.
    
    In addition the application can call session.dropCacheInstance(Class<?>)
    to drop all cached objects of a certain Class. It can also call
    session.dropCacheInstance() to drop all cached objects.
    
    The application can also cache session objects, if it adds and drops
    them at rapid rates, by calling session.closeCache().
    To clear the cache before placing the session object into
    the cache, use session.closeCache(true).

How to repeat:
See description

Suggested fix:
See description
[7 Feb 2021 5:33] MySQL Verification Team
Hello Mikael,

Thank you for the feedback and reasonable feature request!

Sincerely,
Umesh
[27 Feb 18:18] John Duncan
Posted by developer:
 
Merged into bug#37476251.
[27 Feb 18:19] John Duncan
Merged into bug#117205.
[3 Jul 1:31] Daniel So
Posted by developer:
 
Added the following entry to the MySQL NDB 9.4.0 changelog: 

"Enhancements have been made to improve the scalability of ClusterJ applications using the session.newInstance and session.release interfaces in ClusterJ. A new cache has been introduced in the Session object to store objects, reducing overhead and improving performance. The cache size can be controlled using the com.mysql.clusterj.max.cached.instances configuration parameter, which enables the session cache by default with a maximum size of 100. Additionally, methods have been added to manage the cache, including releaseCache(), dropCacheInstance(), and closeCache(). "