Bug #36565 Statement cancellation timer can cause permgen memory leak in web applications
Submitted: 7 May 2008 14:15
Modified: 8 Jan 2010 11:32
Reporter: Charles Blaxland
Status: Closed
Category: Connector / J
Severity: S5 (Performance)
Version: 5.1.6
OS: Any
Assigned to:
CPU Architecture: Any

[7 May 2008 14:15] Charles Blaxland
Description:
PermGen memory leaks occur in web containers when a webapp is unloaded but some GC roots still refer to classes loaded by that webapp's ClassLoader. This prevents the ClassLoader from being garbage collected, and hence all the class definitions remain in memory. Given enough application reloads this results in an OutOfMemoryError for the PermGen space. This is a well documented "problem". I believe the connector/j statement cancellation timer can cause this sort of memory leak.

The statement cancellation timer is implemented as a static attribute of type java.util.Timer within the ConnectionImpl class. When this class gets loaded the timer is initialized via a static initialization block. Behind the scenes the JVM starts a thread to service timer tasks.

When you unload your webapp, the cancellation timer thread does not terminate. This means that the ClassLoader cannot be collected, resulting in a PermGen memory leak. Moving the connector/j jar to the container's common library directory (common/lib in Tomcat) does not help.

How to repeat:
1. Start a web application that uses connector/j and play around with it so it hits the database.
2. Stop and unload this web application from the container.
3. Connect a JMX console and check the running threads - the "MySQL Statement Cancellation Timer" thread will still be running.
4. Use a profiler or jmap/jhat (as described here: http://blogs.sun.com/fkieviet/entry/how_to_fix_the_dreaded) and make a memory dump. All the class definitions from the web application that was unloaded will still be in memory.
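
As a lighter-weight complement to the JMX check in step 3, the thread can also be looked for from code running in the same JVM. The following is only a sketch: the class and method names are made up for illustration, and it assumes the thread name "MySQL Statement Cancellation Timer" quoted above.

```java
import java.util.Map;

final class ThreadNameCheck {
    /** Counts live threads in this JVM whose name equals the given name. */
    static long countLiveThreads(String name) {
        Map<Thread, StackTraceElement[]> all = Thread.getAllStackTraces();
        return all.keySet().stream()
                .filter(t -> t.isAlive() && t.getName().equals(name))
                .count();
    }

    public static void main(String[] args) {
        // Thread name taken from step 3 of the report.
        long n = countLiveThreads("MySQL Statement Cancellation Timer");
        System.out.println("cancellation timer threads still alive: " + n);
    }
}
```

A non-zero count after the webapp has been unloaded would match what the JMX console shows, but it does not by itself prove the ClassLoader is pinned; the heap dump in step 4 is still needed for that.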

The specific test setup that I used was:
- Tomcat 5.5.26
- MySQL connector/j 5.1.6
- MySQL 5.0.51a
- YourKit profiler to verify the leak

I also created a simple servlet as shown below to verify that java.util.Timer instances that are static do not in fact get automatically terminated when the web application is unloaded. If the timer is changed to be non-static, the application gets cleaned up OK.

package tinker;

import java.io.IOException;
import java.util.Timer;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TestServlet extends javax.servlet.http.HttpServlet implements javax.servlet.Servlet {
    private static Timer timer;

    static {
        timer = new Timer("Test Timer", false);
    }

    public TestServlet() {
        super();
    }

    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    }
}

Suggested fix:
For the time being I have added the following (nasty) code to a cleanup method called (via a ServletContextListener) when the webapp shuts down:

if (ConnectionImpl.class.getClassLoader() == getClass().getClassLoader()) {
    Field f = ConnectionImpl.class.getDeclaredField("cancelTimer");
    f.setAccessible(true);
    Timer timer = (Timer) f.get(null);
    timer.cancel();
}

As for a longer term fix I'm not sure. Ideally the timer would get explicitly cancelled at a point when it is clear that it is no longer required, perhaps in a finalize method somewhere.

At the very least, a public convenience method on ConnectionImpl to cancel the timer would be useful to avoid the messy reflection code above.
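
To make the request concrete, here is a rough sketch of what such a convenience method could look like. SharedTimerHolder, getCancelTimer and shutdownCancelTimer are hypothetical names on a stand-in class; none of this is existing Connector/J API, and unlike the real static initializer this stand-in creates its timer lazily.

```java
import java.util.Timer;

/** Hypothetical stand-in for a driver class that owns one shared static timer. */
final class SharedTimerHolder {
    private static Timer cancelTimer;

    /** Returns the shared daemon timer, creating it on first use. */
    static synchronized Timer getCancelTimer() {
        if (cancelTimer == null) {
            cancelTimer = new Timer("Example Cancellation Timer", true);
        }
        return cancelTimer;
    }

    /** Public cleanup hook: stops the timer thread and drops the reference. Safe to call twice. */
    public static synchronized void shutdownCancelTimer() {
        if (cancelTimer != null) {
            cancelTimer.cancel();
            cancelTimer = null;
        }
    }

    /** True while a shared timer instance exists. */
    static synchronized boolean isTimerActive() {
        return cancelTimer != null;
    }
}
```

With something like this, the ServletContextListener cleanup would shrink to a single call to the shutdown method, with no reflection and no setAccessible.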
[9 May 2008 9:56] Tonci Grgin
Hi Charles and thanks for your report.

Looking into the code and your explanation, I would agree with you. However, this is not *technically* a bug; it's a problem with Sun's VM and how it manages class instances. I've been informed that it will take some time to come up with a clean fix but, as you can see, our c/J team is aware of the problem.
[10 May 2008 4:51] Charles Blaxland
Hi Tonci, thanks for the quick reply.

I'm not so sure that it's correct to write this off as simply a "problem" with your JVM (can you point me to a Sun bug database report describing this "problem"?). It's an annoying "oddity" perhaps, but one that Java developers shouldn't just ignore.

I do realise however the complicated nature of this problem, and I understand that it's probably not easy for you guys to fix. Perhaps in the meantime you could supply a public static getter or cleanup method for the timer so at least app developers have the ability to clean up the timer themselves when they want to?
[10 May 2008 19:51] Mark Matthews
Hi Charles,

It's really a design issue with the Sun VM, in that classes need to be loaded in permgen. Other VMs (IBM, JRockit) don't have this issue.

You can see similar "leaks" when you use Spring, Hibernate, etc. for the same reason.

We'll see what we can do to mitigate this, but it will be a tradeoff, in that it will either require end-users to clear the timers themselves, or we'll end up having per-connection cancellation timers. We'll have to do some benchmarking to see which one has the least impact.

  -Mark
[11 May 2008 3:03] Charles Blaxland
Thanks Mark,

I agree that Sun's JVM *shouldn't* have this problem... but unfortunately it does.

I disagree that you *necessarily* see similar leaks when using Spring, Hibernate and other libraries. Yes, other libraries do sometimes cause permgen leaks, but I've been able to remove all these sorts of leaks from my app now, and a lot of the fixes involved upgrading to newer library versions where the developers had recognised and fixed the problem.

Anyway, I appreciate your response. I'd be happy with a cleanup method that I could call myself.

Charles
[5 Aug 2009 15:35] Jim Kool
This hasn't been touched for such a long time but I'll comment on it anyways.

Setup
Ubuntu (Linux), Java 1.6.0, Tomcat 6.0.18, Connector/J 5.1.7 in tomcat/lib

On undeploy, the Timer's internal thread uses the Web app's classloader as the context classloader rather than the Tomcat's standard classloader. Because of the hard reference to classloader from the timer thread, the classes will remain in perm gen forever. In JRockit, it will still be in memory forever, even if it doesn't use perm gen. This is what I found through jhat.
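
The inheritance described here can be reproduced outside a container. The sketch below is an illustration only: TimerContextLoaderDemo is a made-up name and an empty URLClassLoader stands in for the webapp's classloader. It relies on the fact that a new thread takes its creator's context classloader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class TimerContextLoaderDemo {
    /** Returns the context ClassLoader observed by a task on a freshly created Timer's thread. */
    static ClassLoader loaderSeenByNewTimer() {
        AtomicReference<ClassLoader> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer("Demo Timer", true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                seen.set(Thread.currentThread().getContextClassLoader());
                done.countDown();
            }
        }, 0);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.cancel(); // do not leave the demo thread behind
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        // Stand-in for the webapp's classloader.
        ClassLoader webappLike = new URLClassLoader(new URL[0], original);
        current.setContextClassLoader(webappLike);
        try {
            System.out.println("timer thread sees webapp-like loader: "
                    + (loaderSeenByNewTimer() == webappLike));
        } finally {
            current.setContextClassLoader(original);
        }
    }
}
```

This is why placing the driver jar in tomcat/lib does not help on its own: whichever request thread first triggers the timer's creation hands its own context classloader to the timer thread.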
[18 Dec 2009 22:24] Mark Matthews
Fixed for 5.1.11. Unfortunately no great fix exists that lets us keep the cancellation timer shared amongst connection instances, so instead it's lazily created if need be per-instance, and torn down when the connection is closed.
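
For readers who want a picture of that shape, here is a minimal sketch of a lazily created, per-instance timer that is torn down on close. LazyTimerConnection and its methods are hypothetical and are not the actual 5.1.11 code.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical connection-like class; not Connector/J's actual implementation. */
final class LazyTimerConnection implements AutoCloseable {
    private Timer cancelTimer; // stays null until a timeout is first requested

    /** Schedules a cancellation action, creating this connection's timer on first use. */
    synchronized void scheduleCancel(Runnable cancelAction, long timeoutMillis) {
        if (cancelTimer == null) {
            cancelTimer = new Timer("Example Cancellation Timer", true);
        }
        cancelTimer.schedule(new TimerTask() {
            @Override
            public void run() {
                cancelAction.run();
            }
        }, timeoutMillis);
    }

    /** True once a timer has been created and not yet torn down. */
    synchronized boolean hasTimer() {
        return cancelTimer != null;
    }

    /** Tears the timer down together with the connection. */
    @Override
    public synchronized void close() {
        if (cancelTimer != null) {
            cancelTimer.cancel();
            cancelTimer = null;
        }
    }
}
```

The tradeoff is one timer thread per connection that uses timeouts, in exchange for no static state that outlives the connection.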
[8 Jan 2010 11:32] Tony Bedford
An entry has been added to the 5.1.11 changelog:

A PermGen memory leak was caused by the Connector/J statement cancellation timer (java.util.Timer). When the application was unloaded the cancellation timer did not terminate, preventing the ClassLoader from being garbage collected.
[22 Mar 2010 14:14] Christopher Schultz
Mark,

FWIW, could you have just made it easier for clients to cancel this Timer?

Charles's original hack was to use heavy-handed introspection to cancel the Timer (and may not have worked when a SecurityManager was in use). A static method on, say, the ConnectionImpl object would have been sufficient to allow client code to manually cancel the timer. In a webapp, writing ServletContextListeners to perform such cleanup is routine, so long as the component (Connector/J in this case) has documentation that includes notes for webapp developers.

It sounds like your solution (I haven't inspected the code) will end up creating lots of Timer threads, no?

Thanks,
-chris
[22 Mar 2010 14:23] Mark Matthews
Christopher,

The Timer will only be created if a statement is ever asked to have a timeout applied, and even then, only once per JDBC connection. It will have the same lifespan as the JDBC connection.

The common use case (at least with MySQL) is to not use statement timeouts, and instead rely on network timeouts to un-cleanly cancel running statements. Canceling a running statement from the client when using MySQL is not necessarily guaranteed to happen in a timely manner, because there are only a few points where statement cancellation is checked for in the server.
[30 Apr 2010 16:55] Christopher Schultz
Mark,

I think there's another possibility: instead of using the ContextClassloader for the timer thread, maybe you could use the ClassLoader of the Connection itself.

Of course, it's the Thread class that caches the ContextClassLoader, but maybe you can trick it like this:

Thread thread = Thread.currentThread();
ClassLoader ctxCl = thread.getContextClassLoader();

// Push our ClassLoader
thread.setContextClassLoader(this.getClass().getClassLoader());
Timer timer = new Timer();

// Pop the original ContextClassLoader
thread.setContextClassLoader(ctxCl);

The above code would allow you to share Timers across Connections, etc.
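
A slightly more defensive variant of the same trick restores the original loader in a finally block, so that a failure while creating the Timer cannot leave the calling thread with the wrong context classloader. This is a sketch only; LoaderPinnedTimerFactory and newTimer are made-up names.

```java
import java.util.Timer;

final class LoaderPinnedTimerFactory {
    /** Creates a daemon Timer whose thread inherits the given context ClassLoader. */
    static Timer newTimer(String name, ClassLoader loaderForTimerThread) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        // Push the loader we want the timer thread to inherit.
        current.setContextClassLoader(loaderForTimerThread);
        try {
            // The Timer constructor creates and starts its thread here,
            // so the thread picks up the loader set just above.
            return new Timer(name, true);
        } finally {
            // Pop: always restore the caller's original context ClassLoader.
            current.setContextClassLoader(original);
        }
    }
}
```

Note that this only changes which classloader the timer thread refers to; the thread itself still runs until the Timer is cancelled.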
[30 Apr 2010 16:58] Christopher Schultz
Mark,

One more comment: to my knowledge, we aren't using any Statements that have time limits on them, and the Timer is still being created, so there's no way for us to avoid this leak without writing an ugly hack such as Charles's reflective-nulling technique.

Thanks,
-chris
[30 Apr 2010 18:18] Mark Matthews
Christopher,

I'm not sure how one is getting created if -something- isn't using statement timeouts. The timer is created lazily only when statement timeouts are being used.

The only things I can think of are that you're not using a version of the driver that has the fix (even though you think you are... something in jre/lib/ext?), or that your framework is using statement timeouts for you.

You might want to consider setting "enableQueryTimeouts" to "false" in your JDBC URL, which will cause the driver to ignore query timeouts, and thus the timer shouldn't get created.
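
For illustration, the setting could be supplied either in the URL or as a connection property. The host, port and schema name below are placeholders, and QueryTimeoutConfig is a made-up class name; only the "enableQueryTimeouts" property itself comes from the suggestion above.

```java
import java.util.Properties;

final class QueryTimeoutConfig {
    // Placeholder host, port and schema; substitute your own.
    static final String URL =
            "jdbc:mysql://localhost:3306/exampledb?enableQueryTimeouts=false";

    /** Alternative: pass the same setting as a connection property instead of in the URL. */
    static Properties connectionProperties() {
        Properties props = new Properties();
        props.setProperty("enableQueryTimeouts", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(URL);
        System.out.println(connectionProperties());
    }
}
```

Either form would then be handed to DriverManager.getConnection or to the connection pool's configuration.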
[4 May 2010 13:55] Christopher Schultz
Mark,

I've been trying to use YourKit to figure out where the Timer gets allocated, but it's not registering for some reason... it definitely gets created before I can connect the profiler to the running process. I've even tried allocation profiling that starts at JVM startup time, but no love there, either.

I'm definitely /not/ using a driver with the fix: I'm using Connector/J 5.1.8 and I'm trying to implement a workaround for the time being, and my workarounds aren't working. I've tried Charles's suggestion from this bug and it does not work for several reasons:

1. The ClassLoader used for creating the Timer is that of the container and not of the webapp (and thus the predicate fails)

2. Calling Timer.cancel appears to have no effect

Here is my code (modified from Charles's so that Connector/J is not a compile-time requirement):

            ClassLoader myClassLoader = this.getClass().getClassLoader();
            Class clazz = Class.forName("com.mysql.jdbc.ConnectionImpl",
                                        false,
                                        myClassLoader);

            if(!(clazz.getClassLoader() == myClassLoader))
            {
                log.info("MySQL ConnectionImpl was loaded with another ClassLoader: (" + clazz.getClassLoader() + "): cancelling anyway");
            }
            else
            {
                log.info("MySQL ConnectionImpl was loaded with the WebappClassLoader: cancelling the Timer");
            }

            Field f = clazz.getDeclaredField("cancelTimer");
            f.setAccessible(true);
            Timer timer = (Timer) f.get(null);
            timer.cancel();
            log.info("completed timer cancellation");

Upon webapp reload, I can see all log messages including "completed timer cancellation" yet the Timer object and the TimerTask continue to live on, forcing the WebappClassLoader to remain.

I'd like to get this workaround working if for no other reason than to understand what's going on. It doesn't hurt that we have a rather long period of testing before we can roll-out a new JDBC driver into production.

Finally, this is not urgent for us because we never restart our webapps in production: we always do a full Tomcat restart just to avoid possible PermGen issues.

Thanks,
-chris
[25 Jan 2011 10:55] he nian
Hi, here is my workaround, which has been working fine so far:
Class c = Class.forName("com.mysql.jdbc.Connection");
Field f = c.getDeclaredField("cancelTimer");
f.setAccessible(true);
f.set(null, new Timer() {
    @Override
    public void schedule(TimerTask task, Date firstTime, long period) {
        // Do nothing
    }
    @Override
    public void schedule(TimerTask task, Date time) {
        // Do nothing
    }
    @Override
    public void schedule(TimerTask task, long delay) {
        // Do nothing
    }
    @Override
    public void schedule(TimerTask task, long delay, long period) {
        // Do nothing
    }
    @Override
    public void scheduleAtFixedRate(TimerTask task, Date firstTime, long period) {
        // Do nothing
    }
    @Override
    public void scheduleAtFixedRate(TimerTask task, long delay, long period) {
        // Do nothing
    }
});
[15 Nov 2013 15:05] Ricardo Burgos
Hello, the solution to this driver problem is to update the MySQL driver to version 5.1.27 or higher:

http://dev.mysql.com/downloads/connector/j/
[23 Aug 2017 7:28] Cedric Counotte
How can this be closed!?

I'm using version 5.1.39, have set no timeout on any connections, and keep getting this error!

I am tracking all connections and printing any that could have leaked and don't have any.

The previous workarounds mentioned here do not work, as the cancelTimer field does not exist!?
[25 Sep 2018 13:21] Cedric Counotte
Just wanted to confirm that this bug is still present in the latest version of the connector, that is 5.1.47.

It's very easy to reproduce:

Open a connection, run a query using a statement with setQueryTimeout(30), then close the connection.

The timer thread will keep running indefinitely.

That's a real problem in a service running 24x7, and it will end up consuming all available threads and resources.

To avoid the leak, it's "simple enough": DO NOT USE THE BUGGY setQueryTimeout AT ALL. Keep fingers crossed that you don't have a query that never ends for whatever reason, because of another bug for example.

Here is a sample stack trace:

org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [xxxx] appears to have started a thread named [MySQL Statement Cancellation Timer] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.lang.Object.wait(Native Method)
 java.util.TimerThread.mainLoop(Timer.java:552)
 java.util.TimerThread.run(Timer.java:505)