Bug #36565 Statement cancellation timer can cause permgen memory leak in web applications
Submitted: 7 May 2008 16:15 Modified: 8 Jan 12:32
Reporter: Charles Blaxland
Status: Closed
Category: Connector/J Severity: S5 (Performance)
Version: 5.1.6 OS: Any
Assigned to: Target Version:
Triage: D3 (Medium)

[7 May 2008 16:15] Charles Blaxland
Description:
PermGen memory leaks occur in web containers when a webapp is unloaded but some GC
roots still refer to classes loaded by that webapp's ClassLoader. This prevents the
ClassLoader from being garbage collected, and hence all the class definitions remain in
memory. Given enough application reloads this results in an OutOfMemoryError for the
PermGen space. This is a well documented "problem". I believe the connector/j statement
cancellation timer can cause this sort of memory leak.

The statement cancellation timer is implemented as a static attribute of type
java.util.Timer within the ConnectionImpl class. When this class gets loaded the timer is
initialized via a static initialization block. Behind the scenes the JVM starts a thread
to service timer tasks.

When you unload your webapp, the cancellation timer thread does not terminate. This means
that the ClassLoader cannot be collected, resulting in a PermGen memory leak. Moving the
connector/j jar to the container's common library directory (common/lib in Tomcat) does
not help.
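
The thread behaviour described above can be reproduced outside a container with nothing but java.util.Timer. The following is a minimal sketch, not Connector/J code: the class name, the helper methods and the timer name "Demo Cancellation Timer" are made up for illustration. It shows that a Timer's worker thread stays alive while idle and only exits once cancel() is called.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Connector/J.
class TimerThreadLifetimeDemo {

    /** Runs a no-op task on the timer and returns the thread that executed it. */
    static Thread timerThreadOf(Timer timer) {
        final Thread[] seen = new Thread[1];
        final CountDownLatch ran = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                seen[0] = Thread.currentThread();
                ran.countDown();
            }
        }, 0);
        try {
            ran.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    /** Waits up to five seconds for the thread to exit; returns true if it is still alive. */
    static boolean stillAliveAfterWait(Thread t) {
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t.isAlive();
    }

    public static void main(String[] args) {
        // Non-daemon timer, like the test servlet in this report.
        Timer timer = new Timer("Demo Cancellation Timer", false);
        Thread worker = timerThreadOf(timer);
        System.out.println("idle timer thread alive: " + worker.isAlive());

        // Only an explicit cancel() lets the worker thread terminate.
        timer.cancel();
        System.out.println("alive after cancel(): " + stillAliveAfterWait(worker));
    }
}
```

In a web container nothing calls cancel() on a static timer at undeploy time, so the thread in the first half of main() is the state the application is left in.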

How to repeat:
1. Start a web application that uses connector/j and play around with it so it hits the
database.
2. Stop and unload this web application from the container.
3. Connect a JMX console and check the running threads - the "MySQL Statement
Cancellation Timer" thread will still be running.
4. Use a profiler or jmap/jhat (as described here:
http://blogs.sun.com/fkieviet/entry/how_to_fix_the_dreaded) and make a memory dump. All
the class definitions from the web application that was unloaded will still be in
memory.
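
Step 3 above can also be checked programmatically instead of through a JMX console. This is a rough sketch using the standard java.lang.management API; the class and method names are hypothetical, and the code has to run inside the same JVM as the container (for example from another deployed application) to see the leaked thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; not part of Connector/J or Tomcat.
class LiveThreadCheck {

    /** Returns true if a live thread with exactly this name exists in the current JVM. */
    static boolean isThreadRunning(String threadName) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.getThreadInfo(mx.getAllThreadIds());
        for (ThreadInfo info : infos) {
            // Entries are null for threads that died between the two calls.
            if (info != null && info.getThreadName().equals(threadName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Thread name as quoted in this report.
        String name = "MySQL Statement Cancellation Timer";
        System.out.println(name + " running: " + isThreadRunning(name));
    }
}
```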

The specific test setup that I used was:
- Tomcat 5.5.26
- MySQL connector/j 5.1.6
- MySQL 5.0.51a
- YourKit profiler to verify the leak

I also created a simple servlet as shown below to verify that java.util.Timer instances
that are static do not in fact get automatically terminated when the web application is
unloaded. If the timer is changed to be non-static, the application gets cleaned up OK.

package tinker;

import java.io.IOException;
import java.util.Timer;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

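public class TestServlet extends javax.servlet.http.HttpServlet
        implements javax.servlet.Servlet {
    // Static timer: its worker thread is started when the class is initialized
    // and is never cancelled, so it outlives the web application.
    private static Timer timer;

    static {
        // Second argument false = non-daemon thread.
        timer = new Timer("Test Timer", false);
    }

    public TestServlet() {
        super();
    }

    // Intentionally empty; initializing the servlet class is enough to start the timer.
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
    }
}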

Suggested fix:
For the time being I have added the following (nasty) code to a cleanup method called
(via a ServletContextListener) when the webapp shuts down:

if (ConnectionImpl.class.getClassLoader() == getClass().getClassLoader()) {
    Field f = ConnectionImpl.class.getDeclaredField("cancelTimer");
    f.setAccessible(true);
    Timer timer = (Timer) f.get(null);
    timer.cancel();
}

As for a longer term fix I'm not sure. Ideally the timer would get explicitly cancelled
at a point when it is clear that it is no longer required, perhaps in a finalize method
somewhere.

At the very least, a public convenience method on ConnectionImpl to cancel the timer
would be useful to avoid the messy reflection code above.
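
For readers who want to try the reflection workaround without a driver on the classpath, here is a self-contained sketch of the same idea. StandInConnection is a made-up stand-in for a class holding a private static Timer, and TimerCleanup.cancelStaticTimer is a hypothetical helper; neither exists in Connector/J. The classloader comparison mirrors the condition in the snippet above, so a timer owned by a shared (container-level) copy of the class is left alone.

```java
import java.lang.reflect.Field;
import java.util.Timer;

// Hypothetical stand-in for a driver class with a private static timer.
class StandInConnection {
    private static Timer cancelTimer = new Timer("Stand-in Cancellation Timer", false);
}

// Hypothetical cleanup helper, e.g. to be called from a webapp shutdown hook.
class TimerCleanup {

    /**
     * Cancels the static Timer stored in owner.fieldName, but only if the owner
     * class was loaded by appLoader. Returns true if a timer was cancelled.
     */
    static boolean cancelStaticTimer(Class<?> owner, String fieldName, ClassLoader appLoader) {
        if (owner.getClassLoader() != appLoader) {
            return false; // class belongs to another loader; not ours to clean up
        }
        try {
            Field f = owner.getDeclaredField(fieldName);
            f.setAccessible(true);
            Object value = f.get(null); // static field, so no instance is needed
            if (value instanceof Timer) {
                ((Timer) value).cancel();
                return true;
            }
            return false;
        } catch (ReflectiveOperationException e) {
            return false; // field missing or inaccessible
        }
    }

    public static void main(String[] args) {
        boolean cancelled = cancelStaticTimer(
                StandInConnection.class, "cancelTimer", TimerCleanup.class.getClassLoader());
        System.out.println("timer cancelled: " + cancelled);
    }
}
```

A public cleanup method on the owning class, as requested above, would make the reflection and the hard-coded field name unnecessary.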
[9 May 2008 11:56] Tonci Grgin
Hi Charles and thanks for your report.

Looking into the code and your explanation, I would agree with you. But the issue is that
this is not *technically* a bug; it's a problem with Sun's VM and how it manages class
instances. I've been informed that it will take some time to come up with a clean fix but,
as you can see, our c/J team is aware of the problem.
[10 May 2008 6:51] Charles Blaxland
Hi Tonci, thanks for the quick reply.

I'm not so sure that it's correct to write this off as simply a "problem" with your JVM
(can you point me to a Sun bug database report describing this "problem"?). It's an
annoying "oddity" perhaps, but one that Java developers shouldn't just ignore.

I do realise however the complicated nature of this problem, and I understand that it's
probably not easy for you guys to fix. Perhaps in the meantime you could supply a public
static getter or cleanup method for the timer so at least app developers have the ability
to clean up the timer themselves when they want to?
[10 May 2008 21:51] Mark Matthews
Hi Charles,

It's really a design issue with the Sun VM, in that classes need to be loaded in permgen.
Other VMs (IBM, JRockit) don't have this issue.

You can see similar "leaks" when you use Spring, Hibernate, etc. for the same reason.

We'll see what we can do to mitigate this, but it will be a tradeoff, in that it will
either require end-users to clear the timers themselves, or we'll end up having
per-connection cancellation timers. We'll have to do some benchmarking to see which one
has the least impact.

  -Mark
[11 May 2008 5:03] Charles Blaxland
Thanks Mark,

I agree that Sun's JVM *shouldn't* have this problem... but unfortunately it does.

I disagree that you *necessarily* see similar leaks when using Spring, Hibernate and
other libraries. Yes, other libraries do sometimes cause permgen leaks, but I've been
able to remove all these sorts of leaks from my app now, and a lot of the fixes involved
upgrading to newer library versions where the developers had recognised and fixed the
problem.

Anyway, I appreciate your response. I'd be happy with a cleanup method that I could call
myself.

Charles
[5 Aug 2009 17:35] Jim Kool
This hasn't been touched for such a long time but I'll comment on it anyways.

Setup
Ubuntu (Linux), Java 1.6.0, Tomcat 6.0.18, Connector/J 5.1.7 in tomcat/lib

On undeploy, the Timer's internal thread still uses the web app's classloader as its
context classloader rather than Tomcat's standard classloader. Because of the hard
reference to the classloader from the timer thread, the classes will remain in perm gen
forever. In JRockit, they will still be in memory forever, even though it doesn't use
perm gen. This is what I found through jhat.
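
The context classloader observation above follows from standard Thread behaviour: a new thread inherits the context classloader of the thread that creates it, and a Timer creates its worker thread in its constructor. The sketch below illustrates this with plain JDK classes; the class and method names are hypothetical, and an empty URLClassLoader stands in for a web application's classloader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Connector/J or Tomcat.
class ContextLoaderInheritanceDemo {

    /**
     * Creates a Timer while the calling thread has the given context class loader
     * and returns the context class loader observed on the timer's worker thread.
     */
    static ClassLoader loaderSeenByTimerThread(ClassLoader creatorContextLoader) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(creatorContextLoader);
        Timer timer = null;
        try {
            // The worker thread is constructed here and inherits the creator's loader.
            timer = new Timer("Context Loader Demo Timer", false);
            final ClassLoader[] seen = new ClassLoader[1];
            final CountDownLatch ran = new CountDownLatch(1);
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    seen[0] = Thread.currentThread().getContextClassLoader();
                    ran.countDown();
                }
            }, 0);
            ran.await(5, TimeUnit.SECONDS);
            return seen[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            if (timer != null) {
                timer.cancel(); // let the worker thread exit
            }
            current.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        ClassLoader fakeWebappLoader = new URLClassLoader(new URL[0]);
        ClassLoader seen = loaderSeenByTimerThread(fakeWebappLoader);
        System.out.println("timer thread uses creator's loader: " + (seen == fakeWebappLoader));
    }
}
```

This would explain why placing the driver jar in the container's shared library directory does not help: whichever web application first triggers creation of the timer has its classloader pinned by the timer thread.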
[18 Dec 2009 23:24] Mark Matthews
Fixed for 5.1.11. Unfortunately no great fix exists that lets us keep the cancellation
timer shared amongst connection instances, so instead it's lazily created if need be
per-instance, and torn down when the connection is closed.
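
The shape of such a fix can be sketched as follows. This is only an illustration of the "lazily created per instance, torn down on close" pattern described above, not the actual Connector/J 5.1.11 code: DemoConnection, its method names, the timer name and the choice of a daemon thread are all assumptions.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical connection class; not the real ConnectionImpl.
class DemoConnection implements AutoCloseable {
    private Timer cancelTimer; // stays null until a statement timeout is first needed
    private boolean closed;

    /** Returns this connection's timer, creating it on first use. */
    synchronized Timer getCancelTimer() {
        if (closed) {
            throw new IllegalStateException("connection is closed");
        }
        if (cancelTimer == null) {
            // Daemon thread is an assumption made for this sketch.
            cancelTimer = new Timer("Demo Statement Cancellation Timer", true);
        }
        return cancelTimer;
    }

    /** True if a timer (and therefore a timer thread) currently exists. */
    synchronized boolean hasCancelTimer() {
        return cancelTimer != null;
    }

    /** Schedules a cancellation task to run after the given timeout. */
    void scheduleCancel(TimerTask cancelTask, long timeoutMillis) {
        getCancelTimer().schedule(cancelTask, timeoutMillis);
    }

    /** Closing the connection cancels the timer so its thread can exit. */
    @Override
    public synchronized void close() {
        closed = true;
        if (cancelTimer != null) {
            cancelTimer.cancel();
            cancelTimer = null;
        }
    }

    public static void main(String[] args) {
        try (DemoConnection conn = new DemoConnection()) {
            System.out.println("timer before first use: " + conn.hasCancelTimer());
            conn.getCancelTimer();
            System.out.println("timer after first use: " + conn.hasCancelTimer());
        }
    }
}
```

The tradeoff mentioned earlier in this thread is visible here: connections that never use a statement timeout pay nothing, while each connection that does gets its own timer thread for as long as it stays open.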
[8 Jan 12:32] Tony Bedford
An entry has been added to the 5.1.11 changelog:

A PermGen memory leak was caused by the Connector/J statement cancellation timer
(java.util.Timer). When the application was unloaded the cancellation timer did not
terminate, preventing the ClassLoader from being garbage collected.