Bug #18300 | Memory leak of st_lex (and maybe sp_pcontext?) structures? | ||
---|---|---|---|
Submitted: | 17 Mar 2006 10:45 | Modified: | 3 Aug 2009 17:35 |
Reporter: | Jeremy Cole (Basic Quality Contributor) (OCA) | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Stored Routines | Severity: | S1 (Critical) |
Version: | 5.0.19, 4.1.18 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | bfsm_2007_02_15, memory leak, qc |
[17 Mar 2006 10:45]
Jeremy Cole
[17 Mar 2006 17:49]
MySQL Verification Team
Hi Jeremy, Thank you for the bug report. Could you please provide a reproducible test case for to try it on our side? Thanks in advance.
[18 Mar 2006 2:19]
Jeremy Cole
If I had a reproducible test case I would have submitted it. :) This is only happening on a production server, and attempts to reproduce it on a test machine have failed.
[18 Mar 2006 8:26]
Per-Erik Martin
Can you tell us something about which features these systems are using? Stored procedures or functions, or both? ...with cursors and/or exception handlers? Triggers? Views? Which storage engine(s)?
[18 Mar 2006 8:33]
Jeremy Cole
Hi PEM! It's 100% InnoDB, using triggers, stored procedures, and prepared statements (both via PREPARE/EXECUTE commands with SPs and via JDBC). Highly variable number of concurrent users and load: 25-500 concurrent sessions, and 200-1500 q/s total. It leaks much faster during peak (reaching 2.5G address space limit in ~35 minutes). Nothing special otherwise. It seems to be leaking only from SELECTs, as slaves show no leaks (although they don't run very long -- more replication bugs).
[18 Mar 2006 8:39]
Jeremy Cole
You can find the error log containing the complete debug dumps, as well as a log of out of memory errors etc., here: http://jcole.us/files/bug18300/gndb1.gaggle.net.err.20060318.gz A debug trace from about 1 minute of operation here: http://jcole.us/files/bug18300/mysqld.trace.gz
[18 Mar 2006 8:42]
Jeremy Cole
I didn't state, but maybe it's worth stating: No cursors, no views, no stored functions.
[24 Mar 2006 15:51]
Per-Erik Martin
I assume you are using local variables in your triggers? Can you say anything about the number of rows you have in tables with triggers? (I have a feeling it might be BUG#17260)
[28 Mar 2006 23:56]
Jeremy Cole
An interesting turn of events... Since the memory leak was so painful, it was decided to roll back to 4.1 instead, since that seemed a sure thing. Unfortunately, the memory leak is still occurring, and 4.1 is now crashing regularly with "out of memory" at about the same rate as 5.0 was. That certainly limits the features that could be materially involved in the leak... really the only thing left in their code is prepared statements (via JDBC, and via PREPARE/EXECUTE SQL statements). Here's a crazy thought: Is it possible that the OS/threading library is leaking the memory somehow? It's RHEL3, running the RHEL3 RPMs from MySQL.com.
[29 Mar 2006 9:50]
Konstantin Osipov
To check that you're freeing all the allocated prepared statements, you could do the following: show status like '%stmt%'; Check that the difference between Com_stmt_prepare and Com_stmt_close does not increase while your application runs. This difference would give you an estimation of how many statements there are allocated in the server.
[29 Mar 2006 9:56]
Per-Erik Martin
I think it's very likely that at least part of the problem is BUG#17260 (for which a fix exists). It would be interesting to know if this solves your problem. Do you want to wait for a release with the fix, or try out a separate patch?
[29 Mar 2006 12:02]
Jeremy Cole
Konst: OK, now we're getting somewhere. Is it true that for every allocated statement, Com_stmt_close should always be incremented? Does it get incremented even if the statement is not explicitly deallocated (i.e. you depend on the cleanup at connection close to free it)? The difference between Com_stmt_prepare and Com_stmt_close is definitely growing uncontrolled on this system: $ while [ 1 ]; do echo -n `date` ": "; echo `mysql -N --batch -u root -e 'show status like "Com_stmt_prepare"' | cut -f2` - `mysql -N --batch -u root -e 'show status like "Com_stmt_close"' | cut -f2` | bc; sleep 10; done Wed Mar 29 03:55:35 PST 2006 : 789706 Wed Mar 29 03:55:45 PST 2006 : 789711 Wed Mar 29 03:55:56 PST 2006 : 789737 Wed Mar 29 03:56:06 PST 2006 : 789746 Wed Mar 29 03:56:16 PST 2006 : 789748 Wed Mar 29 03:56:26 PST 2006 : 789755 Wed Mar 29 03:56:36 PST 2006 : 789763 Wed Mar 29 03:56:46 PST 2006 : 789770 Wed Mar 29 03:56:56 PST 2006 : 789768 Wed Mar 29 03:57:06 PST 2006 : 789772 Wed Mar 29 03:57:16 PST 2006 : 789776 Wed Mar 29 03:57:26 PST 2006 : 789779 Wed Mar 29 03:57:36 PST 2006 : 789782 Wed Mar 29 03:57:46 PST 2006 : 789787 Wed Mar 29 03:57:56 PST 2006 : 789789 Wed Mar 29 03:58:06 PST 2006 : 789794 Wed Mar 29 03:58:16 PST 2006 : 789798 Wed Mar 29 03:58:26 PST 2006 : 789799 Wed Mar 29 03:58:36 PST 2006 : 789803 Wed Mar 29 03:58:46 PST 2006 : 789808 Wed Mar 29 03:58:56 PST 2006 : 789810 Wed Mar 29 03:59:06 PST 2006 : 789819 Wed Mar 29 03:59:17 PST 2006 : 789819 Wed Mar 29 03:59:27 PST 2006 : 789827 Wed Mar 29 03:59:37 PST 2006 : 789826 Wed Mar 29 03:59:47 PST 2006 : 789828 Wed Mar 29 03:59:57 PST 2006 : 789828 Wed Mar 29 04:00:07 PST 2006 : 789833 Wed Mar 29 04:00:17 PST 2006 : 789867 Wed Mar 29 04:00:27 PST 2006 : 789884 Wed Mar 29 04:00:37 PST 2006 : 789881 Wed Mar 29 04:00:47 PST 2006 : 789889 Wed Mar 29 04:00:57 PST 2006 : 789880 Wed Mar 29 04:01:07 PST 2006 : 789883 Wed Mar 29 04:01:17 PST 2006 : 789897 Wed Mar 29 04:01:27 PST 2006 : 789900 Wed Mar 29 04:01:37 PST 2006 : 789889
[29 Mar 2006 12:04]
Jeremy Cole
Per-Erik: How could that bug solve the problem, when they are running 4.1 now? No triggers anymore...
[29 Mar 2006 12:21]
Per-Erik Martin
Well, you initially indicated that is was sp structures leaking, and that you were using triggers. Both exists in 5.0 only, and if you for instance use local variables in the triggers it will definitely leak memory. But this does not rule out other leaks as well (in 4.1), of course. :)
[29 Mar 2006 12:24]
Konstantin Osipov
COM_STMT_PREPARE is incremented for every attempt to prepare a statement. So if you have statements that have syntax error or failed to prepare for some other reasons, COM_STMT_PREPARE will differ from the accurate number of prepared statements. Otherwise it gives you the accurate number. COM_STMT_CLOSE is incremented for every deallocate of a prepared statement. Basically, it's called in the destructor. So it doesn't matter who issued the deallocate call: the server itslef (at connection close) or the client (by calling mysql_stmt_close or DEALLOCATE PREPARE). Normally COM_STMT_PREPARE - COM_STMT_CLOSE would give you the more or less accurate number of prepared statements in the server.
[29 Mar 2006 12:25]
Konstantin Osipov
You could try to apply my patch for 4.1 (Bug#16365) and set the hard limit for max_prepared_stmt_count. Then you'll be able to easily see if your application leaks statements, as at some point in time mysql_stmt_prepare will fail.
[30 Mar 2006 23:08]
Jeremy Cole
OK, we have some exciting news. Besides the previous comment that confirmed that it was server-side prepared statements leaking, we have further confirmation. The programmer found a setting in Connector/J, useServerPrepStmts=false, which forces the driver to always use client-side prepared statements. That has solved the memory leak, and the server is now steady using 1.1G of memory. Someone needs to go over the server-side prepared statement code with a fine-toothed comb...
[31 Mar 2006 15:39]
Mark Matthews
Jeremy, while "useServerPrepStmts=false" is a workaround that seems to alleviate the problem for you, I would suggest going back and checking _all_ of your code paths to make sure you're actually closing statements in all cases. This can be an issue even with client-side emulated prepared statements, as it can appear to be a memory leak in your Java application, even though it's as-designed, as it's required by the JDBC specification (more on this in a bit). We've seen this behavior before with other users, and it turned out they missed a PreparedStatement.close() somewhere (or everywhere!). The reason for this behavior is that the JDBC specification states that users are responsible for explicitly closing statements if they want them "freed" before the physical connection is closed. Because of this, if you're using a pool with long-lived connections, these statements will not be freed for quite some time. There are some tools to help track down these types of issues. Our JDBC driver has the "usage advisor" (enabled by adding useUsageAdvisor=true"), which will log when statements are leaked (amongst other things), but if your application does a lot of things that it would complain about (statement leaks, prepared statements being prepared but executed 1 or less times, not looking at all rows or columns of a result set, etc), it might be too much logging for something in production. If you can't run a tool in production, then a tool I use myself and highly recommend for static code analysis is FindBugs, which can catch cases where not all code paths close JDBC resources. http://findbugs.sourceforge.net/ Notice, I'm not saying that your application is definitely not closing statements, but empirical evidence with our userbase says that this is the most common cause, especially when the counts for PREPARE and CLOSE are way out of whack. In any case, it's something you probably want to look into, as not closing these statements even when client-side emulated can lead to GC issues, as these prepared statement instances turn into tenured instances, and end up in a heap that is specially designed for them, but it much slower to collect. This can limit throughput and stability of your Java application itself.
[31 Mar 2006 16:26]
Jeremy Cole
Hi Mark, I had thought of this already, so some changes were made to try to ensure that things got cleaned up. The connection pool was changed so that connections were closed/reopened at least every 60 seconds, but it didn't solve the problem. Shouldn't closing a connection *always* free up all of these resources?
[31 Mar 2006 16:29]
Mark Matthews
Jeremy, Closing the physical connection, should in theory take care of all of that. What connection pool are you using? Is it something like Apache DBCP, or C3p0, or something home-grown?
[8 Jun 2006 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[16 Feb 2007 0:34]
James Day
Jeremy, a number of candidate causes have been fixed and we'd like verification that you still see this problem.
[26 Feb 2007 18:37]
Valeriy Kravchuk
Jeremy, Will it be possible for you to check your application with server-side prepared statements on latest (>5.0.24 and >=4.1.22) versions of MySQL server? I want to check if this is not a duplicate of http://bugs.mysql.com/bug.php?id=21206.
[26 Feb 2007 18:46]
Jeremy Cole
Hi Valeriy, Sorry, it probably won't be possible to test that application again, as it is a production system, and the problems they saw last time were very serious. Regards, Jeremy
[11 Aug 2007 12:49]
Konstantin Osipov
Closing as per comment from Jeremy.
[3 Aug 2009 17:02]
Konstantin Osipov
This was open for 2.5 years without being verified. Valeriy, please either close this bug or verify it. Thanks.
[3 Aug 2009 17:35]
Valeriy Kravchuk
I see no way to repeat this on current 5.0.x (other than NOT closing prepared statements properly). If anybody has a test case that shows some different source of related memory leak, please, provide it.