Bug #80615 prepared statement leak when rewriteBatchedStatements=true and useServerPrepStmt
Submitted: 4 Mar 2016 13:55 Modified: 12 Oct 2016 19:28
Reporter: Julian Reschke Email Updates:
Status: Closed Impact on me: None
Category: Connector / J Severity: S2 (Serious)
Version: 5.1.38 OS: Any
Assigned to: Filipe Silva CPU Architecture: Any

[4 Mar 2016 13:55] Julian Reschke
Description:
When connecting with rewriteBatchedStatements=true and useServerPrepStmts=true, the connector apparently can fail to properly close prepared statements on the server; it's quite possible that this only happens when concurrent threads access the database.

How to repeat:
See bug report https://issues.apache.org/jira/browse/OAK-4082. We can reproduce this in Apache Jackrabbit OAK (trunk) by running 

java -cp mysql.jar:oak-run-1.6-SNAPSHOT.jar \
      org.apache.jackrabbit.oak.run.Main benchmark \
      ConcurrentCreateNodesTest Oak-RDB \
      --rdbjdbcuri 'jdbc:mysql:...?&useServerPrepStmts=true&rewriteBatchedStatements=true&useConfigs=maxPerformance' \
      --rdbjdbcuser ... \
      --rdbjdbcpasswd ... \
      --dropDBAfterTest false

The problem seems to be caused by a single INSERT statement used in a batch; when that statement is marked as non-poolable, the problems go away. Note that Statement.setPoolable() is just a hint for the JDBC driver; under no circumstances should its use (or non-use) affect anything but performance.
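To make the pattern concrete, here is a minimal sketch of the kind of batched INSERT in question. The URL, credentials, and table are hypothetical (not taken from the OAK code), and it needs a running MySQL server plus Connector/J on the classpath, so it is illustrative rather than a drop-in test:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        // Connection options matching the failing configuration.
        String url = "jdbc:mysql://localhost:3306/test"
                + "?useServerPrepStmts=true&cachePrepStmts=true"
                + "&rewriteBatchedStatements=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO t (id, data) VALUES (?, ?)")) {
            // Per the JDBC spec this is only a hint, but it is what makes
            // the difference in this report.
            ps.setPoolable(false);
            for (int i = 0; i < 100; i++) {
                ps.setInt(1, i);
                ps.setString(2, "value-" + i);
                ps.addBatch();
            }
            // With rewriteBatchedStatements=true the driver prepares a new,
            // rewritten multi-row INSERT behind the scenes.
            ps.executeBatch();
        }
    }
}
```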
[16 Mar 2016 17:06] Filipe Silva
Hi Julian,

Thank you for this bug report.

I was able to verify the behavior you reported but I failed to fully understand your specific scenario.

This issue happens because either you have some server prepared statements leakage on this server (you are not closing server prepared statements somewhere) or you have set the prepared statement cache size too high (higher than or close to the value of the global server variable 'max_prepared_stmt_count').

From what I've seen (briefly) of your code, you are closing all the statements you open. So, unless the statement cache is very large (and I don't see you changing the default size of 25), the remaining suspect is some other process creating statements without closing them. After all, you mentioned something about concurrency too.

Check this blog post (http://mysqlblog.fivefarmers.com/2012/07/05/whos-leaking-prepared-statements/), it may help you find the culprits.

For me to be able to analyze this further (beyond the issue I already found) I'll need some more information:
- Please rerun with latest Connector/J version and provide a full stack trace.
- What server version are you using?
- Consult the counters (global and session) of server prepared statements when the exception is thrown (use the method described in the blog post above).
- Is there anything else in the database connection string we are not seeing? If so please explain.

It's worth noting that, for me, marking the statements as non-poolable helps trigger the problem, while marking them as poolable makes things better, since that way the statements are actually closed when the oldest entries are evicted from the cache, as opposed to your case.
[16 Mar 2016 17:28] Julian Reschke
"I was able to verify the behavior you reported but I failed to fully understand your specific scenario."

So you were able to reproduce the problem?

"This issue happens because either you have some server prepared statements leakage on this server (you are not closing server prepared statements somewhere) or you have set the prepared statement cache size too high (higher than or close to the value of the global server variable 'max_prepared_stmt_count')."

I don't believe there is leakage in the application code. As described in <https://issues.apache.org/jira/browse/OAK-4082>, the problem occurs if and only if "rewriteBatchedStatements=true" and the batch insert statement is marked as poolable. That clearly indicates that there is a bug in the connector, no?

"From what I've seen (briefly) of your code, you are closing all the statements you open. So, unless the statement cache is very large (and I don't see you changing the default size of 25), the remaining suspect is some other process creating statements without closing them. After all, you mentioned something about concurrency too."

The statement cache has the default size. I mentioned concurrency because I believe it's a concurrency problem in the connector.

"For me to be able to analyze this further (beyond the issue I already found) I'll need some more information:"

...what is the issue you already found?

"- Please rerun with latest Connector/J version and provide a full stack trace."

I am running the latest connector; the stack trace is:

15:45:58 Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Can't create more than max_prepared_stmt_count statements (current value: 16382)
15:45:58 	at sun.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
15:45:58 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
15:45:58 	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
15:45:58 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:400)
15:45:58 	at com.mysql.jdbc.Util.getInstance(Util.java:383)
15:45:58 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:980)
15:45:58 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3847)
15:45:58 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3783)
15:45:58 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2447)
15:45:58 	at com.mysql.jdbc.ServerPreparedStatement.serverPrepare(ServerPreparedStatement.java:1514)
15:45:58 	at com.mysql.jdbc.ServerPreparedStatement.<init>(ServerPreparedStatement.java:389)
15:45:58 	at com.mysql.jdbc.ServerPreparedStatement.prepareBatchedInsertSQL(ServerPreparedStatement.java:2832)
15:45:58 	at com.mysql.jdbc.PreparedStatement.executeBatchedInserts(PreparedStatement.java:1524)
15:45:58 	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1262)
15:45:58 	at org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStoreJDBC.insert(RDBDocumentStoreJDBC.java:302)
15:45:58 	at org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStoreJDBC.update(RDBDocumentStoreJDBC.java:446)
15:45:58 	at org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.bulkUpdate(RDBDocumentStore.java:456)
15:45:58 	... 21 more

(see <https://issues.apache.org/jira/browse/OAK-4082>)

"- What server version are you using?"

5.7.11.

"- Consult the counters (global and session) of server prepared statements when the exception is thrown (use the method described in the blog post above)."

I'll try to get to that.

"- Is there anything else in the database connection string we are not seeing? If so please explain."

No, the only part I left out are server name and port number.

"It's worth noting that, for me, marking the statements as non-poolable helps trigger the problem, while marking them as poolable makes things better, since that way the statements are actually closed when the oldest entries are evicted from the cache, as opposed to your case."

Did you try to run the test mentioned in the ticket? It's just a matter of checking out "apache oak", building, and invoking the test as explained in the ticket...
[16 Mar 2016 18:39] Filipe Silva
As I said, I was able to verify a bug that fits the overall description from your first report, but not exactly the same situation. The main difference is that for me it works with poolable statements while it fails otherwise. But this depends on the combination of configurations I use and on whether I close the statements or not.

Well, actually this report hits more than one bug.
Regarding the usage of "rewriteBatchedStatements=true": if the maximum number of prepared statements (max_prepared_stmt_count) was already reached when this statement is executed, then, under certain circumstances, it will fail to create a new server prepared statement (this is the stack trace you are actually getting), no matter whether you are caching or pooling the prepared statements. In normal circumstances, when this happens, the connector automatically falls back to a client prepared statement (unless 'emulateUnsupportedPstmts=false'), but that is not happening here. Actually, I bet that if you set 'emulateUnsupportedPstmts=false' you'll see this failing in places other than the batch execution, and probably sooner.
So, the question is "why are you reaching the max_prepared_stmt_count value?". And this may well be the second bug here. So, let's try to figure out this one first. From what I've found so far, a bad combination of not calling Statement.close(), or marking statements as non-poolable while the prepared statement cache is enabled (in your example turned on by the property 'useConfigs=maxPerformance'), or tampering with the prepared statement cache size can cause this. But in the end it's always a server prepared statement leakage.
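One way to observe such a leakage directly is to compare the server's 'Prepared_stmt_count' status variable before and after a burst of prepare/close cycles. A sketch, assuming a reachable test server and hypothetical credentials; it requires a live MySQL instance, so it is illustrative only:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PrepStmtLeakCheck {

    // Reads the server-wide count of currently open prepared statements.
    static int preparedStmtCount(Connection conn) throws Exception {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SHOW GLOBAL STATUS LIKE 'Prepared_stmt_count'")) {
            rs.next();
            return rs.getInt(2);  // column 2 holds the status value
        }
    }

    public static void main(String[] args) throws Exception {
        // cachePrepStmts is what makes setPoolable(false) harmful in the
        // affected driver versions.
        String url = "jdbc:mysql://localhost:3306/test"
                + "?useServerPrepStmts=true&cachePrepStmts=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass")) {
            int before = preparedStmtCount(conn);
            for (int i = 0; i < 50; i++) {
                // The leading comment makes each SQL string distinct so the
                // client-side cache cannot satisfy it; every iteration
                // prepares a fresh server-side statement.
                PreparedStatement ps = conn.prepareStatement(
                        "/* " + i + " */ SELECT ?");
                ps.setPoolable(false);  // the suspect combination
                ps.setInt(1, i);
                ps.executeQuery().close();
                ps.close();  // should deallocate the statement on the server
            }
            int after = preparedStmtCount(conn);
            // With the bug present, 'after' stays well above 'before'
            // even though every statement was closed.
            System.out.println("before=" + before + ", after=" + after);
        }
    }
}
```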

Regarding this being a concurrency issue, there is nothing pointing to that conclusion yet. So, unless proven otherwise, it's not a concurrency issue, except for a possible prepared statement leakage caused somewhere else that affects how the code here behaves.

The stack trace you provided is *not* from the latest Connector/J. For example, in 5.1.38 the frame "com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1262)" no longer exists.

Thank you for the information you already provided. And no, I didn't try to run the original code from the ticket. It helps if you provide full instructions on how to do it or, even better, a self-contained test case using just the Connector/J.
[16 Mar 2016 19:47] Julian Reschke
Thanks for the feedback, and sorry for the confusion. The initial bug report apparently indeed used a different driver. The actual stack trace is:

Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Can't create more than max_prepared_stmt_count statements (current value: 16382)
        at sun.reflect.GeneratedConstructorAccessor17.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
        at com.mysql.jdbc.Util.getInstance(Util.java:387)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:939)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3878)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3814)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2478)
        at com.mysql.jdbc.ServerPreparedStatement.serverPrepare(ServerPreparedStatement.java:1500)
        at com.mysql.jdbc.ServerPreparedStatement.<init>(ServerPreparedStatement.java:390)
        at com.mysql.jdbc.ServerPreparedStatement.prepareBatchedInsertSQL(ServerPreparedStatement.java:2822)
        at com.mysql.jdbc.PreparedStatement.executeBatchedInserts(PreparedStatement.java:1509)
        at com.mysql.jdbc.PreparedStatement.executeBatchInternal(PreparedStatement.java:1248)
        at com.mysql.jdbc.StatementImpl.executeBatch(StatementImpl.java:959)
        at org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStoreJDBC.insert(RDBDocumentStoreJDBC.java:301)

The recipe for reproducing this is: 

# 1) Checkout Apache oak trunk
svn co https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/
cd trunk

# 2) build with mysql profile
mvn clean install -DskipTests -Prdb-mysql

# 3) Run test
cd oak-run
java -cp target/oak-run-1.6-SNAPSHOT.jar org.apache.jackrabbit.oak.run.Main benchmark \
  ConcurrentCreateNodesTest Oak-RDB \
  --rdbjdbcuri "jdbc:mysql://localhost:3307/oak?useServerPrepStmts=true&useConfigs=maxPerformance&rewriteBatchedStatements=true" \
  --rdbjdbcuser XXX --rdbjdbcpasswd YYY \
  --rdbjdbctableprefix test
[16 Mar 2016 22:09] Filipe Silva
Ok, thanks.

Please allow me some time to analyze this further. In the meantime let me know if you find something relevant for this issue.
[17 Mar 2016 1:42] Filipe Silva
I followed your recipe but this isn't failing for me. At the moment I tested this, your code was at revision r1735267, and that's the one I executed first.

After checking your code history I noticed that revision r1734941 contains a patch for the affected file. Running your code from that revision also works for me, but not from the previous revision (r1734527). That one fails with the exact symptom you described.

So, I took the failing revision (r1734527) and analyzed the file core/src/main/java/org/apache/jackrabbit/oak/plugins/document/rdb/RDBDocumentStoreJDBC.java, where I found two leaks, at lines 421 and 617. In these lines you call "(...).setPoolable(false);", which fits one of the "bad combinations" I described earlier. The test ran perfectly after commenting out these two lines in this revision (actually, commenting out the first one was enough; I could still observe the leak, it just wasn't enough to break the test).

Moving on to the latest revision (r1735267), I see that the leak in the method update() is gone, but it is still present in the method read(), now at line 588. You can observe the leak by following the procedure described in the blog post I mentioned earlier: simply compare the number of active prepared statements on the server when running the test with and without the line I just referred to, by repeatedly executing the following query in a mysql client:
--
SELECT sp.thread_id,
       t.processlist_user user,
       t.processlist_host host,
       sp.count_star - sd.count_star open_com_query_ps,
       cp.count_star - cc.count_star open_com_prepare_ps
  FROM (SELECT count_star, thread_id
          FROM events_statements_summary_by_thread_by_event_name
         WHERE event_name = 'statement/sql/prepare_sql') sp
  JOIN (SELECT count_star, thread_id
          FROM events_statements_summary_by_thread_by_event_name
         WHERE event_name = 'statement/com/Prepare') cp
    ON (cp.thread_id = sp.thread_id)
  JOIN (SELECT count_star, thread_id
          FROM events_statements_summary_by_thread_by_event_name
         WHERE event_name = 'statement/sql/dealloc_sql') sd
    ON (sd.thread_id = sp.thread_id)
  JOIN (SELECT count_star, thread_id
          FROM events_statements_summary_by_thread_by_event_name
         WHERE event_name = 'statement/com/Close stmt') cc
    ON (cc.thread_id = sp.thread_id)
  JOIN threads t ON (t.thread_id = sp.thread_id)
 ORDER BY GREATEST(open_com_query_ps, open_com_prepare_ps) ASC;
--

You'll observe that the number of open prepared statements is significantly lower when that line is not there.

As I explained before, Connector/J has two bugs here. The second bug, the one I mentioned earlier, causes a server prepared statement leakage by not actually closing the statement on the server when .close() is called on the client side. This occurs when "cachePrepStmts=true&useServerPrepStmts=true" is set and a prepared statement is marked as non-poolable ("setPoolable(false)"). By itself this doesn't cause the failure you observed, because the connector has a fail-safe mechanism that falls back to client-side prepared statements when server-side prepared statements fail to be prepared. So, your test ends up using client-side prepared statements after a while, specifically once the number of open prepared statements on the server hits the value of 'max_prepared_stmt_count'. And this is where the first bug enters the equation: with 'rewriteBatchedStatements=true&useServerPrepStmts=true' set, when a query happens to be rewritten, a new (server-side) prepared statement is required, but the fail-safe mechanism isn't implemented in that spot. Since the previous leakage already consumed all available prepared statements on the server, this ends up throwing the exception you know very well.

As an immediate solution, on your side, you can avoid marking prepared statements as non-poolable when using server-side prepared statements. I hope that helps. Please let me know if this works for you for the time being.
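In application code the workaround boils down to skipping the setPoolable(false) hint whenever server-side prepared statements are enabled. A hedged sketch; the helper and its URL-sniffing heuristic are my own illustration, not Connector/J API:

```java
public class PoolableWorkaround {
    // Heuristic: treat a statement as safe to mark non-poolable only when
    // the JDBC URL does not enable server-side prepared statements. This
    // avoids the leaking combination described above.
    static boolean shouldMarkNonPoolable(String jdbcUrl) {
        return !jdbcUrl.contains("useServerPrepStmts=true");
    }
}
```

A caller would then wrap prepareStatement() and only invoke setPoolable(false) when this returns true.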

This report is considered as "verified".
[17 Mar 2016 11:15] Julian Reschke
Again, thanks a lot for the feedback. It's still strange that the problem doesn't surface with the latest code base; it does for me.

That said, your analysis sounds solid. As for workarounds, we'll probably just recommend not setting rewriteBatchedStatements=true.

Please let me know should you have something I should test...
[26 Aug 2016 14:15] Julian Reschke
So what's the status? Is it fixed? Will it be part of 5.1.40?
[29 Aug 2016 16:13] N.S. Cutler
Indeed, what is the status for 5.1.40?

IIUC this is a pretty serious bug. With HikariCP as the underlying connection pool we're seeing `open_com_prepare_ps` climb steadily higher despite calling close() on the client.

This is with prepared statement properties:
cachePrepStmts="true"
useServerPrepStmts="true"
prepStmtCacheSize="500"

and salient bits from jdbc url:
cacheServerConfiguration=true&
rewriteBatchedStatements=true&
dontTrackOpenResources=true&
useLocalSessionState=true

Is the workaround to set rewriteBatchedStatements to false? I have tried caching statements on the server with the above config options over the past year or two, against MySQL 5.6 and the latest versions of Connector/J, but have never had success (i.e. I have had to turn off server-side statement caching) due to increased memory usage and an eventual OOME.
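For context, the settings above correspond roughly to a HikariCP setup like the following (a sketch with a placeholder URL and credentials; addDataSourceProperty passes the prepared statement options through to Connector/J):

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static HikariDataSource dataSource() {
        HikariConfig cfg = new HikariConfig();
        // Placeholder URL; the flags mirror the settings quoted above.
        cfg.setJdbcUrl("jdbc:mysql://localhost:3306/app"
                + "?cacheServerConfiguration=true"
                + "&rewriteBatchedStatements=true"
                + "&dontTrackOpenResources=true"
                + "&useLocalSessionState=true");
        cfg.setUsername("user");
        cfg.setPassword("pass");
        // Prepared statement properties forwarded to the driver.
        cfg.addDataSourceProperty("cachePrepStmts", "true");
        cfg.addDataSourceProperty("useServerPrepStmts", "true");
        cfg.addDataSourceProperty("prepStmtCacheSize", "500");
        return new HikariDataSource(cfg);
    }
}
```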
[29 Aug 2016 19:13] Filipe Silva
Connector/J 5.1.40 development is in progress. There is nothing else I can say about it, my apologies.

Regarding the workaround you're asking about, setting 'rewriteBatchedStatements=false' prevents the errors described before, but not the prepared statement leakage itself (assuming that you're also setting your statements as non-poolable).

The fix for this issue is being prepared.
[29 Aug 2016 20:12] N.S. Cutler
Filipe, thanks (and Julian as well for posting the issue).

I looked at the HikariCP source and don't see any explicit calls to `setPoolable`. Likewise in the query DSL; under the hood there are no calls to `setPoolable`. That sounds like the primary culprit in terms of statement leakage pre-5.1.40.

While I was away, the open statements (at ~1,500) were all closed, and the count is now climbing back up again. I'm not sure if this is normal behavior: lots of open prepared statements that are then cleaned up en masse after a delay of several hours. I would have thought that a call to close() would immediately close the connection and the related statement(s).

Anyway, if this continues it's unlikely that we'll hit max prepared statement limit and the specific bug Julian reported. Will keep an eye out for 5.1.40 regardless.

Thanks for working on this.
[4 Oct 2016 12:02] Julian Reschke
So is this fixed in yesterday's 5.1.40? The release notes seem to imply that...
[12 Oct 2016 18:47] Filipe Silva
Yes, this bug is fixed in Connector/J 5.1.40.

Thank you for your help.
[12 Oct 2016 19:28] Daniel So
Added the following entry to the Connector/J 5.1.40 changelog:

"A memory leakage occurred when the connection properties cachePrepStmts and
useServerPrepStmts were both set to be true and server-side prepared
statements were set as non-poolable, which resulted in the prepared statement
being not closable by the client, and the number of prepared statements then
kept on increasing.

When the memory leakage described above occurred, it did not make Connector/J
fail, as Connector/J switched to using client-side prepared statements when
the maximum number of prepared statements was reached. However, when
rewriteBatchedStatements was also set to true, the switch to client-side
prepared statements did not occur, and Connector/J threw the
MySQLSyntaxErrorException ("Can't create more than max_prepared_stmt_count
statements") when the client wanted to create more prepared statements than
allowed.

This fix corrected the way prepared statements are handled in order to avoid
both of the problems described above."
[14 Oct 2016 14:58] Daniel So
The fix has been included in Connector/J 6.0.5. The entry for the 5.1.40
changelog has been included into the 6.0.5 changelog.