MySQL Bugs: #1605: support LONGBLOB ! we get java.lang.OutOfMemoryError when having 60MB blob

Bug #1605	support LONGBLOB ! we get java.lang.OutOfMemoryError when having 60MB blob
Submitted:	20 Oct 2003 5:38	Modified:	11 Nov 2009 2:35
Reporter:	Ralf Hauser	Email Updates:
Status:	Won't fix	Impact on me:	None
Category:	Connector / J	Severity:	S2 (Serious)
Version:	3.1.0 nightly (oct 3)	OS:	Linux (RH 9)
Assigned to:	Mark Matthews	CPU Architecture:	Any

Description:
When getting a large file out of the DB, we got a java.lang.OutOfMemoryError.
It seems that reason for this is that your implementation of InputStream takes an approach that is not memory efficient, i.e. not suited for LONGBLOB since most machines do not have 4G+ RAM and ByteArrayInputStream apparently loads all into memory before continuing to do next steps... - is my analysis right?

Our Tomcat 4.1.24 crashed with "java.lang.OutOfMemoryError" while accessing a large TestBlob of ~ 60MB

How to repeat:
every time

Suggested fix:
is my analysis right?

This is a limitation of the MySQL protocol (it does not send 'chunked streams' for blobs...The client has to read the _entire_ blob into memory.

Also, realize that most JVMs only allow a default max heap of 64M, so that if you want to work around that, you need to use the '-Xmx' parameter to size the heap appropriately (-Xmx1024M for example would allow the JVM to allocate up to 1GB of heap space).

Future versions of the MySQL protcol are planned to support 'chunking' of blob data.

I retrieve large BLOBs by using repeatedly retrieving only sections of the BLOB using substring, ie:
select substring(document, 1, 10240) from documents where docid='3';
and then 
select substring(document, 10241, 10240) from documents where docid='3';
etc.
My servlet turns around and immediately writes the partial response to the client so even the servlet does not need to allocate very much memory.  In practice, I use 10MB partial queries.

Daniel,

You mentioned how you worked around reading a large blob 10meg at a time with substring; how did you store the large blob in the database?

Do you have a code sample of your work around?

Thank You!

I ran into the attached problem:  (sounds like the same issue?)
Hor ironic you reported this a year ago to the day.  Still no fix?

Hope someone can help, this is my first post. 

Im getting an out of memory error trying to store large blobs in a mysql
database. I think I have tracked the issue down to connector j. 

I have code like this to set the blob field with a java.io.InputStream: 
InputStream fstream = new FileInputStream("C:\\c.zip"); 
int a = fstream.available(); 
m_PreStmt.setBinaryStream(5,fstream,a); 

When the file size gets over 100meg, or 200 meg; I get an out of memory error. 
My system has over a gig of memory, and I have the JVM set to 500meg. 

My understanding of setting a blob to an inputstream is that this should avoid
the 
whole (chunk of data) being loaded into memory at once... that the stream would

be read a little at a time. 

I have tried several versions of connector j 3.0.11, 3.0.1.5.ga, and 3.1.4-beta,
all give the same out of memory error. 

In the source for connector j 3.1.4-beta the error occurs in the method
ensureCapacity(int additionalData) 

The line of code is: byte[] newBytes = new byte[newLength]; 

It seems that the entire chunk of memory is being allocated? 

Can anyone offer any help? 

also this seems to be the same type problem as reported in bug # 1605 
http://bugs.mysql.com/bug.php?id=1605 

Thank You! 

final void ensureCapacity(int additionalData) throws SQLException { 
if ((this.position + additionalData) > getBufLength()) { 
if ((this.position + additionalData) < this.byteBuffer.length) { 
// byteBuffer.length is != getBufLength() all of the time 
// due to re-using of packets (we don't shrink them) 
// 
// If we can, don't re-alloc, just set buffer length 
// to size of current buffer 
setBufLength(this.byteBuffer.length); 
} else { 
// 
// Otherwise, re-size, and pad so we can avoid 
// allocing again in the near future 
// 
int newLength = (int) (this.byteBuffer.length * 1.25); 

if (newLength < (this.byteBuffer.length + additionalData)) { 
newLength = this.byteBuffer.length 
+ (int) (additionalData * 1.25); 
} 

if (newLength < this.byteBuffer.length) { 
newLength = this.byteBuffer.length + additionalData; 
} 

---> byte[] newBytes = new byte[newLength]; 

System.arraycopy(this.byteBuffer, 0, newBytes, 0, 
this.byteBuffer.length); 
this.byteBuffer = newBytes; 
setBufLength(this.byteBuffer.length); 
} 
} 
} 

How to repeat:
Set a blob field to a large file (200 meg or more)
like this

InputStream fstream = new FileInputStream("C:\\c.zip"); 
int a = fstream.available(); 
m_PreStmt.setBinaryStream(5,fstream,a); 

Try inserting the row into a table.
You will get an out of memory error (I have my JVM set to 500 meg)

Suggested fix:
This is the mthod causing the out of memory error in connector j
I copied this from 3.1.14-beta version, but the same proble exists in 3.0.11 and
3.0.15.ga

The specific line of code is: byte[] newBytes = new byte[newLength]; 

final void ensureCapacity(int additionalData) throws SQLException { 
if ((this.position + additionalData) > getBufLength()) { 
if ((this.position + additionalData) < this.byteBuffer.length) { 
// byteBuffer.length is != getBufLength() all of the time 
// due to re-using of packets (we don't shrink them) 
// 
// If we can, don't re-alloc, just set buffer length 
// to size of current buffer 
setBufLength(this.byteBuffer.length); 
} else { 
// 
// Otherwise, re-size, and pad so we can avoid 
// allocing again in the near future 
// 
int newLength = (int) (this.byteBuffer.length * 1.25); 

if (newLength < (this.byteBuffer.length + additionalData)) { 
newLength = this.byteBuffer.length 
+ (int) (additionalData * 1.25); 
} 

if (newLength < this.byteBuffer.length) { 
newLength = this.byteBuffer.length + additionalData; 
} 

---> byte[] newBytes = new byte[newLength]; 

System.arraycopy(this.byteBuffer, 0, newBytes, 0, 
this.byteBuffer.length); 
this.byteBuffer = newBytes; 
setBufLength(this.byteBuffer.length); 
} 
} 
}

One option certainly would be to immediately stream it to the disk like org.apache.commons.fileupload are doing it.
If doing so, please do not forget security (see http://issues.apache.org/bugzilla/show_bug.cgi?id=32145 and http://issues.apache.org/bugzilla/show_bug.cgi?id=32144)