Streaming large files in a java servlet


When possible, you should not store the entire contents of a file to be served in memory. Instead, acquire an InputStream for the data and copy the data to the servlet OutputStream in pieces. For example:

ServletOutputStream out = response.getOutputStream();
InputStream in = [ code to get source input stream ];
String mimeType = [ code to get mimetype of data to be served ];
byte[] bytes = new byte[FILEBUFFERSIZE];
int bytesRead;

response.setContentType(mimeType);

while ((bytesRead = in.read(bytes)) != -1) {
    out.write(bytes, 0, bytesRead);
}

// do the following in a finally block:
in.close();
out.close();
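On Java 7 or later, the same copy can be written with try-with-resources so both streams are closed even if the copy fails partway through. This is only a sketch: getSourceInputStream(), getMimeType(), and FILE_BUFFER_SIZE are hypothetical placeholders for your application-specific code.

// Sketch only: getSourceInputStream(), getMimeType(), and FILE_BUFFER_SIZE
// stand in for application-specific code.
response.setContentType(getMimeType());
try (InputStream in = getSourceInputStream();
     ServletOutputStream out = response.getOutputStream()) {
    byte[] buffer = new byte[FILE_BUFFER_SIZE];
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
    }
}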

I do agree with toby: you should instead "point them to the S3 url."

As for the OOM exception, are you sure it has to do with serving the image data? Let's say your JVM has 256MB of "extra" memory to use for serving image data. With Google's help, 256MB / 200KB ≈ 1,310, so roughly 1,300 200KB images could be held in memory at once. With 2GB of "extra" memory (these days a very reasonable amount), over 10,000 simultaneous clients could be supported. Even so, 1,300 simultaneous clients is a pretty large number. Is this the type of load you experienced? If not, you may need to look elsewhere for the cause of the OOM exception.

Edit - Regarding:

In this use case the images can contain sensitive data...

When I read through the S3 documentation a few weeks ago, I noticed that you can generate time-expiring keys that can be attached to S3 URLs. So, you would not have to open up the files on S3 to the public. My understanding of the technique is:

  1. Initial HTML page has download links to your webapp
  2. User clicks on a download link
  3. Your webapp generates an S3 URL that includes a key that expires in, let's say, 5 minutes.
  4. Send an HTTP redirect to the client with the URL from step 3.
  5. The user downloads the file from S3. This works even if the download takes more than 5 minutes - once a download starts it can continue through completion.
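
A rough sketch of steps 3 and 4, assuming the AWS SDK for Java v1 and its generatePresignedUrl call; the helper class, bucket, and key names are placeholders, not anything from the original answer:

import java.io.IOException;
import java.net.URL;
import java.util.Date;
import javax.servlet.http.HttpServletResponse;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Hypothetical helper: issues a time-limited S3 URL and redirects the client to it.
public class S3DownloadRedirect {
    private final AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

    public void redirect(HttpServletResponse response, String bucket, String key)
            throws IOException {
        // Step 3: the signed URL stops working 5 minutes from now
        Date expiration = new Date(System.currentTimeMillis() + 5 * 60 * 1000);
        URL signedUrl = s3.generatePresignedUrl(bucket, key, expiration);
        // Step 4: send the client straight to S3
        response.sendRedirect(signedUrl.toString());
    }
}

Once the client follows the redirect, the download itself is served entirely by S3, which also addresses the bandwidth point raised in the next answer.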


Why wouldn't you just point them to the S3 url? To me, taking an artifact from S3 and then streaming it through your own server defeats the purpose of using S3, which is to offload the bandwidth and processing of serving the images to Amazon.


I've seen a lot of code like john-vasilef's (currently accepted) answer, a tight while loop reading chunks from one stream and writing them to the other stream.

The argument I'd make is against needless code duplication, in favor of using Apache's IOUtils. If you are already using it elsewhere, or if another library or framework you're using already depends on it, it's a single line that is known and well-tested.

In the following code, I'm streaming an object from Amazon S3 to the client in a servlet.

import java.io.InputStream;
import java.io.OutputStream;
import org.apache.commons.io.IOUtils;

InputStream in = null;
OutputStream out = null;

try {
    in = object.getObjectContent();
    out = response.getOutputStream();
    IOUtils.copy(in, out);
} finally {
    IOUtils.closeQuietly(in);
    IOUtils.closeQuietly(out);
}

6 lines of a well-defined pattern with proper stream closing seems pretty solid.