Serving files stored in S3 in express/nodejs app

I would just stream it from S3. It's very easy, and signed URLs are much more difficult. Just make sure you set the Content-Type and Content-Length headers when you upload the images to S3.

var aws = require('knox').createClient({
  key: '',
  secret: '',
  bucket: ''
})

app.get('/image/:id', function (req, res, next) {
  if (!req.user.is.authenticated) {
    var err = new Error()
    err.status = 403
    next(err)
    return
  }

  aws.get('/image/' + req.params.id)
  .on('error', next)
  .on('response', function (resp) {
    if (resp.statusCode !== 200) {
      var err = new Error()
      err.status = 404
      next(err)
      return
    }

    res.setHeader('Content-Length', resp.headers['content-length'])
    res.setHeader('Content-Type', resp.headers['content-type'])

    // cache-control?
    // etag?
    // last-modified?
    // expires?

    if (req.fresh) {
      res.statusCode = 304
      res.end()
      return
    }

    if (req.method === 'HEAD') {
      res.statusCode = 200
      res.end()
      return
    }

    resp.pipe(res)
  })
})


If you redirect the user to a signed URL using a 302 Found, the browser will cache the resulting image according to its Cache-Control header and won't request it a second time.

To prevent the browser from caching the signed URL itself, you should send a proper Cache-Control header along with it:

Cache-Control: private, no-cache, no-store, must-revalidate

So the next time, the browser will send a request to the original URL and will be redirected to a new signed URL.
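A minimal sketch of that redirect response (the signed URL itself is a placeholder here; how you generate it is covered below):

```javascript
// Redirect to a signed URL while telling the browser not to cache
// the redirect itself, so the next page load hits our server again.
function privateRedirect(res, signedUrl) {
  res.setHeader('Location', signedUrl)
  res.setHeader('Cache-Control', 'private, no-cache, no-store, must-revalidate')
  res.statusCode = 302
  res.end()
}
```

The function works on any plain `http.ServerResponse`, so it can be called from an Express route handler as well.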

You can generate a signed URL with knox using its signedUrl method.

But don't forget to set the proper headers on every uploaded image. I'd recommend using both the Cache-Control and Expires headers, because some browsers have no support for Cache-Control, and Expires only lets you set an absolute expiration time.
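For example, headers like these could be attached at upload time (with knox, the third argument to putFile takes a headers object; the one-year lifetime here is just an illustrative choice):

```javascript
// Cache headers to attach when uploading an image to S3.
// One year is a common lifetime for images that never change.
function uploadHeaders(contentType) {
  var oneYearMs = 365 * 24 * 60 * 60 * 1000;
  return {
    'Content-Type': contentType,
    'Cache-Control': 'public, max-age=31536000',
    'Expires': new Date(Date.now() + oneYearMs).toUTCString()
  };
}
```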

With the second option (streaming images through your app) you'll have better control over the situation. For example, you'll be able to generate an Expires header for each response based on the current date and time.

But what about speed? Using signed URLs has two advantages which may affect page load speed.

First, you won't overload your server. Generating signed URLs is fast, because you're just hashing your AWS credentials. Streaming images through your server, on the other hand, requires maintaining a lot of extra connections during page load. That said, it won't make any actual difference unless your server is heavily loaded.

Second, browsers keep only a small number of parallel connections per hostname during page load (the HTTP/1.1 spec suggested two). So the browser will keep resolving image URLs in parallel while downloading them, and the image downloads won't block the downloading of other resources.

Anyway, to be absolutely sure you should run some benchmarks. My answer is based on my knowledge of the HTTP specification and my experience in web development, but I've never tried to serve images this way myself. Serving public images with a long cache lifetime directly from S3 increases page speed, and I believe the situation won't change if you do it through redirects.

And you should keep in mind that streaming images through your server will bring all the benefits of Amazon CloudFront to naught. But as long as you're serving content directly from S3, both options will work fine.

Thus, there are two cases when using signed URLs should speed up your page:

  • If you have a lot of images on a single page.
  • If you're serving images using CloudFront.

If you have only a few images on each page and serve them directly from S3, you probably won't see any difference at all.

Important Update

I ran some tests and found that I was wrong about caching. It's true that browsers cache images they were redirected to. But the browser associates the cached image with the URL it was redirected to, not with the original one. So when the browser loads the page a second time, it requests the image from the server again instead of fetching it from the cache. Of course, if the server responds with the same redirect URL it sent the first time, the browser will use its cache, but that's not the case with signed URLs.

I found that forcing the browser to cache the signed URL as well as the data it receives solves the problem. But I don't like the idea of caching an invalid redirect URL: if the browser somehow misses the image, it'll try to request it again using the expired signed URL from its cache. So I don't think it's an option.

And it doesn't matter whether CloudFront serves images faster or whether browsers limit the number of parallel downloads per hostname: the advantage of using the browser cache exceeds all the disadvantages of piping images through your server.

And it looks like most social networks solve the problem of private images by hiding their actual URLs behind private proxies. So they store all their content on public servers, but there is no way to obtain the URL of a private image without authorization. Of course, if you open a private image in a new tab and send the URL to a friend, he'll be able to see the image too. So if that's not acceptable for you, then it'll be best to use Jonathan Ong's solution.


I would be concerned about using the CloudFront option if the photos really do need to remain private. It seems like you'll have a lot more flexibility administering your own security policy. I think the nginx setup may be more complex than necessary. Express should give you very good performance working as a reverse proxy, where it uses request to fetch items from S3 and streams them through to authorized users. I would highly recommend taking a look at Asset Rack, which uses hash signatures to enable permanent caching in the browser. You won't be able to use the default racks, because you need to calculate the MD5 of each file (perhaps on upload?), which you can't do while streaming. But depending on your application, it could save you a lot of effort if browsers never need to refetch the images.