
Gems/Services for autoscaling Heroku's dynos and workers


We ran into this a while ago and I spent quite a bit of time on it, to my great frustration. I'll try to stick to the salient points. There are several Heroku autoscaling solutions that seem decent at first glance.

The example that has already been given, heroku-autoscaler, is actually for autoscaling dynos and is pretty much the only solution out there that claims to do this (and it certainly doesn't do it well). Most others only claim to autoscale workers for you, so let's focus on that first. The autoscalers you'll look at for workers depend on what you're actually using for your background workers, e.g. delayed_job or resque. Those are the most common background processing libs, so the autoscalers will try to hook into one of them. You can use things like:

Some of these work on the Cedar stack; some might need a bit of tweaking. The problem with all of them is that it's like trying to pull yourself out of the swamp by your own hair. Let's take hirefire as an example (it's probably the best one of the lot). It modifies delayed_job so that the workers themselves can look at the queue and spin up more workers if necessary; if there are no more jobs in the queue, the workers all shut each other down. There are several problems:

  • if you want to put a job on the queue to be executed in the future, as opposed to right now, you're out of luck. A worker starts up when a job enters the queue, but since the job is to be executed in the future the worker will shut down again and will not start up unless another job enters the queue (that's the only thing that prompts workers to start up) - see the sketch after this list
  • you lose the ability to retry failed jobs. This is possible by default in delayed_job, but there is a delay before a failed job is retried (and progressively longer delays if it fails multiple times), and the workers will shut down during that delay with nothing to prompt them to start up again (in essence this is the same issue as in the first scenario)
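
To make the first problem concrete, here is roughly what such a future-dated job looks like with delayed_job (a hedged sketch: NightlyReportJob is a made-up class, and 6.hours.from_now assumes a Rails/ActiveSupport app):

    # Illustrative delayed_job payload class - the name is made up.
    class NightlyReportJob < Struct.new(:account_id)
      def perform
        # ... generate and email the report ...
      end
    end

    # Standard delayed_job scheduling via :run_at. At enqueue time an
    # autoscaled worker may boot, see nothing runnable yet, and shut
    # itself down again; nothing boots a worker when run_at arrives.
    Delayed::Job.enqueue(NightlyReportJob.new(42), run_at: 6.hours.from_now)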

The thing that solves this problem is to have one worker running continuously; it can then monitor the queue periodically, execute jobs when necessary, and even spin up more workers. But if you do that, you're not saving any money (you have a worker running 24/7 and have to pay for it), and saving money is the whole premise behind autoscalers on Heroku. In essence, if you only have occasional background processing to do, or you have background jobs that are likely to fail but succeed on retry, or you have background jobs that don't need to be executed instantly, there is no autoscaling library that will work for you.
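
For what it's worth, the always-on approach itself is not much code. Here is a minimal sketch of one way to do it, assuming today's platform-api gem, an ActiveRecord-backed delayed_job, and made-up thresholds and environment variable names; you'd run it as its own always-on process type so that scaling the worker formation doesn't kill the monitor itself:

    # monitor.rb - run 24/7 as its own process type, e.g. in the Procfile:
    #   monitor: bundle exec rails runner monitor.rb
    # Sketch only: thresholds, env var names and the scaling rule are assumptions.
    require 'platform-api'

    heroku = PlatformAPI.connect_oauth(ENV.fetch('HEROKU_OAUTH_TOKEN'))
    app    = ENV.fetch('HEROKU_APP')

    loop do
      # Jobs that are due now and have not permanently failed.
      pending = Delayed::Job.where(failed_at: nil)
                            .where('run_at <= ?', Time.now)
                            .count

      # Crude rule of thumb: one worker per 25 due jobs, capped at 5,
      # scaling down to zero when the queue is empty.
      desired = [(pending / 25.0).ceil, 5].min

      heroku.formation.update(app, 'worker', 'quantity' => desired)

      sleep 60 # check the queue once a minute
    end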

Here is one alternative. The guy who wrote Hirefire later spun it off into a web app (the Hirefire app), the essence of which is to externally monitor your Heroku workers/dynos for you and spin workers up and down as necessary. This was free in beta but now costs money: less than what you'd pay to run a worker 24/7, but still not insignificant if you only need a few background jobs once in a while. Either way, this is the only viable way to make sure your background job infrastructure does what you want (well, that and rolling your own solution, which means having a machine like an EC2 instance where you can put some scripts that will ping your Heroku app and spin workers up and down as needed - a non-trivial amount of effort).
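
For completeness, the roll-your-own script is conceptually just a cron job on that external box. A rough sketch, again assuming the platform-api gem, a made-up /queue_size endpoint exposed by the app, and made-up env var names:

    # scale_check.rb - run from cron on the external machine.
    # Sketch only: the /queue_size endpoint and env var names are invented.
    require 'json'
    require 'net/http'
    require 'platform-api'

    app    = ENV.fetch('HEROKU_APP')
    heroku = PlatformAPI.connect_oauth(ENV.fetch('HEROKU_OAUTH_TOKEN'))

    # Ping the Heroku app for its current queue depth.
    body       = Net::HTTP.get(URI("https://#{app}.herokuapp.com/queue_size"))
    queue_size = JSON.parse(body)['queue_size']

    # Same crude rule: scale to zero when idle, cap at 5 workers.
    desired = [(queue_size / 25.0).ceil, 5].min
    heroku.formation.update(app, 'worker', 'quantity' => desired)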

Now, the Hirefire app does offer to autoscale your dynos as well; it does this by hooking into the latency of your Heroku request queue. However, I found that this didn't work well. Perhaps if you're close to the Amazon datacenter where your Heroku app actually lives (we weren't), you might have a different experience, but for us it unnecessarily spun up a whole bunch of dynos and would never spin them down, no matter how much I tweaked the settings. You can put it down to the fact that it was a beta - it may have improved since then - but that's the experience I had.

Long story short: if you want to autoscale your workers, use the Hirefire app. You'll be saving a lot less money than you thought, but it's still the cheapest option. If you want to autoscale dynos, you're basically out of luck. This is just one of those limitations you live with in exchange for the convenience of a platform like Heroku.


Heroku is offering a new add-on called AdeptScale, which is now just out of beta.

Here is the add-on page for AdeptScale

Here is the more detailed documentation for AdeptScale

Here is the form to sign up for Heroku's Beta Program

Hopefully this will be a robust solution for autoscaling Heroku dynos, as I'm still not happy with the current options.

Update (2/4/13): I signed up for Heroku's beta program to try out this add-on, and it's worked really well for me. It occasionally scales up with traffic, but mostly sits at the minimum of 2 dynos I've set. It has greatly reduced my bill and eliminated the worry that the app might be slow during peak usage times.

Update (3/6/13): Added link to Heroku's Sign up page for their beta program.

Update (4/14/13): Looks like auto-scaling is out of Beta. It's still working really well for me.


HireFire.io (The Service, not the Open Source Project) now allows you to use your New Relic metrics to auto-scale your web dynos. New Relic is a performance monitoring tool provided as an add-on through Heroku. They have a free tier and it's sufficient to use with HireFire.

You can auto-scale based on:

  • Response Time
    • This is the Response Time you find on the New Relic Dashboard. It's a combination of various factors including Request Queuing, Database Performance, App-Layer, Router, etc.
  • Apdex Score
    • This allows you to scale based on your New Relic Apdex Score, i.e. on measured user experience/satisfaction.

Aside from this, we have become language/framework agnostic. For worker dynos, all you have to do to get auto-scaling working is to set up a JSON end-point at a certain path in your app that returns a very simple JSON string containing the queue size (we provide convenient, but not required, macros for Ruby and some out-of-the-box support for Django apps, but like I said, it works for any language/framework if you set up the JSON end-point manually - it's very easy). For web dynos, you can use the HireFire Metric Source with basically any language/framework, and the above-mentioned New Relic Metric Source for languages/frameworks supported by New Relic (common languages such as Ruby, Python, Java, etc.).
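
To give an idea of how small that end-point is, here is a sketch for a Rails app using delayed_job; the route and payload shape are illustrative only - the exact path and JSON format HireFire expects are spelled out in its documentation:

    # app/controllers/queue_status_controller.rb (illustrative name and shape)
    class QueueStatusController < ApplicationController
      def show
        # Report how many jobs are waiting so the service can scale workers.
        render json: { queue_size: Delayed::Job.where(failed_at: nil).count }
      end
    end

    # config/routes.rb (illustrative path)
    get '/queue_status' => 'queue_status#show'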

Disclaimer: I built HireFire.