Our goal for the Heroku platform has been to create a totally smooth and seamless experience for getting your Ruby web application online. Web apps revolve around one or more dynamic web processes: what Rubyists often call a mongrel, and what we call a dyno. When it comes to dynos, we think we’ve really nailed it, and nothing makes that more tangible than the ease of scaling your app with the dyno slider.
But most serious web apps have a second aspect: one that gets less attention, but one that is often just as important as the web process. Like the dark side of the moon, this second half of your application is not directly visible to users, but without it the app would not be whole. This aspect of the application is often referred to as background jobs. The processes which handle background jobs are called workers.
A worker is used for any heavy lifting your app needs to do. Some examples include: sending email, accessing a remote API (like posting something to Twitter), fetching posts from an RSS feed, generating an image thumbnail, or rendering a PDF. Anything that will take more than 100ms should go in a worker process, not your web process.
Ad-hoc Background Jobs
There are many ways to create worker processes that run asynchronously from your web requests. Examples include forking a web process, or running a cron job every minute to check for work. Those ad-hoc techniques work OK in some situations, but they are not highly maintainable or scalable. At Heroku, we don’t want to do something unless we can do it right. We want our workers to be as smooth, powerful, scalable, and easy to use as our dynos.
Background Jobs Done Right
Worker processes on Heroku run completely independently of your dynos. They can be scaled out horizontally across our grid of instances to any size, independent of the number of dynos you’re running. (Some apps need more concurrency for the web front, some for the background workers – scaling the two should be orthogonal.)
One way to describe a mongrel or dyno is as a share-nothing process that consumes web requests. Likewise, a worker can be described as a share-nothing process that consumes jobs from a work queue.
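To make the "consumes jobs from a work queue" idea concrete, here is a minimal sketch in plain Ruby. It is only an illustration, not how Heroku workers are implemented: threads stand in for separate worker processes, Ruby's built-in Queue stands in for the work queue, and the helper names start_worker and run_jobs are made up for this example.

```ruby
# A toy work queue. Threads stand in for worker processes (an assumption
# made so the example fits in a single file).

# Hypothetical helper: starts one worker that pops jobs until it sees :stop.
def start_worker(jobs, results)
  Thread.new do
    # Queue#pop blocks until a job is available.
    while (job = jobs.pop) != :stop
      results << job.call
    end
  end
end

# Hypothetical helper: enqueues job_count jobs, runs them on worker_count
# workers, and returns the collected results in sorted order.
def run_jobs(job_count, worker_count)
  jobs = Queue.new
  results = Queue.new
  workers = Array.new(worker_count) { start_worker(jobs, results) }

  # The "web process" side: enqueue the work and move on without waiting.
  job_count.times { |i| jobs << -> { "thumbnail-#{i}.png" } }

  # One stop marker per worker, then wait for the workers to drain the queue.
  worker_count.times { jobs << :stop }
  workers.each(&:join)

  Array.new(results.size) { results.pop }.sort
end

puts run_jobs(3, 2)
```

The workers share nothing with each other except the queue, so the worker count can be changed without touching the code that enqueues jobs.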
Work Queues
Work queues are currently an area of academic debate among thought leaders in the Ruby community. Work queues consist of two parts: a queueing system and a message bus. The message bus may be an implementation of a messaging standard, or a custom protocol.
Some examples of queueing systems include: Delayed::Job and Starling.
Some examples of message buses include: RabbitMQ (implementing the AMQP protocol), Beanstalkd, Amazon SQS, and Kestrel.
These are some great tools, many of which are in production use by major Ruby sites. For example, Twitter uses Kestrel, while Scribd uses loops and ActiveMQ. But when the largest Rails sites in the world don’t have a consensus on the best tool for the job, what should the rest of us be using?
Luckily, there’s an excellent solution for medium-sized apps that has been quietly gaining momentum in the Rails world. That solution is Delayed::Job.
Delayed Job
DJ is an easy-as-pie plugin for Rails written by Tobias Lütke. (It can be used with Sinatra, too.) It uses the database as a message bus instead of an external daemon. Using the database as the message bus isn’t as high-speed or featureful as a dedicated daemon like RabbitMQ, but it’s easier to set up, and more than enough for most sites – including GitHub.
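The idea of using the database as the message bus can be sketched in a few lines of plain Ruby. This is not DJ's code or its real schema: the JobTable class below is a hypothetical in-memory stand-in for a jobs table, NewsletterJob is an invented example job, and Marshal is used for serialization purely to keep the example dependency-free. The one convention borrowed from DJ is that a job is an object with a perform method.

```ruby
# Hypothetical job: a plain object that knows how to perform its work.
NewsletterJob = Struct.new(:address) do
  def perform
    "sent newsletter to #{address}"
  end
end

# In-memory stand-in for a jobs table (an assumption for illustration;
# a real setup would store these rows in the app's database).
class JobTable
  def initialize
    @rows = []
    @next_id = 1
  end

  # Web process side: serialize the job into a row and return right away.
  def enqueue(job)
    @rows << { id: @next_id, handler: Marshal.dump(job), locked_by: nil }
    @next_id += 1
  end

  # Worker side: claim the oldest unlocked row, run it, then delete the row.
  # Returns the job's result, or nil when there is nothing to do.
  def work_off(worker_name)
    row = @rows.find { |r| r[:locked_by].nil? }
    return nil unless row

    row[:locked_by] = worker_name
    result = Marshal.load(row[:handler]).perform
    @rows.delete(row)
    result
  end

  def size
    @rows.size
  end
end

table = JobTable.new
table.enqueue(NewsletterJob.new("user@example.com"))
puts table.work_off("worker-1")
```

Because the queue is just rows in a table, the web process and the worker only need to agree on the database, which is why no extra daemon is required.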
John Nunemaker has an excellent tutorial on DJ, and I’ve previously illustrated the steps for building a queue-backed feed reader. So there’s plenty of material to get you going with this plugin.
DJ on Heroku
We’re pleased to announce the first iteration of support for background jobs on Heroku, in the form of DJ support. This has been in heavy beta use by a large group of Heroku beta users (thanks guys!) for the last four months, so we feel very confident that this is a solid solution, ready for real production use today.
Our first publicly-priced offering is a single DJ worker, at $15/mo. You can activate it for your app through the usual add-on interface:
$ heroku addons:add dj
Adding dj to myapp...done.
Follow the instructions for installing the DJ plugin and creating the delayed_jobs table, and you’ll be up and running in minutes. (That’s the smooth, seamless experience you’ve come to expect from Heroku.)
More to Come
If you need something beyond the publicly-priced single-worker DJ, contact us. Multi-worker setups and a more powerful message bus (RabbitMQ) are already in late alpha / early beta status. If you’ve got an app that you think can put these technologies to the test, we’d love to hear from you.