How To Make Amazing Background Jobs In Ruby

David Morales (@davidmles)

Computer engineer. Working as a web developer since 2000.

You are developing a Ruby application where the user can sign up, and after submitting the form, the user has to receive an email. Will you send it immediately? If so, the user will have to wait while the application connects to the email server and sends the actual email. That’s bad for the user experience.

To solve that problem, the email should be sent in the background. That way it could be enqueued to be sent later, and the user sees the confirmation page quickly. Much better!

This is accomplished by a queue system, which runs in the background waiting for new jobs to execute. A job is simply a task to be run; in our case, sending the email is the job.
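To make the idea concrete, here is a minimal sketch of a queue with one background worker, using only Ruby’s built-in Queue and Thread classes. It is an illustration, not how any of the systems below are implemented: the jobs live in memory, the sent array stands in for the email server, and the addresses are made up.

```ruby
# Hypothetical in-process queue: real systems persist jobs in a database or Redis.
jobs = Queue.new          # thread-safe FIFO from Ruby's core library
sent = []                 # stands in for the email server

# The worker waits in the background for new jobs and executes them in order.
worker = Thread.new do
  while (job = jobs.pop) != :stop
    sent << "Welcome email to #{job[:to]}"
  end
end

# The web request only enqueues the job and returns right away.
jobs << { to: "alice@example.com" }
jobs << { to: "bob@example.com" }

jobs << :stop             # tell the worker to finish once the queue is drained
worker.join
puts sent
```

The request path only pays for pushing a small hash onto the queue; the slow work happens in the worker.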

There are a few popular systems. Some need a database, such as Delayed::Job, while others rely on Redis, such as Resque and Sidekiq.

Some common jobs that should be executed in the background are:

  • Sending emails
  • Resizing images
  • Importing a batch of data
  • Updating a search server

Delayed::Job

This system needs a database, because it uses a table to manage jobs. The “delayed” part in its name comes from the way it enqueues a job: using the delay method. So if we have an object to be run like this:

object.method!

With Delayed::Job it can be enqueued this way:

object.delay.method!

It’s only one method in the middle, very handy. However, the class can also be adapted so a method is always processed by Delayed::Job asynchronously. That’s accomplished using the handle_asynchronously helper:

class StatsWorker
  def calculate_totals(param1, param2)
    # long running method
  end

  handle_asynchronously :calculate_totals
end

stats_worker = StatsWorker.new
stats_worker.calculate_totals(param1, param2) # no need to call delay

Delayed::Job can even assign priorities, run jobs at a given time, and work with multiple queues. For instance, to run a job in 5 minutes with a priority of 2, enqueue it like this:

object.delay(run_at: 5.minutes.from_now, priority: 2).method!

Or, using the handle_asynchronously feature, we could write it as follows:

handle_asynchronously :calculate_totals,
                      run_at: proc { 5.minutes.from_now },
                      priority: 2

And then call the method as before:

stats_worker.calculate_totals(param1, param2)
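If you wonder how a single delay call in the middle can postpone a method, one way to picture it is as a proxy object that records the call instead of running it. The sketch below is plain Ruby under that assumption; DelayProxy, Delayable and the JOBS array are hypothetical names, not part of the Delayed::Job gem, which stores its jobs in a database table rather than in memory.

```ruby
# Hypothetical stand-in for the jobs table.
JOBS = []

# Records any method call made on it instead of executing it.
class DelayProxy
  def initialize(target, options, store)
    @target = target
    @options = options
    @store = store
  end

  def method_missing(name, *args)
    return super unless @target.respond_to?(name)

    @store << { target: @target, method: name, args: args, options: @options }
    true
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Adds a delay method that returns the recording proxy.
module Delayable
  def delay(options = {})
    DelayProxy.new(self, options, JOBS)
  end
end

class StatsWorker
  include Delayable

  def calculate_totals(param1, param2)
    param1 + param2 # placeholder for the long running work
  end
end

worker = StatsWorker.new
worker.delay(priority: 2).calculate_totals(1, 2) # recorded, not executed

# Later, a background worker would load each record and run it.
results = JOBS.map { |job| job[:target].public_send(job[:method], *job[:args]) }
puts results.inspect
```

The caller gets control back as soon as the call is recorded; the actual work runs whenever the records are processed.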

Resque

Resque was inspired by Delayed::Job, but it uses Redis instead of a database. It also provides a Sinatra application to monitor jobs so you can see which ones are running and their queues.

To make a class compatible with Resque so it can be run in the background, a new method must be implemented: perform. This is the method that will be called by Resque:

class StatsWorker
  @queue = :stats_queue

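  # Resque calls this class method with the enqueued arguments
  def self.perform(param1, param2)
    # calculate_totals is an instance method, so it needs an instance
    new.calculate_totals(param1, param2)
  end

  def calculate_totals(param1, param2)
    # long running method
  end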
end

Resque.enqueue(StatsWorker, param1, param2)

The self.perform method can do anything; it doesn’t need to be a “caller” like in this case. I could have moved the long-running code inside it.

To schedule a job to be run at a specified time, Resque needs the resque-scheduler gem. Once installed, the jobs can be enqueued like this:

Resque.enqueue_in(5.minutes, StatsWorker, param1, param2)
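Conceptually, scheduling a job means storing it together with the time it becomes due, and only handing it to the workers once that time has passed. The following plain-Ruby sketch illustrates that idea with a fake clock; it is not resque-scheduler’s actual implementation, and the Scheduled struct and the two lambdas are hypothetical names.

```ruby
# Hypothetical scheduled set: each entry stores when the job becomes due.
Scheduled = Struct.new(:run_at, :job)

schedule = []
now = Time.at(0) # fixed fake clock, so the sketch does not need to sleep

# Like enqueue_in: store the job with a due time instead of running it.
enqueue_in = ->(seconds, job) { schedule << Scheduled.new(now + seconds, job) }

enqueue_in.call(300, "calculate totals") # due in 5 minutes
enqueue_in.call(60, "send reminder")     # due in 1 minute

# Hand over the jobs whose time has come and keep the rest waiting.
due_at = lambda do |time|
  ready, waiting = schedule.partition { |entry| entry.run_at <= time }
  schedule.replace(waiting)
  ready.sort_by(&:run_at).map(&:job)
end

after_two_minutes  = due_at.call(now + 120)
after_five_minutes = due_at.call(now + 300)
puts after_two_minutes.inspect
puts after_five_minutes.inspect
```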

Resque doesn’t support numeric priorities like Delayed::Job does. Instead, priority is based on the order in which the queues are listed: the first ones have higher priority. This is defined when launching the worker process:

QUEUES=high,low rake resque:work

Resque will keep checking the “high” queue and running its jobs. Only when that queue is empty will it check the “low” queue.

Then the class must define the queue to use:

class StatsWorker
  @queue = :high

  # rest of the code
end
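The strict ordering between the “high” and “low” queues can be sketched in a few lines of plain Ruby. This is only an illustration of the rule described above, with in-memory arrays instead of Redis lists; next_job and the job names are hypothetical.

```ruby
# Hypothetical in-memory queues; Resque keeps these lists in Redis.
queues = {
  "high" => ["resize avatar"],
  "low"  => ["rebuild search index", "import CSV"]
}

# Return the next job, always checking the queues in the order they were listed.
def next_job(queues, order)
  order.each do |name|
    job = queues[name].shift
    return [name, job] if job
  end
  nil
end

order = %w[high low] # same order as QUEUES=high,low

processed = []
processed << next_job(queues, order) # "high" has a job, so it goes first
processed << next_job(queues, order) # "high" is empty, so "low" is checked
queues["high"] << "send invoice"     # a new high-priority job arrives
processed << next_job(queues, order) # it jumps ahead of the remaining "low" job
processed << next_job(queues, order)
puts processed.inspect
```

Note that with strict ordering, a busy “high” queue can keep “low” jobs waiting indefinitely.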

Sidekiq

Sidekiq also needs Redis to work, but its main difference is that it uses threads, so several jobs can be executed concurrently within the same Ruby process. Also, it uses the same message format as Resque, so the migration should be easy.

Given its use of threads, it’s clear that its main goal is to be the fastest job processing system for Ruby. It is reported to be several times faster than Resque and Delayed::Job.
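To see why threads help, here is a small sketch of several worker threads sharing one queue inside a single Ruby process. It only uses Ruby’s core Queue and Thread classes and is not Sidekiq’s code; the sleep call is an assumption that stands in for a job waiting on I/O, which is where threads pay off most in standard Ruby.

```ruby
jobs = Queue.new
5.times { |number| jobs << number }
jobs.close # once closed and empty, pop returns nil and the workers stop

done = Queue.new # thread-safe place to collect the results

# Three threads in the same process pull jobs from the same queue.
workers = Array.new(3) do |worker_id|
  Thread.new do
    while (job = jobs.pop)
      sleep 0.05 # simulated wait on the network or the disk
      done << [worker_id, job]
    end
  end
end
workers.each(&:join)

results = []
results << done.pop until done.empty?
puts results.inspect
```

While one thread is waiting, the others keep working, so the five jobs finish sooner than they would one after another.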

Like Resque, it includes a web application to monitor jobs: which ones are scheduled, which ones failed and are pending retry, and so on.

Since it’s compatible with Resque, classes to be processed by Sidekiq must also implement a perform method, this time as an instance method:

class StatsWorker
  include Sidekiq::Worker

  def perform(param1, param2)
    calculate_totals(param1, param2)
  end

  private

  def calculate_totals(param1, param2)
    # long running method
  end
end

StatsWorker.perform_async(param1, param2)

Scheduling this job to run in the future is very similar:

StatsWorker.perform_in(5.minutes, param1, param2)

To assign a priority to a job, it uses the same approach as Resque: strict queue ordering. The queues listed first have higher priority. This is defined when launching Sidekiq:

sidekiq -q critical -q default

This is one way to define queues. They can also be defined as weighted queues, so a queue with weight 3 is checked three times as often as a queue with weight 1:

sidekiq -q critical,3 -q default
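One way to picture weighted queues is a raffle in which each queue gets as many tickets as its weight. The sketch below is plain Ruby and only illustrates that idea; it is not Sidekiq’s actual fetch code, and the fixed random seed is there just to make the run repeatable.

```ruby
# Hypothetical weights matching `-q critical,3 -q default` (default weight is 1).
weights = { "critical" => 3, "default" => 1 }

# Each queue name appears once per unit of weight...
pool = weights.flat_map { |name, weight| [name] * weight }

# ...and a random draw decides which queue gets checked.
rng = Random.new(42) # fixed seed, only so the sketch is repeatable
picks = Array.new(10_000) { pool.sample(random: rng) }

counts = picks.tally
ratio = counts["critical"].to_f / counts["default"]
puts counts.inspect
puts ratio.round(2)
```

Unlike strict ordering, the “default” queue still gets checked regularly even when “critical” is busy.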

As in Resque, the class must define the queue to use:

class StatsWorker
  include Sidekiq::Worker
  sidekiq_options queue: :critical

  # rest of the code
end

ActiveJob

Because each system has its own syntax, Ruby on Rails includes ActiveJob, a standard interface to these systems. The application stays agnostic and can switch between them while keeping the same syntax, much like ActiveRecord does for databases.

The first step is to configure it with the chosen system. Let’s assume that we use Sidekiq.
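In a Rails application, that configuration is typically a single line. The fragment below assumes the sidekiq gem is already in the Gemfile; MyApp is a hypothetical application name.

```ruby
# config/application.rb
module MyApp # hypothetical application name
  class Application < Rails::Application
    # Send every ActiveJob job to Sidekiq
    config.active_job.queue_adapter = :sidekiq
  end
end
```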

A job looks like this:

class StatsJob < ApplicationJob
  queue_as :default

  def perform(param1, param2)
    calculate_totals(param1, param2)
  end

  private

  def calculate_totals(param1, param2)
    # long running method
  end
end

StatsJob.perform_later(param1, param2)

As you can see, the queue for a job class is set with queue_as. It can also be set when enqueuing the job:

StatsJob.set(queue: :high).perform_later(param1, param2)

Note that the worker process of the chosen queuing system must be told to listen on these queues when it is started.

Of course, it supports scheduling:

StatsJob.set(wait: 5.minutes).perform_later(param1, param2)

Conclusion

Background jobs are an essential part of any medium or large project. Choosing a queue management system is a decision to be made sooner or later. Although there are other actively maintained alternatives out there, Delayed::Job, Resque and Sidekiq are the most popular ones.

I recommend Sidekiq because it’s easy to use and efficient. It’s also very fast compared to Delayed::Job and Resque. Its excellent user interface to manage jobs and queues is also a good point.

Nonetheless, it requires Redis, so if your application is short on RAM, it may be a good idea to stick with Delayed::Job and your database. It would be a good start, and by using ActiveJob you can always migrate to Sidekiq in the future without changing a single line of code in your jobs.

Also published at https://davidmles.medium.com/a-quick-look-at-background-jobs-in-ruby-43e6176aeee5
