sidekiq_search
This gem provides a uniform way to programmatically access Sidekiq jobs.
Often, there is a need to retrieve a job from Sidekiq and do something with it, for example, to reschedule an already scheduled job in a Rails app. Sidekiq has an API for this, but it is not that easy to use, for the following reasons:
- enqueued, scheduled/retried/dead and running job sets (and job objects) have slightly different interfaces
- the API and job data structures can change from version to version
- the API details are a bit hard to recall for most of us mere mortals
sidekiq_search is an answer to these problems.
A word of caution
What this gem does, and how it does it, may be suboptimal in certain circumstances, but there seems to be no better way to do it in the OSS version of Sidekiq. Later in this document I explain the shortcomings of the approach taken, but be warned that if your application has a lot of jobs, you might run into memory and/or performance issues.
Usage
# The gem has a single method, `.jobs`. It simply returns an array of hashes
# where each hash contains job parameters.
jobs = SidekiqSearch.jobs(
# Specify the categories where you want to look up the jobs.
# For the full list, see `SidekiqSearch::JOB_CATEGORIES`
from_categories: ['scheduled', 'dead'],
# Specify the queues where you want to look up the jobs.
from_queues: ['default', 'your_custom_queue'],
)
#=>
# [
# {
# job_object: #<Sidekiq::SortedEntry:…>,
# class: "YourCustomJob",
# arguments: ['foo', 3],
# …
# },
# …
# ]
# For the full list of attributes see the `serialize_*` methods
# in the source code.
# Finally, process the collection:
that_job = jobs.find do |job|
job[:class] == 'YourCustomJob' && job.dig(:arguments, 1) == 3
end
that_job[:job_object]
#=> #<Sidekiq::…>
# Will return either a `Sidekiq::JobRecord` (if the job is from the enqueued
# or running category) or a `Sidekiq::SortedEntry` (if the job is from the
# scheduled, retried or dead categories).
# For the jobs from running category, the job hash will contain extra
# fields; please refer to the source code for more details.
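As a rough sketch of the rescheduling use case mentioned in the introduction, the snippet below looks up a job in the returned array and reschedules it. `reschedule_first` is a hypothetical helper, not part of the gem, and `FakeSortedEntry` is a stand-in for `Sidekiq::SortedEntry` (whose `reschedule(at)` method is the only part of its interface assumed here) so that the example runs without Redis. In a real app the array would come from `SidekiqSearch.jobs`.

```ruby
# Stand-in for Sidekiq::SortedEntry so the sketch runs without Redis.
# The real class exposes `reschedule(at)`; this fake only records the time.
FakeSortedEntry = Struct.new(:at) do
  def reschedule(new_at)
    self.at = new_at
  end
end

# Hypothetical helper (not part of the gem): reschedule the first job of
# the given class, if its job object supports rescheduling.
def reschedule_first(jobs, job_class:, at:)
  job = jobs.find { |j| j[:class] == job_class }
  return false if job.nil?

  job_object = job[:job_object]
  # Skip job objects that do not respond to `reschedule`
  # (assumed here to be the case for non-sorted-set entries).
  return false unless job_object.respond_to?(:reschedule)

  job_object.reschedule(at)
  true
end

# Sample data shaped like the hashes shown above.
entry = FakeSortedEntry.new(Time.at(0))
jobs = [{ job_object: entry, class: 'YourCustomJob', arguments: ['foo', 3] }]

rescheduled = reschedule_first(jobs, job_class: 'YourCustomJob', at: Time.at(3600))
```

The `respond_to?` guard is there because not every category returns the same kind of job object, so the helper returns `false` instead of raising when rescheduling is not possible.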
Shortcomings
There are two ways to retrieve jobs from Sidekiq: either call #to_a on a category set (like Sidekiq::ScheduledSet, for example), or use the scan method. The latter is much more efficient, both in terms of memory and performance, but unfortunately it is only available for the scheduled, retried and dead categories. If one needs to find out what jobs are currently executing or are in the enqueued state, getting an array of all the jobs in the category is the only way.
Another difficulty with scan is that it is quite low-level (down to Redis and the job JSON payload) and implementation-dependent. So using the "to_a-way" was the natural, and only, choice.
The problem with it is that a copy of all the job data has to be loaded into memory first, and only then can we start searching, filtering, mapping, etc. This is why the gem uses an opt-in approach and requires you to explicitly specify the queues and categories in which to search for jobs.
Still, on a large system even getting jobs for just one queue and one category may take too much memory, so make sure you understand the risks.
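One possible way to reduce that risk is to check how many jobs are involved before loading them. The sketch below is only an illustration: `within_job_limit?` is a hypothetical helper, not part of the gem, and it assumes nothing more than that each collection responds to `size` (Sidekiq's queue and set objects do). Plain arrays stand in for them here so that the example runs without Redis.

```ruby
# Hypothetical guard (not part of the gem): given objects that respond to
# `size`, return true only if their combined size is within `limit`.
def within_job_limit?(collections, limit:)
  collections.sum(&:size) <= limit
end

# With Sidekiq loaded and Redis available, the collections could be, e.g.:
#   [Sidekiq::ScheduledSet.new, Sidekiq::Queue.new('default')]
# Plain arrays stand in here so the sketch runs on its own.
small = [Array.new(10), Array.new(5)]
large = [Array.new(10_000)]

small_ok = within_job_limit?(small, limit: 1_000)
large_ok = within_job_limit?(large, limit: 1_000)
```

Note that a category set such as the scheduled set holds jobs from every queue, so its size is only an upper bound on what a search restricted to specific queues would load.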
Development
To develop and experiment you will most likely need some jobs. Development scripts expect them to be in the jobs folder, in the gem's root folder. It is listed in .gitignore so that everyone can keep whatever jobs they like there.
bin/console gives you a dev console with a _flush_all method to quickly wipe all existing jobs.
bin/sidekiq launches Sidekiq locally. It expects Sidekiq configuration to be present in bin/sidekiq_config.yml.
bin/ui starts Sidekiq's UI at http://localhost:3000.