
BatchesTaskProcessor

Ruby gem for processing a large number of tasks of any kind in parallel using batches. A task can be cancelled at any time and rerun later; a rerun skips the items that were already processed, which dramatically reduces the processing time. The jobs created can be processed in the background (via background jobs) or in the foreground (inline).

Installation

Add this line to your application's Gemfile:

gem "batches_task_processor"

And then execute: bundle install && bundle exec rake db:migrate

Usage

  • Register a new task:
    The following processes 200k items with 10 parallel jobs, each in charge of 20k items (setting preload_job_items is recommended for performance reasons):
    task = BatchesTaskProcessor::Model.create!(
      key: 'my_process',
      data: Article.all.limit(200000).pluck(:id),
      qty_jobs: 10,
      queue_name: 'default',
      preload_job_items: 'Article.where(id: items)',
      # %q{} keeps #{item.id} literal until the callback runs; %{} would interpolate it here
      process_item: %q{
        puts "Processing article #{item.id}..."
        HugeArticleProcessor.new(item).call
      }
    )
    task.start!
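
Since the callbacks are passed as strings of Ruby code, the choice of string literal matters when the code contains `#{...}`. The following plain-Ruby sketch (independent of the gem) shows the difference between the interpolating `%{}` literal and the non-interpolating `%q{}` literal. `item` is deliberately undefined here, as it would be at the point where a task is registered.

```ruby
# %q{} does not interpolate, so the callback source keeps #{item.id} as-is.
literal = %q{puts "Processing article #{item.id}..."}
puts literal.include?('#{item.id}') # true

# %{} interpolates immediately, so Ruby tries to resolve `item` right here.
begin
  %{puts "Processing article #{item.id}..."}
rescue NameError => e
  puts "interpolated too early: #{e.class}"
end
```

A single-quoted string or a heredoc with a single-quoted delimiter (`<<~'RUBY'`) avoids interpolation in the same way.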

Task API

  • task.start! starts the task (initializes the jobs)
  • task.cancel cancels the task at any time and stops processing the items
  • task.export exports the processed items to a CSV file
  • task.status prints the current status of the task
  • task.items returns the items that were processed so far
    Each item includes the following attributes: { key: 'value from items', result: 'value returned by the process_item callback', error_details: 'error message from the process_item callback, if it failed' }
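
The cancel-and-rerun behaviour described above can be illustrated with a minimal plain-Ruby sketch. `ResumableTask` is a hypothetical stand-in, not the gem's implementation: it only mirrors the method names listed above and the documented item shape. It also assumes that a rerun is triggered by calling `start!` again and that failed items count as processed.

```ruby
# Hypothetical stand-in for a task (single-threaded, in-memory).
class ResumableTask
  attr_reader :items

  def initialize(data, &process_item)
    @data = data
    @process_item = process_item
    @items = [] # one record per processed item
    @cancelled = false
  end

  def cancel
    @cancelled = true
  end

  # Processes every item that has no record yet, stopping once cancelled.
  def start!
    @cancelled = false
    done = @items.map { |record| record[:key] }
    (@data - done).each do |key|
      break if @cancelled

      @items << process(key)
    end
    self
  end

  def status
    "#{@items.size}/#{@data.size} processed"
  end

  private

  # Builds a record with the same attributes as the documented item shape.
  def process(key)
    { key: key, result: @process_item.call(key), error_details: nil }
  rescue StandardError => e
    { key: key, result: nil, error_details: e.message }
  end
end

task = ResumableTask.new([1, 2, 3, 4]) do |item|
  task.cancel if item == 2 # simulate a cancel arriving mid-run
  raise 'boom' if item == 3

  item * 10
end

task.start!
puts task.status # 2/4 processed
task.start!      # rerun: items 1 and 2 are skipped
puts task.status # 4/4 processed
```

After the second run, the record for item 3 holds `error_details: "boom"` and a `nil` result, while the other records hold their results.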

TODO

  • update tests

API

Settings:

  • data (Array<Integer|String>) Array of all the items to be processed.
  • key (Mandatory) Key used to identify the task.
  • queue_name (String, default: 'default') Name of the background queue to use. If nil, the task is processed inline.
  • qty_jobs (Optional, default: 10) Number of jobs to create; all data items are distributed across this many jobs.
  • process_item (Mandatory) Callback run for each item; the item variable holds the current item value. Sample: 'Article.find(item).update_column(:title, "changed")'
  • preload_job_items (Optional) Callback used to preload the item list and/or associations; the items variable holds the current chunk of items to be processed. By default, it returns the same list. Sample: 'Article.where(id: items)'
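
To make the interplay of these settings concrete, here is a plain-Ruby sketch of how data could be split across qty_jobs and how string callbacks could see the `items` and `item` variables. The helpers `split_into_jobs`, `run_callback` and `run_job` are hypothetical names for illustration only; the gem's internals may differ.

```ruby
# Hypothetical helper: split `data` into at most `qty_jobs` contiguous chunks.
def split_into_jobs(data, qty_jobs)
  return [] if data.empty?

  per_job = (data.size.to_f / qty_jobs).ceil
  data.each_slice(per_job).to_a
end

# Hypothetical helper: evaluate a callback given as a string of Ruby code,
# exposing each keyword argument as a local variable of that code.
def run_callback(code, **locals)
  scope = binding
  locals.each { |name, value| scope.local_variable_set(name, value) }
  scope.eval(code)
end

# Hypothetical helper: preload the chunk once, then process item by item.
# The default preload callback returns the chunk unchanged.
def run_job(chunk, process_item:, preload_job_items: 'items')
  items = run_callback(preload_job_items, items: chunk)
  items.map { |item| run_callback(process_item, item: item) }
end

jobs = split_into_jobs((1..200_000).to_a, 10)
puts jobs.size       # 10
puts jobs.first.size # 20000

p run_job([1, 2, 3], process_item: 'item * 2') # [2, 4, 6]
```

Preloading per chunk means a callback such as 'Article.where(id: items)' can fetch the whole chunk in one query instead of one lookup per item, which is presumably why it is recommended for performance.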

Contributing

Contribution directions go here.

License

The gem is available as open source under the terms of the MIT License.