Project

jp

0.0
Repository is archived
No commit activity in last 3 years
No release in over 3 years
Ruby components of jp - client and server
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020
2021
2022
2023
2024
2025
 Dependencies

Runtime

~> 1.3.1
~> 0.3.2
~> 0.6.0
 Project Readme

Overview¶ ↑

jp (Job Pool) is a producer/consumer job pooling system; I’m calling it a job pool rather than a job queue, because it *does not enforce strict ordering*.

A few of the basic principles behind this project:

  • A producer/consumer system

  • Easily usable from multiple languages

  • If a consumer starts working on a message in the pool, then crashes, eventually, without human interaction, the message will eventually be given to another consumer - jp uses a lock-work-purge system, with a timeout on the lock

  • All messages must be persistent

  • It’s not required that messages are consumed in /exactly/ the order they are added to the pool

  • It must be easy to setup an HA installation

  • It must be horizontally scalable

Supported languages¶ ↑

jp uses Thrift for communication with consumers and producers, and should work with any of the languages Thrift supports. At the time of writing, this includes:

  • C++

  • Java

  • Python

  • PHP

  • Ruby

  • Erlang

  • Perl

  • Haskell

  • C#

  • Cocoa

  • Smalltalk

  • Ocaml

Consumers and producers can be written in different languages.

Additionally, as an alternative to using the Thrift interface directly, simplified client libraries are currently provided for:

  • C++

  • Ruby

Requirements¶ ↑

  • Ruby 1.9

  • Thrift (with Ruby support)

  • mongo rubygem

  • bson rubygem

  • rev rubygem

  • A mongodb server/cluster to connect to

  • Ant (simple java client build only)

You might additionally want the bson_ext rubygem, which normally increases performance - however, it appears to be incompatible with Ruby 1.9.0 on Lenny.

We strongly recommend using Ruby 1.9.1 or later - this is also available for Lenny via backports.

Additional requirements for the C++ simple library and examples¶ ↑

  • libevent (and development headers - normally libevent-dev or similar)

Additional requirements for the Java simple library and examples¶ ↑

  • ant

Message lifecycle¶ ↑

  • You can prevent the locked -> unlocked transition from occuring by using a consumer-side timeout

  • Without a consumer-side timeout, there is a possibility that multiple consumers will be processing the same message at the same time

Restrictions¶ ↑

  • Messages must be idempotent (i.e. it doesn’t matter how many times they are consumed, as long as it’s at least once - may be twice, may be hundreds). jp will run them only once in the normal case, but there are other possibilities.

  • Unless you only run one consumer per pool, or you implement your own locking, it must be safe for multiple consumers to process the same message at the same time.

  • Message delivery order can’t matter. In some circumstances, you might be able to work around this with a ‘version’ field serialized into the message (if later messages do not depend upon previous messages having been acted on).

Serialization¶ ↑

jp simply passes around BLOBs as messages; it’s entirely up to your application what serialization format you use - here’s a few suggestions that are fairly portable between different languages, though you can use any other method you like:

  • Thrift via memory buffer, and your choice of protocol - the Ruby library provides a Thrift::Serialize class using the memory buffer transport and the binary protocol

  • UTF-8 if it’s simple character data

  • UTF-8 JSON

  • XML

  • UTF-8 YAML

  • Google’s Protocol Buffers

If you’re willing to be locked into using the same language for producers and consumers, you can, of course use platform-specific functionality such as:

  • serialize/unserialize in PHP

  • Marshall in Ruby

  • Data::Dumper in Perl

  • writeObject/readObject (from Serializable) in Java

It is, however, generally preferable to have more flexibility for the future by using a serialization method from the first list, or similar.

Reasons for out-of-order messages¶ ↑

There are (at least) three ways messages can be consumed out of order:

  • A consumer fails to complete processing within a per-pool time limit, so the message is re-queued, and may then be executed after other items that were queued while it was being processed

  • Two items end up being inserted on different mongodb servers, in the same second (then, within that second, the ordering is decided by a byte-by-byte comparison of the mongodb-generated ‘_id’ field)

  • Two consumers are running against the same queue, and one runs somewhat faster than another; depending on what your consumers do, this may lead to out-of-order execution

The first of these is unavoidable if multiple concurrent consumers are allowed per pool.

Getting started¶ ↑

Follow the instructions on the mongodb website, or, for a quick start, download one of the tarballs from www.mongodb.org/downloads, then:

tar xfv /path/to/mongodb-OS-arch-version.tgz
cd mongodb-OS-arch-version
mkdir /var/tmp/mongodata
bin/mongod --dbpath /var/tmp/mongodata

There are also packages for various distributions linked from the bottom of that page; use them in preference to the version included in your distribution. For example, the ‘mongodb’ version in Ubuntu at the time of writing doesn’t support the find_and_modify operation (too old), however mongodb-stable from the repositories listed on the mongodb site does.

Build/generate the Thrift bindings¶ ↑

To build all the generated components, run:

make

Run jp¶ ↑

./jp examples/jp-config.rb

This starts jp using the configuration file in the examples directory. This:

  • connects to a mongodb server on localhost

  • sets up ‘text’, ‘json’, and ‘thrift’ pools for the examples

Examples¶ ↑

Look at the contents of examples/ for an idea of how to use this - to build them, and their dependencies, run ‘make examples’ from the top-most directory.

Producer and consumer examples using the Thrift API are available for:

  • C++

  • Java

  • Ruby

Producer and consumer examples using the simplified libraries are available for:

  • C++

  • Ruby

  • Java