Cuboid Framework -- a decentralized & distributed computing framework in Ruby.

Summary

The Cuboid Framework offers the possibility of easily creating decentralized and distributed applications in Ruby.

In hipper terms, you can very easily set up your own specialized Cloud or Cloud within a Cloud.

In older-fashioned terms, you can build load-balanced, on-demand, clustered applications and even super-computers -- see Peplum.

It offers:

Load-balancing of Instances via a network (Grid) of Agents.
- No need to set up a topology manually; Agents reach convergence on their own once pointed at an existing Grid member.
- Scaling up and down can be easily achieved by plugging or unplugging nodes.
- Horizontal (default) and vertical workload distribution strategies available.
- Fault tolerant -- one application per process (Instance).
- Self-healing -- keeps an eye out for disappearing and also re-appearing members.
A clean and simple framework for application development.
- Basically, Ruby -- and all the libraries and extensions that come with it.
Events (pause, resume, abort, suspend, restore).
- Suspending to disk is a cinch: a snapshot of the application's runtime environment is created automatically and stored to disk for later restoration of state and data.
  - This also allows running jobs to be transferred.
Management of Instances via RPC or REST APIs.
- Aside from what Cuboid uses, custom serializers can be specified for application related objects.
Developer freedom.
- Apart from keeping Data and State separate, there are not many other rules to follow.
  - Even that only matters if you are interested in suspensions, and it can be left to the last minute if necessary -- in cases of Ractor-enforced isolation, for example.

Entities

Application

A Ruby class which inherits from Cuboid::Application and complies with a few simple specifications; at the least, a #run method, serving as the execution entry point.

The application can use the following methods to better take advantage of the framework:

#validate_options( options ) -- Validates Application options.
#provision_cores( Fixnum ) -- Specifies the maximum number of cores the Application will be using.
#provision_memory( Fixnum ) -- Specifies the maximum amount of RAM the Application will be using.
#provision_disk( Fixnum ) -- Specifies the maximum amount of disk space the Application will be using.
#handler_for( Symbol, Symbol ) -- Specifies methods to handle the following events:
- :pause
- :resume
- :abort
- :suspend
- :restore
instance_service_for( Symbol, Class ) -- Adds a custom Instance RPC API.
rest_service_for( Symbol, Module ) -- Hooks-up to the REST service to provide a custom REST API.
agent_service_for( Symbol, Class ) -- Hooks-up to the Agent to provide a custom RPC API.
serialize_with( Module ) -- A serializer to be used for:
- #options
- Report#data
- Runtime Snapshot
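
As a rough illustration of what a serializer for application objects might look like, the sketch below wraps Ruby's JSON library in a module. The module name, and the idea that a serializer only needs to respond to dump and load (as Ruby's JSON and Marshal modules do), are assumptions and are not confirmed by the list above.

```ruby
require 'json'

# Hypothetical serializer module; assumes a dump/load interface like the one
# Ruby's own JSON and Marshal modules expose.
module PrettyJSONSerializer
    # Turns a Ruby object into a String.
    def self.dump(object)
        JSON.pretty_generate(object)
    end

    # Turns a String produced by .dump back into a Ruby object.
    def self.load(string)
        JSON.parse(string)
    end
end

options  = { 'time' => 5, 'tags' => ['a', 'b'] }
restored = PrettyJSONSerializer.load(PrettyJSONSerializer.dump(options))

puts restored == options
```

One thing to keep in mind with a JSON round-trip is that Symbol keys come back as Strings, so `{ time: 5 }` is read back as `{ 'time' => 5 }`.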

Access is also available to:

#options -- Passed options.
#runtime -- The Application's Runtime environment, as a way to store and access state and data. Upon receiving a suspend event, the Runtime will be stored to disk as a Snapshot for later restoration.
- Runtime#state -- State accessor.
- Runtime#data -- Data accessor.
#report( data ) -- Stores given data, to be included in a later generated Report and accessed via Report#data.
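
To make the idea of keeping State and Data separate and snapshotting both to disk more concrete, here is a minimal plain-Ruby sketch. SketchRuntime, its suspend and restore methods, and the use of Marshal are all hypothetical stand-ins for illustration; they do not describe Cuboid's actual Runtime class or snapshot format.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a runtime that keeps State and Data apart.
SketchRuntime = Struct.new(:state, :data) do
    # Writes both parts to disk so the job can be resumed later.
    def suspend(path)
        File.binwrite(path, Marshal.dump(state: state, data: data))
    end

    # Rebuilds a runtime from a snapshot written by #suspend.
    def self.restore(path)
        snapshot = Marshal.load(File.binread(path))
        new(snapshot[:state], snapshot[:data])
    end
end

runtime = SketchRuntime.new(
    { status: :running, position: 2 },   # state: where the job is
    { results: [10, 20] }                # data: what it has produced so far
)

# Suspend to a temporary file, then restore from it.
restored = Dir.mktmpdir do |dir|
    path = File.join(dir, 'job.snapshot')
    runtime.suspend(path)
    SketchRuntime.restore(path)
end

puts restored.state[:position]
puts restored.data[:results].inspect
```

Because the snapshot in this sketch is an ordinary file, it could in principle be copied to another machine and restored there.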

Instance

An Instance is a process container for a Cuboid Application; Cuboid is application-centric and follows the one-process-per-application principle.

This is in order to enforce isolation (state, data, fault) between Applications, take advantage of OS task management and generally keep things simple.
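
The isolation that a process boundary gives can be seen with plain Ruby, independently of Cuboid. The sketch below is only an illustration of the principle using Kernel#fork (so it assumes a platform that supports fork); run_isolated is a hypothetical helper and not how Cuboid spawns Instances.

```ruby
# Runs the block in a child process and reports whether it finished cleanly.
def run_isolated
    pid = fork do
        begin
            yield
            exit!(0)   # leave the child without running parent exit hooks
        rescue StandardError
            exit!(1)   # a crash only ends the child
        end
    end

    _, status = Process.wait2(pid)
    status.success?
end

counter = 0

healthy = run_isolated { counter += 100 }   # the mutation stays in the child
crashed = run_isolated { raise 'boom' }     # the failure stays in the child

puts healthy   # true
puts crashed   # false
puts counter   # 0, the parent's copy was never touched
```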

Agent

An Agent is a server which waits for Instance spawn requests (spawn calls); upon receiving one, it spawns an Instance and passes its connection info to the client.

The client can then proceed to use the Instance to run and generally manage the contained Application.

Grid

An Agent Grid is a software mesh network of Agent servers, aimed towards providing automated load-balancing based on available system resources and each Application's provisioning configuration.
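
A minimal sketch of resource-aware selection might look like the following. AgentInfo and pick_agent are hypothetical names, and reading "horizontal" as spreading work to the least loaded node and "vertical" as packing work onto the most loaded node that still fits is an assumption; the actual Grid logic may differ.

```ruby
# Hypothetical view of an Agent's free resources (cores, memory, disk).
AgentInfo = Struct.new(:url, :cores, :memory, :disk) do
    # True when the agent can satisfy every provisioning requirement.
    def fits?(needs)
        cores >= needs[:cores] && memory >= needs[:memory] && disk >= needs[:disk]
    end
end

# Picks an agent for a new Instance, or nil if none has enough resources.
# :horizontal chooses the fitting agent with the most free cores,
# :vertical chooses the fitting agent with the fewest free cores.
def pick_agent(agents, needs, strategy: :horizontal)
    candidates = agents.select { |agent| agent.fits?(needs) }
    return nil if candidates.empty?

    if strategy == :vertical
        candidates.min_by(&:cores)
    else
        candidates.max_by(&:cores)
    end
end

gib = 1024**3
agents = [
    AgentInfo.new('host1:7331', 2, 4 * gib, 50 * gib),
    AgentInfo.new('host2:7332', 8, 16 * gib, 50 * gib),
    AgentInfo.new('host3:7333', 1, 1 * gib, 50 * gib)
]
needs = { cores: 2, memory: 2 * gib, disk: 1 * gib }

puts pick_agent(agents, needs).url                       # host2:7332
puts pick_agent(agents, needs, strategy: :vertical).url  # host1:7331
```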

No topology needs to be specified; the only configuration necessary is providing any existing Grid member upon start-up, and the rest will be sorted out automatically.
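
The idea of joining through a single known member can be sketched with an in-memory simulation. SketchNode and its methods are hypothetical and say nothing about the protocol the Agents actually use; the sketch only shows how knowing one member can be enough to end up knowing all of them.

```ruby
# Hypothetical in-memory node; real Agents would talk over the network.
class SketchNode
    attr_reader :url, :peers

    def initialize(url)
        @url   = url
        @peers = {}   # url => SketchNode
    end

    # Joins the Grid through any single existing member.
    def join(member)
        # Everyone the member already knows, plus the member itself.
        known = member.peers.values + [member]

        known.each do |node|
            node.add_peer(self)   # announce ourselves
            add_peer(node)        # and remember them
        end
    end

    def add_peer(node)
        @peers[node.url] = node unless node.url == url
    end
end

a = SketchNode.new('host1:7331')
b = SketchNode.new('host2:7332')
c = SketchNode.new('host3:7333')

b.join(a)   # b is only told about a
c.join(b)   # c is only told about b, yet learns about a as well

[a, b, c].each { |node| puts "#{node.url} -> #{node.peers.keys.sort.join(', ')}" }
```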

The network is self-healing and will monitor node connectivity, taking steps to ensure that neither server nor network conditions will disrupt spawning.

Scalability

Agents can be easily plugged into or unplugged from the Grid to scale up or down as necessary.

Plugging happens at boot-time and unplugging can take place via the available APIs.

Scheduler

The Scheduler is a server which:

Accepts Application options.
Stores them in a queue.
Pops options and passes them to spawned Instances.
Monitors Instance progress.
Stores the report to disk upon Application completion.
Shuts down the Instance.
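
The steps above can be sketched as a small queue-driven loop. SketchInstance is a hypothetical thread-based stand-in for a spawned Instance, jobs are processed one at a time for simplicity, and reports are kept in memory rather than written to disk; none of this reflects the real Scheduler's internals.

```ruby
# Hypothetical stand-in for a spawned Instance running one Application.
class SketchInstance
    def run(options)
        @thread = Thread.new { options['numbers'].sum }
    end

    def busy?
        @thread.alive?
    end

    def report
        @thread.value   # waits for the thread and returns its result
    end

    def shutdown
        @thread.join
    end
end

queue   = Queue.new
reports = {}

# Accept options and store them in a queue.
queue << { 'id' => 'job-1', 'numbers' => [1, 2, 3] }
queue << { 'id' => 'job-2', 'numbers' => [10, 20] }

until queue.empty?
    options  = queue.pop             # pop options...
    instance = SketchInstance.new    # ...and hand them to a fresh Instance
    instance.run(options)

    sleep 0.01 while instance.busy?  # monitor progress

    reports[options['id']] = instance.report   # keep the report
    instance.shutdown                          # shut the Instance down
end

puts reports.inspect
```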

Agent

The Scheduler can be configured with an Agent, in which case it will use it to spawn Instances.

If the Agent is a Grid member then the Scheduler will also enjoy load-balancing features.

APIs

Local

Local access is available via the Cuboid::Application API and the API defined by the Application itself.

RPC

A simple RPC protocol is employed; specs for 3rd party implementations can be found at:

https://github.com/toq/arachni-rpc/wiki

Each Application can extend upon this and expose an API via its Instance's RPC interface.

REST

A REST API is also available, taking advantage of HTTP sessions to make progress tracking easier.

The REST interface is basically a web Agent and centralised point of management for the rest of the entities.

Each Application can extend upon this and expose an API via its REST service's interface.

Examples

MyApp

Tutorial application going over different APIs and Cuboid Application options and specification.

See examples/my_app.

Parallel code on same host

To run code in parallel on the same machine utilising multiple cores, with each instance isolated to its own process, you can use something like the following:

sleeper.rb:

require 'cuboid'

class Sleeper < Cuboid::Application

    def run
        sleep options['time']
    end

end

same_host.rb:

require_relative 'sleeper'

# Spawn three Instances, each in its own process.
sleepers = []
sleepers << Sleeper.spawn( :instance, daemonize: true )
sleepers << Sleeper.spawn( :instance, daemonize: true )
sleepers << Sleeper.spawn( :instance, daemonize: true )

# Start a 5 second run on each Instance.
sleepers.each do |sleeper|
    sleeper.run( time: 5 )
end

# Poll until every Instance has finished.
sleep 0.1 while sleepers.map(&:busy?).include?( true )

time bundle exec ruby same_host.rb
[...]
real    0m6,506s
user    0m0,423s
sys     0m0,063s
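
The polling loop at the end of same_host.rb waits for as long as any Instance stays busy. If an upper bound is wanted, a small helper along these lines could be used instead. This is only a sketch: wait_until_idle is a hypothetical helper, and FakeSleeper is a stand-in for any object that responds to busy?, used here so the snippet runs without Cuboid.

```ruby
# Polls until no worker reports busy, or gives up after `timeout` seconds.
# Returns true when everything finished in time, false otherwise.
def wait_until_idle(workers, timeout:, interval: 0.1)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

    while workers.any?(&:busy?)
        return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
        sleep interval
    end

    true
end

# Hypothetical stand-in for a client that responds to #busy?.
class FakeSleeper
    def initialize(seconds)
        @done_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
    end

    def busy?
        Process.clock_gettime(Process::CLOCK_MONOTONIC) < @done_at
    end
end

finished  = wait_until_idle([FakeSleeper.new(0.2), FakeSleeper.new(0.3)], timeout: 2)
timed_out = wait_until_idle([FakeSleeper.new(5)], timeout: 0.3)

puts finished    # true
puts timed_out   # false
```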

Parallel code on different hosts

In this example we'll be using Agents to spawn instances from 3 different hosts.

Host 1

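multiple_hosts_1.rb:

require_relative 'sleeper'

# Start the first Agent; the other hosts will use it as their Grid entry point.
Sleeper.spawn( :agent, port: 7331 )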

bundle exec ruby multiple_hosts_1.rb

Host 2

multiple_hosts_2.rb:

require_relative 'sleeper'

# Join the Grid by pointing at an existing member.
Sleeper.spawn( :agent, port: 7332, peer: 'host1:7331' )

bundle exec ruby multiple_hosts_2.rb

Host 3

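multiple_hosts_3.rb:

require_relative 'sleeper'

# Join the Grid and keep a handle on the local Agent.
grid_agent = Sleeper.spawn( :agent, port: 7333, peer: 'host1:7331', daemonize: true )

# Ask the Agent for three Instances and connect to each one.
sleepers = []
3.times do
    connection_info = grid_agent.spawn
    sleepers << Sleeper.connect( connection_info )
end

# Start a 5 second run on each Instance.
sleepers.each do |sleeper|
    sleeper.run( time: 5 )
end

# Poll until every Instance has finished.
sleep 0.1 while sleepers.map(&:busy?).include?( true )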

time bundle exec ruby multiple_hosts_3.rb
real    0m7,318s
user    0m0,426s
sys     0m0,091s

You can replace host1 with localhost and run all examples on the same machine.

Users

QMap -- A distributed network mapper/security scanner powered by nmap.
Peplum -- A distributed parallel processing solution -- allows you to build Beowulf (or otherwise) clusters and even super-computers.

License

Please see the LICENSE.md file.
