Cuboid Framework -- a decentralized & distributed computing framework in Ruby.
Summary
The Cuboid Framework offers the possibility of easily creating decentralized and distributed applications in Ruby.
In hipper terms, you can very easily set up your own specialized Cloud or Cloud within a Cloud.
In older-fashioned terms you can build load-balanced, on-demand, clustered applications and even super-computers -- see Peplum.
It offers:
- Load-balancing of Instances via a network (Grid) of Agents.
- No need to set up a topology manually; Agents will reach convergence on their own, just point them to an existing Grid member.
- Scaling up and down can be easily achieved by plugging or unplugging nodes.
- Horizontal (default) and vertical workload distribution strategies available.
- Fault tolerant -- one application per process (Instance).
- Self-healing -- keeps an eye out for disappearing and also re-appearing members.
- A clean and simple framework for application development.
- Basically, Ruby -- and all the libraries and extensions that come with it.
- Events (pause, resume, abort, suspend, restore).
- Suspend to disk is a cinch: a snapshot of the application's runtime environment is automatically created and stored to disk for later restoration of state and data.
- Also allows for running job transfers.
- Management of Instances via RPC or REST APIs.
- Aside from what Cuboid uses, custom serializers can be specified for application related objects.
- Developer freedom.
- Apart from keeping Data and State separate, not many other rules to follow.
- Only if interested in suspensions, and can also be left to the last minute if necessary -- in cases of Ractor enforced isolation for example.
Entities
Application
A Ruby class which inherits from Cuboid::Application and complies with a few simple specifications; at the least, a #run method, serving as the execution entry point.
The application can use the following methods to better take advantage of the framework:
- #validate_options( options ) -- Validates Application options.
- #provision_cores( Fixnum ) -- Specifies the maximum amount of cores the Application will be using.
- #provision_memory( Fixnum ) -- Specifies the maximum amount of RAM the Application will be using.
- #provision_disk( Fixnum ) -- Specifies the maximum amount of disk space the Application will be using.
- #handler_for( Symbol, Symbol ) -- Specifies methods to handle the following events:
  - :pause
  - :resume
  - :abort
  - :suspend
  - :restore
- instance_service_for( Symbol, Class ) -- Adds a custom Instance RPC API.
- rest_service_for( Symbol, Module ) -- Hooks-up to the REST service to provide a custom REST API.
- agent_service_for( Symbol, Class ) -- Hooks-up to the Agent to provide a custom RPC API.
- serialize_with( Module ) -- A serializer to be used for:
  - #options
  - Report#data
  - Runtime Snapshot
Access is also available to:
- #options -- Passed options.
- #runtime -- The Application's Runtime environment, as a way to store and access state and data. Upon receiving a suspend event, the Runtime will be stored to disk as a Snapshot for later restoration.
  - Runtime#state -- State accessor.
  - Runtime#data -- Data accessor.
- #report( data ) -- Stores given data, to be included in a later generated Report and accessed via Report#data.
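The idea of keeping state and data in one runtime object that can be written to disk and restored later can be sketched in plain Ruby. This is a toy model only: ToyRuntime, suspend_to and restore_from are hypothetical names, Marshal is an assumed serializer, and none of it is taken from Cuboid's own Runtime or Snapshot code.

```ruby
require 'tmpdir'

# Hypothetical stand-in for an application runtime; NOT Cuboid's Runtime class.
class ToyRuntime
  attr_reader :state, :data

  def initialize
    @state = { status: :ready, progress: 0 } # where the job currently is
    @data  = { results: [] }                 # what the job has produced so far
  end

  # Serialize state and data into a single snapshot file.
  def suspend_to(path)
    File.binwrite(path, Marshal.dump(state: @state, data: @data))
    path
  end

  # Rebuild a runtime from a snapshot file written by #suspend_to.
  def self.restore_from(path)
    snapshot = Marshal.load(File.binread(path))
    runtime  = new
    runtime.state.replace(snapshot[:state])
    runtime.data.replace(snapshot[:data])
    runtime
  end
end

runtime = ToyRuntime.new
runtime.state[:progress] = 2
runtime.data[:results].push('a', 'b')

snapshot_path = File.join(Dir.mktmpdir, 'toy.snapshot')
runtime.suspend_to(snapshot_path)

# A different process could pick the job up from here.
restored = ToyRuntime.restore_from(snapshot_path)
```

Because everything the job needs lives in the two hashes, the snapshot file is all that has to move for the job to continue elsewhere.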
Instance
An Instance is a process container for a Cuboid Application; Cuboid is application-centric and follows the one-process-per-application principle.
This is in order to enforce isolation (state, data, fault) between Applications, take advantage of OS task management and generally keep things simple.
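The fault isolation that comes from one process per application can be illustrated with the Ruby standard library alone. In this sketch, run_isolated is a hypothetical helper and does not use or reflect Cuboid's Instance API; it only shows that a crash in a child process leaves the parent unaffected.

```ruby
require 'rbconfig'

# Run a snippet of Ruby in its own OS process and return its exit status.
# Hypothetical helper for illustration; not part of Cuboid.
def run_isolated(code)
  pid = Process.spawn(RbConfig.ruby, '-e', code, err: File::NULL)
  Process.wait2(pid).last
end

crashed  = run_isolated('raise "boom"') # uncaught exception in the child
finished = run_isolated('exit 0')       # clean exit in another child

# The parent is still running and can inspect both outcomes.
outcomes = { crashed: crashed.success?, finished: finished.success? }
```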
Agent
An Agent is a server which awaits Instance spawn requests (spawn calls), upon which it spawns an Instance and passes the Instance's connection info to the client.
The client can then proceed to use the Instance to run and generally manage the contained Application.
Grid
An Agent Grid is a software mesh network of Agent servers, aimed towards providing automated load-balancing based on available system resources and each Application's provisioning configuration.
No topology needs to be specified; the only configuration necessary is providing any existing Grid member upon start-up and the rest will be sorted out automatically.
The network is self-healing and will monitor node connectivity, taking steps to ensure that neither server nor network conditions will disrupt spawning.
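As a rough illustration of load-balancing on available resources and provisioning, the sketch below picks a Grid member for a spawn request. ToyAgent and pick_agent are hypothetical, and the reading of the two strategies (horizontal spreads work across nodes, vertical fills a node before moving on) is an assumption rather than something taken from Cuboid's code.

```ruby
# Hypothetical model of a Grid member; NOT Cuboid's Agent class.
ToyAgent = Struct.new(:url, :free_cores, :free_memory, keyword_init: true)

# Picks an agent able to satisfy the provisioning request, or nil if none can.
# :horizontal spreads work out by preferring the least loaded agent,
# :vertical packs work together by preferring the most loaded one that fits.
def pick_agent(agents, cores:, memory:, strategy: :horizontal)
  candidates = agents.select do |agent|
    agent.free_cores >= cores && agent.free_memory >= memory
  end

  case strategy
  when :horizontal then candidates.max_by(&:free_cores)
  when :vertical   then candidates.min_by(&:free_cores)
  else raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

grid = [
  ToyAgent.new(url: 'host1:7331', free_cores: 2, free_memory: 4096),
  ToyAgent.new(url: 'host2:7332', free_cores: 8, free_memory: 16_384),
  ToyAgent.new(url: 'host3:7333', free_cores: 1, free_memory: 512)
]

horizontal = pick_agent(grid, cores: 2, memory: 1024)
vertical   = pick_agent(grid, cores: 2, memory: 1024, strategy: :vertical)
too_big    = pick_agent(grid, cores: 64, memory: 1024)
```

Here host3 is filtered out because it cannot provide two cores, so the choice is between host1 and host2 depending on the strategy.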
Scalability
Agents can be easily plugged into or unplugged from the Grid to scale up or down as necessary.
Plugging happens at boot-time and unplugging can take place via the available APIs.
Scheduler
The Scheduler is a server which:
- Accepts Application options.
- Stores them in a queue.
- Pops options and passes them to spawned Instances.
- Monitors Instance progress.
- Upon Application completion, stores the report to disk.
- Shuts down the Instance.
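The steps above can be modelled in a few lines of plain Ruby. This is an in-process toy, not Cuboid's Scheduler: ToyScheduler, its push and run methods, the JSON report format and the use of a thread in place of a spawned Instance are all assumptions made for illustration.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical, in-process model of the scheduling loop; NOT Cuboid's Scheduler.
class ToyScheduler
  def initialize(report_dir, &application)
    @report_dir  = report_dir
    @application = application # stands in for the Application in an Instance
    @queue       = Queue.new
    @next_id     = 0
  end

  # Accepts options and stores them in the queue, returning a job ID.
  def push(options)
    id = (@next_id += 1)
    @queue << [id, options]
    id
  end

  # Pops queued options one by one, runs the application with them,
  # waits for completion and stores each report to disk.
  # Returns the paths of the written reports.
  def run
    paths = []
    until @queue.empty?
      id, options = @queue.pop
      worker = Thread.new { @application.call(options) } # "spawn"
      report = worker.value                              # wait until done
      path   = File.join(@report_dir, "#{id}.json")
      File.write(path, JSON.generate(report))
      paths << path
    end
    paths
  end
end

scheduler = ToyScheduler.new(Dir.mktmpdir) do |options|
  { 'slept' => options['time'] }
end

scheduler.push('time' => 1)
scheduler.push('time' => 2)

report_paths = scheduler.run
reports      = report_paths.map { |path| JSON.parse(File.read(path)) }
```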
Agent
The Scheduler can be configured with an Agent, in which case it will use it to spawn Instances.
If the Agent is a Grid member then the Scheduler will also enjoy load-balancing features.
APIs
Local
Local access is available via the Cuboid::Application API and the API defined by the Application itself.
RPC
A simple RPC protocol is employed; specs for 3rd party implementations can be found at:
https://github.com/toq/arachni-rpc/wiki
Each Application can extend upon this and expose an API via its Instance's RPC interface.
REST
A REST API is also available, taking advantage of HTTP sessions to make progress tracking easier.
The REST interface is basically a web Agent and centralised point of management for the rest of the entities.
Each Application can extend upon this and expose an API via its REST service's interface.
Examples
MyApp
Tutorial application going over different APIs and Cuboid Application options and specification.
See examples/my_app.
Parallel code on same host
To run code in parallel on the same machine utilising multiple cores, with each instance isolated to its own process, you can use something like the following:
sleeper.rb:

    require 'cuboid'

    class Sleeper < Cuboid::Application
        def run
            sleep options['time']
        end
    end

same_host.rb:

    require_relative 'sleeper'

    sleepers = []
    sleepers << Sleeper.spawn( :instance, daemonize: true )
    sleepers << Sleeper.spawn( :instance, daemonize: true )
    sleepers << Sleeper.spawn( :instance, daemonize: true )

    sleepers.each do |sleeper|
        sleeper.run( time: 5 )
    end

    sleep 0.1 while sleepers.map(&:busy?).include?( true )

Running it:

    time bundle exec ruby same_host.rb
    [...]
    real 0m6,506s
    user 0m0,423s
    sys 0m0,063s
Parallel code on different hosts
In this example we'll be using Agents to spawn instances from 3 different hosts.
Host 1

multiple_hosts_1.rb:

    require_relative 'sleeper'

    Sleeper.spawn( :agent, port: 7331 )

Running it:

    bundle exec ruby multiple_hosts_1.rb

Host 2

multiple_hosts_2.rb:

    require_relative 'sleeper'

    Sleeper.spawn( :agent, port: 7332, peer: 'host1:7331' )

Running it:

    bundle exec ruby multiple_hosts_2.rb

Host 3

multiple_hosts_3.rb:

    require_relative 'sleeper'

    grid_agent = Sleeper.spawn( :agent, port: 7333, peer: 'host1:7331', daemonize: true )

    sleepers = []
    3.times do
        connection_info = grid_agent.spawn
        sleepers << Sleeper.connect( connection_info )
    end

    sleepers.each do |sleeper|
        sleeper.run( time: 5 )
    end

    sleep 0.1 while sleepers.map(&:busy?).include?( true )

Running it:

    time bundle exec ruby multiple_hosts_3.rb
    real 0m7,318s
    user 0m0,426s
    sys 0m0,091s
You can replace host1 with localhost and run all examples on the same machine.
Users
- QMap -- A distributed network mapper/security scanner powered by nmap.
- Peplum -- A distributed parallel processing solution -- allows you to build Beowulf (or otherwise) clusters and even super-computers.
License
Please see the LICENSE.md file.