Fragmenter
Fragmenter is a library for multipart upload support backed by Redis. Fragmenter handles storing multiple parts of a larger binary and rebuilding it back into the original after all parts have been stored.
Why Fragment?
It alleviates the problems posed by uploading large blocks of data from slow clients, notably mobile apps, by allowing the device to send multiple smaller blocks of data independently. Once all of the smaller blocks have been received they can quickly be rebuilt into the original file on the server.
Think of multipart uploading as blocky streaming. Nginx, Rack and Rails all make it impossible to stream binary uploads. Breaking them into manageable pieces is the simplest workaround.
Fragments are intended to be rather small, anywhere from 10-50 KB depending on the underlying data size. The right size balances connection overhead from repeated server calls, tolerance of connection errors, and not blocking the server from handling other connections.
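To make the size trade-off concrete, here is a minimal client-side sketch in plain Ruby of cutting a binary string into fixed-size fragments and joining them again. The helper names split_into_fragments and rebuild_from_fragments are hypothetical and are not part of Fragmenter; the 32 KB default is just one value within the range suggested above.

```ruby
# Hypothetical helpers for illustration only; not part of the Fragmenter API.

# Cut a binary string into consecutive fragments of at most fragment_size bytes.
def split_into_fragments(data, fragment_size = 32 * 1024)
  raise ArgumentError, 'fragment_size must be positive' unless fragment_size > 0

  fragments = []
  offset = 0
  while offset < data.bytesize
    fragments << data.byteslice(offset, fragment_size)
    offset += fragment_size
  end
  fragments
end

# Join the fragments, in order, back into a single binary string.
def rebuild_from_fragments(fragments)
  fragments.join.force_encoding(Encoding::BINARY)
end
```

Smaller fragments mean more requests per upload but less data to resend when a single request fails.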
Heroku
Due to the way the service queues requests we do not recommend using Fragmenter for apps on Heroku. Fragmenter is designed to make uploads from slow mobile clients easier and fault tolerant. Heroku kills all requests after 30 seconds from the point where the request started, not from the point where it was handed to your application. Fragmenting uploads uses a fair number of additional requests and can cause rampant timeout errors (H12 on Heroku).
That isn't to say that Fragmenter doesn't work on Heroku; it is just sub-optimal, particularly when compared to a properly configured Nginx proxy.
Requirements
Fragmenter is tested on Ruby 1.9.3, 2.0, and 2.1. However, any Ruby implementation with 1.9 syntax will be supported.
Redis 2.0 or greater is required and version 2.6 is recommended.
Installation
Add this to your Gemfile:
gem 'fragmenter'
Configuration
You can configure the following components of Fragmenter:
- redis - Redis instance to use for IO. Defaults to a new instance connected to localhost.
- logger - Logger instance to write out to. Defaults to STDOUT at the INFO level.
- expiration - The number of seconds until fragments will expire. Defaults to 86400, or 1 day.
Fragmenter.configure do |config|
  config.redis      = $redis
  config.logger     = Rails.logger
  config.expiration = 2.days.to_i
end
Using Fragmenter with Rails
Fragmenter is designed to be used from within a Rails controller. Include the
provided Fragmenter::Rails::Controller module into any controller you wish to
have process uploads:
class UploadsController < ApplicationController
  include Fragmenter::Rails::Controller

  private

  # Look up the parent record that the fragments belong to.
  def resource
    @resource ||= Avatar.find(params[:avatar_id])
  end
end
The module adds methods for handling the GET, PUT, and DELETE requests needed
for handling fragment uploads. You must define a resource method that returns
an object implementing fragmenter. In the example above the resource is an
instance of the Avatar model, which could look something like this:
class Avatar < ActiveRecord::Base
  include Fragmenter::Rails::Model

  def rebuild_fragments
    self.avatar = Fragmenter::DummyIO.new(fragmenter.rebuild).tap do |io|
      io.content_type = fragmenter.meta['content_type']
    end

    save!
  end
end
You must provide a concrete rebuild_fragments method that will perform
rebuilding, saving, persisting etc. Without overriding rebuild_fragments a
Fragmenter::AbstractMethodError will be raised when storage is complete and
it attempts to rebuild.
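The abstract-method behaviour described above can be illustrated with a small stand-alone sketch. Everything here lives in a hypothetical Sketch namespace and is not Fragmenter's actual source; it only shows the general pattern of a mixin whose default method raises until the including class overrides it.

```ruby
# Illustration of the pattern only; Sketch::Model is a stand-in and is
# NOT the implementation of Fragmenter::Rails::Model.
module Sketch
  class AbstractMethodError < StandardError; end

  module Model
    # Default implementation: fail loudly until the including class overrides it.
    def rebuild_fragments
      raise AbstractMethodError, "#{self.class} must implement rebuild_fragments"
    end
  end
end

class WithoutOverride
  include Sketch::Model
end

class WithOverride
  include Sketch::Model

  # A method defined in the class takes precedence over the mixin's default.
  def rebuild_fragments
    :rebuilt
  end
end
```

Calling rebuild_fragments on WithoutOverride raises the error, while WithOverride returns its own result.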
The example above demonstrates synchronous storage using a mounted CarrierWave-style uploader. You may want to perform rebuilding with a background worker instead to keep response times speedy.
Configure your routes to map show, update and destroy to the uploads
controller:
MyApp::Application.routes.draw do
  resource :avatar do
    resource :upload, only: [:show, :update, :destroy]
  end
end
Then you can start sending PUT requests with successive fragments of data.
Each fragment will be stored uniquely to the parent object, an instance of
Avatar in this case. For each fragment that is stored the response will be the
JSON representation of the fragments along with a 200 OK status code:
curl -i \
  -X PUT \
  -H 'X-Fragment-Number: 1' \
  -H 'X-Fragment-Total: 2' \
  --data-binary @blob-1 \
  http://example.com/avatar/1/upload
#=> HTTP/1.1 200 OK
#=> { "content_type": "image/jpeg", "fragments": [1], "total": 2 }
When the final part is uploaded the status code will be 202 Accepted if the
fragment is valid and can be rebuilt:
curl -i \
  -X PUT \
  -H 'X-Fragment-Number: 2' \
  -H 'X-Fragment-Total: 2' \
  --data-binary @blob-2 \
  http://example.com/avatar/1/upload
#=> HTTP/1.1 202 Accepted
#=> { "content_type": "image/jpeg", "fragments": [1,2], "total": 2 }
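The same sequence of requests could be prepared from a Ruby client. The sketch below uses only the standard library's Net::HTTP and builds one PUT request per fragment with the X-Fragment-Number and X-Fragment-Total headers shown in the curl examples. The helper name build_fragment_requests is hypothetical, and the requests are only constructed here, not sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical client-side helper; not part of Fragmenter.
# Builds (but does not send) one PUT request per fragment, numbering
# fragments from 1 as in the curl examples.
def build_fragment_requests(url, fragments)
  uri = URI.parse(url)

  fragments.each_with_index.map do |fragment, index|
    request = Net::HTTP::Put.new(uri.request_uri)
    request['X-Fragment-Number'] = (index + 1).to_s
    request['X-Fragment-Total']  = fragments.length.to_s
    request.body = fragment
    request
  end
end

# Sending is left to the caller, for example:
#   uri = URI.parse(url)
#   Net::HTTP.start(uri.host, uri.port) do |http|
#     requests.each { |request| http.request(request) }
#   end
```

Because each fragment is its own request, a client can retry a single failed PUT without resending the parts that already succeeded.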
If you need to customize the status codes for partial or complete PUT
requests you can override the update_status method within the controller:
private

# Return 201 Created instead of 202 Accepted
def update_status
  uploader.complete? ? 201 : 200
end
Validation
Often you will want to be sure that all of the data is being stored without any
bytes missing. A standard way to handle this is by sending a checksum that is
verified after transfer. Fragmenter handles checksum matching using a validator
that verifies each fragment that is uploaded. Validation is handled for any
request where the Content-MD5 header has been sent:
curl -X PUT \
  -H 'Content-MD5: ceba1b1ffc89e99abb54c1f8ab0c4157' \
  -H 'X-Fragment-Number: 1' \
  -H 'X-Fragment-Total: 1' \
  --data-binary @blob \
  http://example.com/avatar/1/upload
Failure to match the checksum will result in a 422 Unprocessable Entity
response with an accompanying message and errors:
{ "message": "Upload of part failed.",
"errors": [
"Expected checksum {{expected}} to match {{calculated}}"
]
}
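For reference, a client can compute a checksum in the hex form used in the example header with Ruby's standard Digest::MD5. The sketch below also mirrors the comparison and error message format shown above; checksum_errors is a hypothetical helper for illustration and is not Fragmenter's own validator.

```ruby
require 'digest/md5'

# Hypothetical helper; not Fragmenter's validator implementation.
# Returns an empty array when the hex MD5 of body matches expected,
# otherwise a single error message in the format shown above.
def checksum_errors(body, expected)
  calculated = Digest::MD5.hexdigest(body)
  return [] if calculated == expected

  ["Expected checksum #{expected} to match #{calculated}"]
end

# Client side: the value to send in the Content-MD5 header for a fragment.
def content_md5_for(fragment)
  Digest::MD5.hexdigest(fragment)
end
```

The checksum should be computed over exactly the bytes sent in the request body, so a fragmented upload needs one digest per fragment rather than one for the whole file.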
As image uploads are a common use-case for fragmented uploading an
ImageValidator is included, but not as one of the defaults. You can control
which validators are used by overriding the validators method within the
controller:
class AvatarUploader < ApplicationController
  ...

  private

  def validators
    super + [ImageValidator, CustomValidator]
  end
end
To add a custom validator you must add it at some point in the validator chain.
A validator can be any class that responds to valid?, part?, and provides a
list of errors. See the ImageValidator for an example validator that only
performs validation when all fragments are complete.
Note that ImageMagick is required for the ImageValidator to work, but it
doesn't require RMagick or MiniMagick.
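As a rough sketch of the validator interface described above (valid?, part?, and a list of errors), a custom validator might look like the following. The class name MaxSizeValidator, its size limit, and especially its constructor are assumptions made for illustration: this document does not show how Fragmenter instantiates validators or what it passes to them, so check the bundled validators before relying on this shape. Here part? returning true is read as "run against each individual fragment".

```ruby
# Hypothetical validator for illustration only. The constructor argument
# (the raw fragment body) is an assumption, not Fragmenter's documented API.
class MaxSizeValidator
  MAX_BYTES = 50 * 1024

  attr_reader :errors

  def initialize(body)
    @body   = body
    @errors = []
  end

  # Assumed meaning: true when the validator applies to individual parts
  # rather than only to the completed upload.
  def part?
    true
  end

  # Rebuilds the error list on every call and reports whether it is empty.
  def valid?
    @errors = []
    if @body.bytesize > MAX_BYTES
      @errors << "Expected fragment to be at most #{MAX_BYTES} bytes, got #{@body.bytesize}"
    end
    @errors.empty?
  end
end
```

A validator that should only run once every fragment has arrived would return false from part? instead.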