Project

longleaf

0.0
No release in over 3 years
Longleaf is a command-line tool which allows users to configure a set of storage locations and define custom sets of preservation services to run on their contents. These services are executed in response to applicable preservation events issued by clients. Its primary goal is to provide tools to create a simple and customizable preservation environment.
 Project Readme

Longleaf

Longleaf is a command-line tool which allows users to configure a set of storage locations and define custom sets of preservation services to run on their contents. These services are executed in response to applicable preservation events issued by clients. Its primary goal is to provide tools to create a simple and customizable preservation environment. Longleaf:

  • Offers a predictable command-line interface and integrates with standard command-line tools.
  • Offers configurable and customizable criteria-based preservation workflows.
  • Provides a base set of tools and a framework for building extensions.
  • Provides activity logging and notifications.
  • Performs preservation services only when required.

Installation

There are two primary ways to install Longleaf, depending on how you intend to use it:

Standalone gem

To use Longleaf as a command-line application, the gem can be installed using:

$ gem install longleaf

Or it may be built from source:

$ git clone git@github.com:UNC-Libraries/longleaf-preservation.git
$ cd longleaf-preservation
$ bin/setup
$ bundle exec rake install # builds the gem
$ gem install --local pkg/longleaf* # installs gem

Application dependency

To use Longleaf as a dependency of your application, add this line to your application's Gemfile:

gem 'longleaf'

And then execute:

$ bundle

Usage

Register a file

In order to register a new file with Longleaf, use the register command:

longleaf register -c <config.yml> -f <path to file>

If a file's content has been replaced, the file can be re-registered by providing the --force flag.

Validate configuration files

Application configuration files can be validated prior to usage with the following command:

longleaf validate_config -c <config.yml>

Output and logging

The primary output from Longleaf is written to STDOUT and contains both success and failure messages. To output only failure messages, provide the --failure_only flag.

Additional logging is sent to STDERR. To control the level of logging, you may provide the --log-level parameter, which expects the standard Ruby Logger levels. The default log level is 'WARN'.

Messages sent to STDOUT are duplicated to STDERR at 'INFO' level, so they are excluded at the default log level. To keep an ongoing log of activity and errors, raise the log level to 'INFO' and redirect STDERR to a file:

longleaf <command> --log-level 'INFO' 2> /logs/longleaf.log
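The effect of the log level can be illustrated with Ruby's standard Logger on its own. This is a minimal sketch, not Longleaf's logging code: the StringIO objects stand in for STDERR and the message texts are made up for illustration.

```ruby
require 'logger'
require 'stringio'

# StringIO stands in for STDERR so the captured output can be inspected.
default_stream = StringIO.new

# WARN matches the default level described above.
default_logger = Logger.new(default_stream, level: Logger::WARN)
default_logger.info('registered /storage/loc1/image.tif') # below WARN, dropped
default_logger.warn('checksum mismatch')                  # at WARN, written
default_output = default_stream.string

# Lowering the threshold to INFO lets both messages through.
info_stream = StringIO.new
info_logger = Logger.new(info_stream, level: Logger::INFO)
info_logger.info('registered /storage/loc1/image.tif')
info_logger.warn('checksum mismatch')
info_output = info_stream.string

puts default_output
puts info_output
```

With the threshold at WARN only the warning is captured, which is why the duplicated STDOUT messages only reach the log file once the level is set to 'INFO'.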

Web Server

Longleaf can also run as an HTTP API server using Puma and Roda, exposing the same preservation operations that are available on the command line.

Configuration

Each server instance is bound to a single application configuration file, specified via the LONGLEAF_CFG environment variable — mirroring the -c flag used by the CLI. If you have multiple configuration files, run one server instance per config.

Starting the server

LONGLEAF_CFG=/path/to/config.yml bundle exec puma -C config/puma.rb

The following environment variables control server behaviour (all optional):

| Variable | Default | Description |
| --- | --- | --- |
| LONGLEAF_CFG | (none) | Path to the Longleaf application configuration file |
| LONGLEAF_API_KEYS | (none) | Comma-separated list of accepted API keys (see Authentication) |
| PORT | 3000 | Port to listen on |
| RACK_ENV | development | Rack environment (development, production) |
| PUMA_THREADS | 5 | Min and max threads per worker |
| WEB_CONCURRENCY | 1 | Number of Puma worker processes, only for CRuby |
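The defaults listed above can be summarized as a small lookup. The helper below is hypothetical and is not the project's config/puma.rb. It only shows how the documented variables and defaults might resolve, assuming the numeric values are parsed as integers and the key list is split on commas.

```ruby
# Hypothetical helper (not part of Longleaf): resolves the documented
# environment variables against the defaults listed in the table.
def server_settings(env = ENV)
  {
    config_path: env['LONGLEAF_CFG'], # no default
    api_keys: env.fetch('LONGLEAF_API_KEYS', '').split(',').map(&:strip).reject(&:empty?),
    port: Integer(env.fetch('PORT', '3000')),
    rack_env: env.fetch('RACK_ENV', 'development'),
    threads: Integer(env.fetch('PUMA_THREADS', '5')),
    workers: Integer(env.fetch('WEB_CONCURRENCY', '1'))
  }
end

# A plain Hash is used in place of ENV so the example has no side effects.
settings = server_settings({ 'LONGLEAF_CFG' => '/path/to/config.yml', 'PORT' => '3001' })
puts settings
```

Only LONGLEAF_CFG and PORT are supplied in the example, so every other value falls back to its documented default.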

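For production use, set RACK_ENV=production and set WEB_CONCURRENCY to the number of CPU cores available. The following example sets the environment, port and thread count; WEB_CONCURRENCY can be added in the same way: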

LONGLEAF_CFG=/path/to/config.yml \
  RACK_ENV=production \
  PORT=3000 \
  PUMA_THREADS=16 \
  bundle exec puma -C config/puma.rb

To run a second instance against a different config on a different port:

LONGLEAF_CFG=/path/to/other_config.yml PORT=3001 bundle exec puma -C config/puma.rb

Authentication

API key authentication is optional but recommended for any network-accessible deployment. When LONGLEAF_API_KEYS is set, every request to /api/* must include a matching key in the X-Api-Key header. Requests with a missing or unrecognised key receive a 401 Unauthorized response. If no keys are configured, all requests are allowed through.

Set one or more accepted keys (comma-separated) at server startup:

LONGLEAF_CFG=/path/to/config.yml \
  LONGLEAF_API_KEYS=key-one,key-two \
  bundle exec puma -C config/puma.rb

Clients supply the key as a request header:

curl -X POST http://localhost:3000/api/register \
  -H 'Content-Type: application/json' \
  -H 'X-Api-Key: key-one' \
  -d '{"file": "/storage/loc1/image.tif"}'
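The key check described above can be sketched in a few lines. The helpers below are hypothetical and are not Longleaf's implementation. They only model the documented behaviour: keys come from a comma-separated string, and an empty key list allows every request.

```ruby
# Hypothetical helpers (not Longleaf's code) modelling the documented rules.

# Split a LONGLEAF_API_KEYS-style value into a list of accepted keys.
def parse_api_keys(raw)
  raw.to_s.split(',').map(&:strip).reject(&:empty?)
end

# True when the request should be let through, false when it should get a 401.
def authorized?(accepted_keys, provided_key)
  return true if accepted_keys.empty? # no keys configured: allow everything

  !provided_key.nil? && accepted_keys.include?(provided_key)
end

keys = parse_api_keys('key-one,key-two')
puts authorized?(keys, 'key-one') # matching key
puts authorized?(keys, 'other')   # unrecognised key
puts authorized?(keys, nil)       # missing header
puts authorized?([], nil)         # authentication disabled
```

A real server would probably compare keys in constant time to avoid leaking information through timing; the plain include? here is only for readability.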

API endpoints

All endpoints accept and return JSON. A 200 OK response indicates success. Non-2xx responses include an error key in the JSON body.


POST /api/register — Register one or more files.

| Parameter | Type | Description |
| --- | --- | --- |
| file | string | Comma-separated logical file paths to register. Mutually exclusive with manifest and from_list. |
| manifest | array of strings | Checksum manifest values (same format as the CLI -m option). Mutually exclusive with file and from_list. |
| from_list | string | Path to a newline-separated file list on the server filesystem. Mutually exclusive with file and manifest. |
| physical_path | string | Comma-separated physical paths, paired with file, for files where the logical and physical paths differ. |
| checksums | string | Comma-separated algorithm:digest pairs to associate with the file, e.g. "md5:abc123,sha1:def456". Only applicable with file. |
| force | boolean | Re-register already-registered files. |
| ocfl | boolean | Treat targets as OCFL object directories. |
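A client may want to check the parameter constraints before sending a request. The following is a minimal sketch under that assumption. build_register_payload and REGISTER_SOURCES are hypothetical names, and the server's own validation and error messages may differ.

```ruby
require 'json'

# The three parameters documented as mutually exclusive.
REGISTER_SOURCES = %w[file manifest from_list].freeze

# Hypothetical client-side helper: validates the documented constraints and
# returns the JSON body for POST /api/register.
def build_register_payload(params)
  params = params.transform_keys(&:to_s)
  given = REGISTER_SOURCES.select { |name| params.key?(name) }

  if given.size > 1
    raise ArgumentError, "#{given.join(', ')} are mutually exclusive"
  end
  if params.key?('checksums') && !params.key?('file')
    raise ArgumentError, 'checksums is only applicable with file'
  end

  JSON.generate(params)
end

payload = build_register_payload({
  file: '/storage/loc1/image.tif',
  checksums: 'md5:abc123,sha1:def456',
  force: true
})
puts payload
```

Passing both file and manifest, or checksums without file, raises an ArgumentError before any request is made.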

Example:

curl -X POST http://localhost:3000/api/register \
  -H 'Content-Type: application/json' \
  -H 'X-Api-Key: <your-api-key>' \
  -d '{"file": "/storage/loc1/image.tif"}'
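The same request can be built from Ruby with the standard library. This is a sketch rather than an official client: register_request is a hypothetical helper, and the host, port and key are the placeholder values used in the examples above. The request is only constructed here; sending it needs a running server, so that part is left as comments.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a POST /api/register request.
def register_request(base_url, api_key, file_path)
  uri = URI.join(base_url, '/api/register')
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['X-Api-Key'] = api_key
  request.body = JSON.generate(file: file_path)
  request
end

request = register_request('http://localhost:3000', 'key-one', '/storage/loc1/image.tif')
puts "#{request.method} #{request.path}"
puts request.body

# Sending requires a running server:
# uri = request.uri
# response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
# result = JSON.parse(response.body)
# warn result['error'] unless response.is_a?(Net::HTTPSuccess)
```

The commented lines follow the documented convention that non-2xx responses carry an error key in the JSON body.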

DELETE /api/deregister — Deregister one or more files.

| Parameter | Type | Description |
| --- | --- | --- |
| file | string | Comma-separated logical file paths to deregister. Mutually exclusive with location and from_list. |
| location | string | Comma-separated storage location names; deregisters all registered files within those locations. Mutually exclusive with file and from_list. |
| from_list | string | Path to a newline-separated file list on the server filesystem. Mutually exclusive with file and location. |
| force | boolean | Deregister files that are already deregistered. |

Example:

curl -X DELETE http://localhost:3000/api/deregister \
  -H 'Content-Type: application/json' \
  -H 'X-Api-Key: <your-api-key>' \
  -d '{"file": "/storage/loc1/image.tif"}'

Development

After checking out the repo, run bin/setup to install dependencies.

To perform the tests, run:

bundle exec rspec

To run Longleaf with local changes without needing to do a local install, you may run:

bundle exec exe/longleaf <command>

To install this gem onto your local machine, run:

bundle exec rake install

This places a newly built gem into the pkg/ directory. This gem may then be installed in order to run commands in the longleaf <command> form. Note: Only files committed to git will be included in the installed gem.

To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Indexing

To use an index to improve performance, you will need to install the database drivers separately or bundle longleaf with the driver you wish to use:

bundle install --with postgres

Options include: postgres, mysql2, mysql, sqlite, amalgalite

To set up an index, add a system > index section to your configuration with the details of the database to use for the index. Then, to set up the database, run:

longleaf setup_index -c <config_file>

To perform a one-time reindexing, run:

longleaf reindex -c <config_file>

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/UNC-Libraries/longleaf-preservation.

License

The gem is available as open source under the terms of the Apache License 2.0.