Glossarist component to retrieve content from remote sources

Glossarist Agent

Purpose

The Glossarist Agent is a Ruby gem that retrieves concepts from remote sources.

Currently, it allows the bulk retrieval of the IHO S-32 Hydrographic Dictionary into the Glossarist format.

Installation

Add this line to your application’s Gemfile:

gem 'glossarist-agent'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install glossarist-agent

Usage

Downloading IHO S-32 Hydrographic Dictionary data

General

The Glossarist Agent can download and process IHO (International Hydrographic Organization) S-32 Hydrographic Dictionary data from available CSV files.

The official site is located at:

The Glossarist dataset incorporates all available languages, including:

  • English

  • French

  • Spanish

  • Chinese

  • Indonesian

Note
If additional languages become available, a minor code change is needed to support them.

Glossarist Agent uses a caching mechanism to efficiently manage downloads and reduce unnecessary network requests.

To retrieve these concepts and generate a Glossarist dataset, use the following command:

$ glossarist-agent iho retrieve-concepts

This command performs the following actions:

  1. Downloads the required CSV files from IHO sources.

  2. Caches the downloaded files for future use.

  3. Processes the CSV data to generate a Glossarist-compatible dataset.

Command Options

$ glossarist-agent iho help retrieve-concepts
Usage:
  glossarist-agent iho retrieve-concepts

Options:
  -o, [--output=OUTPUT]                        # Directory to output generated files
                                               # Default: ./output
  -c, [--cache=CACHE]                          # Directory to store cached files
                                               # Default: ~/.glossarist-agent/cache
      [--fetch], [--no-fetch], [--skip-fetch]  # Fetch new data (default: true)
                                               # Default: true

Download IHO CSV files and generate concepts

--output

Specifies the directory where the generated Glossarist dataset will be saved. Default is ./output.

--cache

Sets the directory for storing cached files. Default is ~/.glossarist-agent/cache.

--fetch

Controls whether to fetch new data or use existing cached data. Default is true; pass --no-fetch (or --skip-fetch) to reuse cached data without contacting the server.

The following command saves the IHO S-32 Glossarist dataset to ./iho-s32-glossarist and reuses the existing cache without contacting the server.

$ glossarist-agent iho retrieve-concepts --no-fetch -o iho-s32-glossarist
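
When the command is called from a Ruby script (for example, a scheduled job), the argument list can be assembled programmatically. The following is a minimal sketch, not part of the gem: retrieve_concepts_command is a hypothetical helper, and it uses only the flags shown in the help output above.

```ruby
# Hypothetical helper (not provided by the gem): builds the argument list
# for the documented `iho retrieve-concepts` command.
def retrieve_concepts_command(output: nil, cache: nil, fetch: true)
  cmd = ["glossarist-agent", "iho", "retrieve-concepts"]
  cmd << "--output=#{output}" if output   # omitted: CLI default ./output applies
  cmd << "--cache=#{cache}" if cache      # omitted: CLI default cache dir applies
  cmd << "--no-fetch" unless fetch        # reuse the cache, skip the server
  cmd
end

command = retrieve_concepts_command(output: "iho-s32-glossarist", fetch: false)
puts command.join(" ")

# To actually run it (requires the gem to be installed):
# system(*command) or abort("glossarist-agent failed")
```

Passing the arguments as an array to system avoids shell quoting issues when a directory name contains spaces.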

Caching mechanism

The Glossarist Agent caches downloaded files to avoid unnecessary downloads:

  1. Downloaded files are stored in the specified cache directory.

  2. Each cached file is associated with metadata, including the download time and ETag.

  3. When fetching data, the agent checks:

    1. If the cached file exists and is within the expiry period (default 7 days).

    2. If the server’s ETag matches the cached ETag.

  4. If either condition is not met, the agent downloads a fresh copy of the file.

This approach keeps the agent's data up to date while minimizing network usage.
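
The decision described in the steps above can be sketched as a small predicate. This is an illustration under stated assumptions, not the gem's actual code: CacheMetadata, cache_usable?, and the expiry_days parameter are hypothetical names, and the real metadata format may differ.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical metadata record; the gem's actual storage format may differ.
CacheMetadata = Struct.new(:downloaded_at, :etag, keyword_init: true)

# Returns true when the cached copy can be reused: metadata exists, the
# download is within the expiry window, and the server's ETag matches.
# If either condition fails, the caller should download a fresh copy.
def cache_usable?(metadata, server_etag, now: Time.now, expiry_days: 7)
  return false if metadata.nil?

  within_expiry = (now - metadata.downloaded_at) <= expiry_days * SECONDS_PER_DAY
  etag_matches = !metadata.etag.nil? && metadata.etag == server_etag

  within_expiry && etag_matches
end

now = Time.utc(2024, 6, 10)
recent = CacheMetadata.new(downloaded_at: Time.utc(2024, 6, 8), etag: "abc")
old = CacheMetadata.new(downloaded_at: Time.utc(2024, 5, 1), etag: "abc")

puts cache_usable?(recent, "abc", now: now) # recent download, same ETag
puts cache_usable?(recent, "xyz", now: now) # ETag changed on the server
puts cache_usable?(old, "abc", now: now)    # older than the 7-day expiry
```

Taking now as a parameter keeps the check deterministic, which makes the expiry logic easy to test.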

Generating Glossarist Dataset

After downloading and caching the IHO CSV files, the agent processes the data to generate a Glossarist-compatible dataset:

  1. It parses the CSV files to extract concept information.

  2. The extracted data is transformed into the Glossarist data model.

  3. The resulting dataset is saved in the specified output directory.

This generated dataset can then be used with other Glossarist tools for further processing or integration into concept management systems.
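
The three processing steps can be sketched with Ruby's csv and yaml libraries. Everything specific here is an assumption made for illustration: the column names (id, term, definition), the simplified concept hash, and the one-file-per-concept layout are hypothetical and do not reproduce the real IHO CSV headers or the Glossarist data model.

```ruby
require "csv"
require "yaml"
require "fileutils"

# Hypothetical CSV layout; the real IHO column names may differ.
SAMPLE_CSV = <<~DATA
  id,term,definition
  1,abyss,A very deep region of the ocean.
  2,anchorage,An area where vessels may anchor.
DATA

# Steps 1 and 2: parse the rows and map each one to a simplified concept hash.
def concepts_from_csv(csv_text, language_code)
  CSV.parse(csv_text, headers: true).map do |row|
    {
      "id" => row["id"],
      "language_code" => language_code,
      "terms" => [{ "designation" => row["term"] }],
      "definition" => row["definition"],
    }
  end
end

# Step 3: write one YAML file per concept into the output directory.
# Returns the list of paths written.
def write_concepts(concepts, output_dir)
  FileUtils.mkdir_p(output_dir)
  concepts.map do |concept|
    path = File.join(output_dir, "concept-#{concept['id']}.yaml")
    File.write(path, concept.to_yaml)
    path
  end
end

concepts = concepts_from_csv(SAMPLE_CSV, "eng")
puts concepts.first.to_yaml
```

Calling write_concepts(concepts, "./output") would then save the sample concepts to disk, mirroring the default output directory of the command.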

Features

  • Automated downloading and caching of IHO CSV files

  • ETag-based cache validation

  • Customizable cache expiry period

  • Generation of Glossarist-compatible datasets from IHO data

  • Command-line interface for easy integration into workflows

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install.

License

Copyright Ribose.

The gem is available as open source under the terms of the MIT License.