DDL Generator for Hive from YAML

Trinamo

Trinamo generates HiveQL DDL from YAML table definitions, so that tables stored in DynamoDB, S3, and local HDFS can be mounted in Hive.

Installation

Add this line to your application's Gemfile:

gem 'trinamo'

And then execute:

$ bundle

Or install it yourself as:

$ gem install trinamo

Usage

Table Definition

Generate a template for DDL

  • RUN:
Trinamo::Converter.generate_ddl_template(out_file_path = 'ddl.yml')
  • OUTPUT:
tables:
  - name: comments
    s3_location: s3://path/to/s3/table/location
    s3_partition:
      - name: date
        type: string
    hash_key:
      - name: user_id
        type: bigint
    range_key:
      - name: comment_id
        type: bigint
    attributes:
      - name: title
        type: string
      - name: content
        type: string
      - name: rate
        type: double
  - name: authors
    hash_key:
      - name: author_id
        type: bigint
    attributes:
      - name: name
        type: string
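
An edited definition file can be sanity-checked before generating DDL, using only Ruby's standard library. The following is a minimal sketch and not part of Trinamo: the `columns_for` helper is hypothetical, and the column order (hash key, range key, then attributes) is inferred from the generated DDL shown later in this README.

```ruby
require 'yaml'

# A shortened copy of the template above, inlined so the sketch is self-contained.
DDL_YAML = <<~DOC
  tables:
    - name: comments
      s3_location: s3://path/to/s3/table/location
      s3_partition:
        - name: date
          type: string
      hash_key:
        - name: user_id
          type: bigint
      range_key:
        - name: comment_id
          type: bigint
      attributes:
        - name: title
          type: string
    - name: authors
      hash_key:
        - name: author_id
          type: bigint
      attributes:
        - name: name
          type: string
DOC

# Hypothetical helper: collect a table's columns in key-first order.
# Sections missing from a table (e.g. no range_key) are treated as empty.
def columns_for(table)
  %w[hash_key range_key attributes].flat_map { |section| table.fetch(section, []) }
end

definition = YAML.safe_load(DDL_YAML)
definition['tables'].each do |table|
  cols = columns_for(table).map { |c| "#{c['name']}:#{c['type']}" }
  puts "#{table['name']}: #{cols.join(', ')}"
end
```

Printing the columns this way makes it easy to spot a misspelled section name or a missing type before the file is handed to the converter.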

Generate a template for hive options

  • RUN:
Trinamo::Converter.generate_options_template(out_file_path = 'ddl.yml')
  • OUTPUT:
options:
  dynamodb.throughput.read.percent: 0.5
  hive.exec.compress.output: true
  io.seqfile.compression.type: BLOCK
  mapred.output.compression.codec: com.hadoop.compression.lzo.LzoCodec

Then edit the table definitions and Hive settings to suit your environment.

Create DDLs in HiveQL

For Options

  • RUN:
Trinamo::Converter.load('ddl.yml').convert(:option)
  • OUTPUT:
SET dynamodb.throughput.read.percent = 0.5;
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec = com.hadoop.compression.lzo.LzoCodec;
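
The mapping from the `options` section to `SET` statements is a direct key/value translation. As an illustration only (this is not Trinamo's implementation, and `set_statements` is a hypothetical helper), it could be sketched with the standard library like this. The sketch uses a uniform `key=value` spacing, which differs slightly from the output above.

```ruby
require 'yaml'

# The options template from above, inlined so the sketch is self-contained.
OPTIONS_YAML = <<~DOC
  options:
    dynamodb.throughput.read.percent: 0.5
    hive.exec.compress.output: true
    io.seqfile.compression.type: BLOCK
    mapred.output.compression.codec: com.hadoop.compression.lzo.LzoCodec
DOC

# Hypothetical helper: one SET statement per option, in file order.
def set_statements(options)
  options.map { |key, value| "SET #{key}=#{value};" }
end

options = YAML.safe_load(OPTIONS_YAML).fetch('options', {})
puts set_statements(options)
```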

For DynamoDB

  • RUN:
Trinamo::Converter.load('ddl.yml').convert(:dynamodb)
  • OUTPUT:
-- comments_ddb
CREATE EXTERNAL TABLE comments_ddb (
  user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  'dynamodb.table.name' = 'comments',
  'dynamodb.column.mapping' = 'user_id:user_id,comment_id:comment_id,title:title,content:content,rate:rate'
);

-- authors_ddb
CREATE EXTERNAL TABLE authors_ddb (
  author_id BIGINT,name STRING
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  'dynamodb.table.name' = 'authors',
  'dynamodb.column.mapping' = 'author_id:author_id,name:name'
);
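
To make the relationship between a table definition and the DynamoDB DDL explicit, here is a hedged sketch that rebuilds the `authors_ddb` statement from a plain hash. The `dynamodb_ddl` helper is hypothetical and only mirrors the output shown above; Trinamo's own code may work differently.

```ruby
# Hypothetical helper, not Trinamo's implementation.
# Columns are emitted in hash key, range key, attributes order,
# and each column is mapped to a DynamoDB attribute of the same name.
def dynamodb_ddl(table)
  columns = %w[hash_key range_key attributes].flat_map { |s| table.fetch(s, []) }
  column_defs = columns.map { |c| "#{c['name']} #{c['type'].upcase}" }.join(',')
  mapping = columns.map { |c| "#{c['name']}:#{c['name']}" }.join(',')
  <<~SQL
    -- #{table['name']}_ddb
    CREATE EXTERNAL TABLE #{table['name']}_ddb (
      #{column_defs}
    )
    STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    TBLPROPERTIES (
      'dynamodb.table.name' = '#{table['name']}',
      'dynamodb.column.mapping' = '#{mapping}'
    );
  SQL
end

# The authors table from the template, written as a Ruby hash.
authors = {
  'name' => 'authors',
  'hash_key' => [{ 'name' => 'author_id', 'type' => 'bigint' }],
  'attributes' => [{ 'name' => 'name', 'type' => 'string' }]
}
puts dynamodb_ddl(authors)
```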

For S3

  • RUN:
Trinamo::Converter.load('ddl.yml').convert(:s3)
  • OUTPUT:
-- comments_s3
CREATE EXTERNAL TABLE comments_s3 (
  user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
) PARTITIONED BY (date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
LOCATION 's3://path/to/s3/table/location';
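
Note that only `comments` appears in the S3 output: it is the only table in the template with an `s3_location`. Assuming tables without that key are skipped (an inference from the output above, not confirmed behavior), the selection and the `PARTITIONED BY` clause could be sketched as follows. Both helpers are hypothetical.

```ruby
# Reduced table definitions from the template; only the S3-related keys are kept.
TABLES = [
  {
    'name' => 'comments',
    's3_location' => 's3://path/to/s3/table/location',
    's3_partition' => [{ 'name' => 'date', 'type' => 'string' }]
  },
  { 'name' => 'authors' }
].freeze

# Hypothetical helper: keep only tables that declare an S3 location.
def s3_tables(tables)
  tables.select { |table| table.key?('s3_location') }
end

# Hypothetical helper: build the PARTITIONED BY clause, or '' if unpartitioned.
def partition_clause(table)
  partitions = table.fetch('s3_partition', [])
  return '' if partitions.empty?

  columns = partitions.map { |part| "#{part['name']} #{part['type'].upcase}" }
  "PARTITIONED BY (#{columns.join(', ')})"
end

s3_tables(TABLES).each do |table|
  puts "#{table['name']}_s3 #{partition_clause(table)} LOCATION '#{table['s3_location']}'"
end
```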

For HDFS

  • RUN:
Trinamo::Converter.load('ddl.yml').convert(:hdfs)
  • OUTPUT:
-- comments_hdfs
CREATE TABLE comments_hdfs (
  user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
);

-- authors_hdfs
CREATE TABLE authors_hdfs (
  author_id BIGINT,name STRING
);

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/cignoir/trinamo.

License

The gem is available as open source under the terms of the MIT License.