Trinamo
Trinamo generates HiveQL using YAML to mount tables of DynamoDB, S3 and local HDFS.
Installation
Add this line to your application's Gemfile:
gem 'trinamo'
And then execute:
$ bundle
Or install it yourself as:
$ gem install trinamo
Usage
Table Definition
Generate a template for DDL
- RUN:
Trinamo::Converter.generate_ddl_template(out_file_path = 'ddl.yml')
- OUTPUT:
tables:
- name: comments
s3_location: s3://path/to/s3/table/location
s3_partition:
- name: date
type: string
hash_key:
- name: user_id
type: bigint
range_key:
- name: comment_id
type: bigint
attributes:
- name: title
type: string
- name: content
type: string
- name: rate
type: double
- name: authors
hash_key:
- name: author_id
type: bigint
attributes:
- name: name
type: string
Generate a template for hive options
- RUN:
Trinamo::Converter.generate_options_template(out_file_path = 'ddl.yml')
- OUTPUT:
options:
dynamodb.throughput.read.percent: 0.5
hive.exec.compress.output: true
io.seqfile.compression.type: BLOCK
mapred.output.compression.codec: com.hadoop.compression.lzo.LzoCodec
Then, modify table-definitions and hive-settings as you like.
Create DDLs in HiveQL
For Options
- RUN:
Trinamo::Converter.load('ddl.yml').convert(:option)
- OUTPUT:
SET dynamodb.throughput.read.percent = 0.5;
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec = com.hadoop.compression.lzo.LzoCodec;
For DynamoDB
- RUN:
Trinamo::Converter.load('ddl.yml').convert(:dynamodb)
- OUTPUT:
-- comments_ddb
CREATE EXTERNAL TABLE comments_ddb (
user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
'dynamodb.table.name' = 'comments',
'dynamodb.column.mapping' = 'user_id:user_id,comment_id:comment_id,title:title,content:content,rate:rate'
);
-- authors_ddb
CREATE EXTERNAL TABLE authors_ddb (
author_id BIGINT,name STRING
)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
'dynamodb.table.name' = 'authors',
'dynamodb.column.mapping' = 'author_id:author_id,name:name'
);
For S3
- RUN:
Trinamo::Converter.load('ddl.yml').convert(:s3)
- OUTPUT:
-- comments_s3
CREATE EXTERNAL TABLE comments_s3 (
user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
) PARTITIONED BY (date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
LOCATION 's3://path/to/s3/table/location';
For HDFS
- RUN:
Trinamo::Converter.load('ddl.yml').convert(:hdfs)
- OUTPUT:
-- comments_hdfs
CREATE TABLE comments_hdfs (
user_id BIGINT,comment_id BIGINT,title STRING,content STRING,rate DOUBLE
);
-- authors_hdfs
CREATE TABLE authors_hdfs (
author_id BIGINT,name STRING
);
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/cignoir/trinamo.
License
The gem is available as open source under the terms of the MIT License.