jetel
CLI - Command Line Interface
Run jetel
$ jetel
NAME
    jetel - Simple custom made tool for data download and basic ETL

SYNOPSIS
    jetel [global options] command [command options] [arguments...]

VERSION
    0.0.15

GLOBAL OPTIONS
    -d, --download_dir=download-dir - Download directory (default: data)
    --help - Show this message
    -l, --data_loader=data-loader - Data Loader (default: pg://jetel:jetel@localhost:5432/jetel)
    -t, --timeout=download-timeout - Download timeout (default: 600)
    --version - Display the program version

COMMANDS
    alexa, Alexa - Module alexa
    config - Show config
    downloaders - Print downloaders info
    gadm, Gadm - Module gadm
    geolite, Geolite - Module geolite
    help - Shows a list of commands or help for one command
    ip, Ip - Module ip
    iso3166, Iso3166 - Module iso3166
    loaders - Print loaders info
    modules - Print modules info
    nga, Nga - Module nga
    sfpd, Sfpd - Module sfpd
    tiger, Tiger - Module tiger
    version - Print version info
    wifileaks, Wifileaks - Module wifileaks
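The default `--data_loader` value is a connection URI. As a rough illustration of what such a value encodes, here is a minimal sketch that splits it with Ruby's standard `URI` library. The helper name `parse_loader_uri` is hypothetical and is not taken from jetel's code; jetel may parse this option differently.

```ruby
require 'uri'

# Hypothetical helper (not part of jetel): split a loader URI into its parts.
def parse_loader_uri(value)
  uri = URI.parse(value)
  {
    loader: uri.scheme,                     # e.g. "pg", matching a name from `jetel loaders`
    user: uri.user,
    password: uri.password,
    host: uri.host,
    port: uri.port,
    database: uri.path.sub(%r{\A/}, '')     # drop the leading slash
  }
end

parts = parse_loader_uri('pg://jetel:jetel@localhost:5432/jetel')
puts parts.inspect
```

Under this reading, the scheme selects the loader and the path names the target database.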
Show help for module command
$ jetel help tiger
NAME
    tiger - Module tiger

SYNOPSIS
    jetel [global options] tiger download
    jetel [global options] tiger extract
    jetel [global options] tiger load [--analyze_num_rows num] [--column_type column-name=column-type]
    jetel [global options] tiger sources [--format format=format]
    jetel [global options] tiger transform

COMMANDS
    download - download tiger
    extract - extract tiger
    load - load tiger
    sources - sources tiger
    transform - transform tiger
Show help for subcommand
$ jetel help geolite download
NAME
    download - download geolite

SYNOPSIS
    jetel [global options] geolite download
Show modules/sources
$ jetel modules
+-----------+---------------------------+
| Name      | Class                     |
+-----------+---------------------------+
| alexa     | Jetel::Modules::Alexa     |
| gadm      | Jetel::Modules::Gadm      |
| geolite   | Jetel::Modules::Geolite   |
| ip        | Jetel::Modules::Ip        |
| iso3166   | Jetel::Modules::Iso3166   |
| nga       | Jetel::Modules::Nga       |
| sfpd      | Jetel::Modules::Sfpd      |
| tiger     | Jetel::Modules::Tiger     |
| wifileaks | Jetel::Modules::Wifileaks |
+-----------+---------------------------+
Show downloaders
$ jetel downloaders
+------+--------------------------+
| Name | Class                    |
+------+--------------------------+
| aria | Jetel::Downloaders::Aria |
| curl | Jetel::Downloaders::Curl |
| ruby | Jetel::Downloaders::Ruby |
| wget | Jetel::Downloaders::Wget |
+------+--------------------------+
Show loaders
$ jetel loaders
+---------------+-------------------------------+
| Name          | Class                         |
+---------------+-------------------------------+
| couchbase     | Jetel::Loaders::Couchbase     |
| elasticsearch | Jetel::Loaders::Elasticsearch |
| pg            | Jetel::Loaders::Pg            |
+---------------+-------------------------------+
Download source
$ jetel geolite download
Downloading http://geolite.maxmind.com/download/geoip/database/GeoLite2-City-CSV.zip
aria2c -j 4 -t 600 -d "data/Geolite/geolite/downloaded" -o "GeoLite2-City-CSV.zip" http://geolite.maxmind.com/download/geoip/database/GeoLite2-City-CSV.zip
11/06 17:51:35 [NOTICE] File already exists. Renamed to data/Geolite/geolite/downloaded/GeoLite2-City-CSV.zip.1.
11/06 17:51:35 [NOTICE] Allocating disk space. Use --file-allocation=none to disable it. See --file-allocation option in man page for more details.
11/06 17:51:48 [NOTICE] Download complete: data/Geolite/geolite/downloaded/GeoLite2-City-CSV.zip.1
Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
d0bf04|OK  |   2.4MiB/s|data/Geolite/geolite/downloaded/GeoLite2-City-CSV.zip.1
Status Legend:
(OK):download completed.
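The log above shows the `aria2c` invocation jetel issued. The following is a minimal sketch of how such a command line could be assembled from the global options; it is not jetel's actual downloader code. The helper name `aria_command`, its keyword arguments, and the `<download_dir>/<Module>/<source>/downloaded` layout are assumptions inferred from the logged command only.

```ruby
require 'uri'

# Hypothetical helper (not part of jetel): build an aria2c command line
# shaped like the one in the log above.
def aria_command(url, download_dir:, module_name:, source:, timeout: 600, jobs: 4)
  # Assumed layout: <download_dir>/<Module>/<source>/downloaded
  target_dir = File.join(download_dir, module_name, source, 'downloaded')
  # Name the output file after the last path segment of the URL.
  file_name = File.basename(URI.parse(url).path)
  %(aria2c -j #{jobs} -t #{timeout} -d "#{target_dir}" -o "#{file_name}" #{url})
end

url = 'http://geolite.maxmind.com/download/geoip/database/GeoLite2-City-CSV.zip'
puts aria_command(url, download_dir: 'data', module_name: 'Geolite', source: 'geolite')
```

Here `-t` carries the `--timeout` global option (default 600) and `-d` is derived from `--download_dir` (default `data`).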
Extract source
$ jetel geolite extract
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Blocks-IPv6.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-ja.csv
Extracting GeoLite2-City-CSV_20151103/COPYRIGHT.txt
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-zh-CN.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Blocks-IPv4.csv
Extracting GeoLite2-City-CSV_20151103/LICENSE.txt
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-fr.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-ru.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-en.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-pt-BR.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-de.csv
Extracting GeoLite2-City-CSV_20151103/GeoLite2-City-Locations-es.csv
Transform source
$ jetel geolite transform
Transforming data/Geolite/geolite/extracted/GeoLite2-City-Blocks-IPv4.csv
Load source
$ jetel geolite load --analyze_num_rows 50000
DROP TABLE IF EXISTS "geolite";
CREATE TABLE "geolite"
(
  "network" CIDR NOT NULL,
  "geoname_id" BIGINT,
  "registered_country_geoname_id" BIGINT,
  "represented_country_geoname_id" TEXT,
  "is_anonymous_proxy" BOOLEAN NOT NULL,
  "is_satellite_provider" BOOLEAN NOT NULL,
  "postal_code" TEXT,
  "latitude" DECIMAL,
  "longitude" DECIMAL
)
WITH (
  OIDS=FALSE
);
COPY "geolite"
FROM STDIN
WITH DELIMITER ','
CSV HEADER
;
3037320 row(s) affected
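The `load` command accepts `--analyze_num_rows` and `--column_type column-name=column-type`. A minimal sketch of the general idea, sampling the first rows to guess column types and letting explicit overrides win, might look like the following. This is not jetel's analyzer: `infer_type`, `create_table_sql`, the inference rules, and the sample rows are all made up for illustration.

```ruby
# Hypothetical type inference (not jetel's analyzer): guess a PostgreSQL
# type from sampled string values.
def infer_type(values)
  present = values.compact.reject(&:empty?)
  return 'TEXT' if present.empty?                                   # nothing to learn from
  return 'BIGINT' if present.all? { |v| v.match?(/\A-?\d+\z/) }
  return 'DECIMAL' if present.all? { |v| v.match?(/\A-?\d+(\.\d+)?\z/) }
  'TEXT'
end

# Build a CREATE TABLE statement from the first `analyze_num_rows` rows.
# `overrides` plays the role of repeated --column_type name=type options.
def create_table_sql(table, headers, rows, analyze_num_rows:, overrides: {})
  sample = rows.first(analyze_num_rows)
  columns = headers.each_with_index.map do |name, index|
    type = overrides.fetch(name) { infer_type(sample.map { |row| row[index] }) }
    %(  "#{name}" #{type})
  end
  %(CREATE TABLE "#{table}"\n(\n#{columns.join(",\n")}\n);)
end

headers = %w[network geoname_id postal_code latitude]
rows = [                                   # made-up sample rows
  ['1.0.0.0/24', '2077456', nil, '-33.4940'],
  ['1.0.1.0/24', '1810821', nil, '26.0614']
]
puts create_table_sql('geolite', headers, rows,
                      analyze_num_rows: 50_000,
                      overrides: { 'network' => 'CIDR' })
```

In this sketch a column whose sampled values are all empty falls back to `TEXT`, so a larger sample or an explicit override is the way to get a narrower type.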
Structure
.
├── bin
├── lib
│   └── jetel
│       ├── cli
│       │   └── cmd
│       ├── config
│       ├── downloaders
│       │   ├── aria
│       │   ├── curl
│       │   ├── ruby
│       │   └── wget
│       ├── extensions
│       ├── helpers
│       ├── loaders
│       │   ├── couchbase
│       │   ├── elasticsearch
│       │   └── pg
│       │       └── sql
│       └── modules
│           ├── alexa
│           ├── geolite
│           ├── ip
│           ├── iso3166
│           ├── nga
│           ├── sfpd
│           └── wifileaks
└── test
Rake
$ rake -T
rake gem:build # Build jetel-0.0.16.gem into the pkg directory
rake gem:install # Build and install jetel-0.0.16.gem into system gems
rake gem:install:local # Build and install jetel-0.0.16.gem into system gems without network access
rake gem:release # Create tag v0.0.16 and build and push jetel-0.0.16.gem to Rubygems
rake spec # Run RSpec code examples