ruby-druid
A Ruby client for Druid. Includes a Squeel-like query DSL and generates a JSON query that can be sent to Druid directly.
Installation
Add this line to your application's Gemfile:
gem 'ruby-druid'
And then execute:
bundle
Or install it yourself as:
gem install ruby-druid
Usage
A query can be constructed and sent like so:
data_source = Druid::Client.new('zk1:2181,zk2:2181/druid').data_source('service/source')
query = Druid::Query::Builder.new.long_sum(:aggregate1).last(1.day).granularity(:all)
result = data_source.post(query)
The post method on the DataSource returns the parsed response from the Druid server as an array.
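As an illustration of working with such an array, the sketch below parses a hand-written payload shaped like a Druid timeseries response (objects with timestamp and result keys). The sample values are made up, and the actual shape depends on the query type.

```ruby
require 'json'

# Hypothetical payload shaped like a Druid timeseries response.
sample_body = <<~JSON
  [
    {"timestamp": "2013-01-01T00:00:00.000Z", "result": {"aggregate1": 42}},
    {"timestamp": "2013-01-02T00:00:00.000Z", "result": {"aggregate1": 58}}
  ]
JSON

rows = JSON.parse(sample_body)

# Sum one aggregate across all returned time buckets.
total = rows.sum { |row| row['result']['aggregate1'] }

rows.each { |row| puts "#{row['timestamp']}: #{row['result']['aggregate1']}" }
puts "total: #{total}"
```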
If you don't want to use ZooKeeper for broker discovery, you can explicitly construct a DataSource:
data_source = Druid::DataSource.new('service/source', 'http://localhost:8080/druid/v2')
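Because the query is plain JSON, it can also be posted without the client. A minimal sketch using only Ruby's standard library, assuming the broker URL from the example above; the query hash is hand-written for illustration and the request is only built here, not sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

uri = URI('http://localhost:8080/druid/v2')

# Hand-written query hash; the field values are illustrative assumptions.
query = {
  queryType: 'timeseries',
  dataSource: 'source',
  granularity: 'all',
  intervals: ['2013-01-01T00:00:00Z/2013-01-02T00:00:00Z'],
  aggregations: [{ type: 'longSum', name: 'aggregate1', fieldName: 'aggregate1' }]
}

request = Net::HTTP::Post.new(uri.path, 'Content-Type' => 'application/json')
request.body = JSON.generate(query)

# To actually send it (requires a running broker):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# rows = JSON.parse(response.body)
```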
GroupBy
A GroupByQuery sets the dimensions to group the data. queryType is set automatically to groupBy.
Druid::Query::Builder.new.group_by([:dimension1, :dimension2])
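For orientation, a groupBy query body sent to Druid looks roughly like the hash below. This is a hand-written approximation, not output captured from the builder; the data source, interval and aggregation are placeholder assumptions.

```ruby
require 'json'

# Hand-written approximation of a groupBy query body; not produced by the library.
group_by_query = {
  queryType: 'groupBy',
  dataSource: 'source',
  granularity: 'all',
  dimensions: %w[dimension1 dimension2],
  aggregations: [{ type: 'longSum', name: 'aggregate1', fieldName: 'aggregate1' }],
  intervals: ['2013-01-01T00:00:00Z/2013-01-02T00:00:00Z']
}

puts JSON.pretty_generate(group_by_query)
```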
TimeSeries
A TimeSeriesQuery returns an array of JSON objects where each object represents a value asked for by the timeseries query.
Druid::Query::Builder.new.time_series([:aggregate1, :aggregate2])
Aggregations
longSum, doubleSum, count, min, max, hyperUnique
Druid::Query::Builder.new.long_sum([:aggregate1, :aggregate2])
The other aggregations are added in the same way with the following methods: double_sum, count, min, max, hyper_unique
cardinality
Druid::Query::Builder.new.cardinality(:aggregate, [:dimension1, :dimension2], <by_row: true | false>)
javascript
For example, to calculate sum(log(x)*y) + 10:
Druid::Query::Builder.new.js_aggregation(:aggregate, [:x, :y],
aggregate: "function(current, a, b) { return current + (Math.log(a) * b); }",
combine: "function(partialA, partialB) { return partialA + partialB; }",
reset: "function() { return 10; }"
)
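To make the roles of the three functions concrete, here is a rough Ruby simulation of the example above: reset supplies the starting value, aggregate folds each row into it, and combine merges partial results. This is only a sketch of the idea, not how Druid executes the JavaScript.

```ruby
# Ruby stand-ins for the three JavaScript functions in the example above.
reset     = -> { 10.0 }
aggregate = ->(current, a, b) { current + (Math.log(a) * b) }
combine   = ->(partial_a, partial_b) { partial_a + partial_b }

# Fold one batch of [x, y] rows, starting from the reset value.
fold = ->(rows) { rows.reduce(reset.call) { |acc, (x, y)| aggregate.call(acc, x, y) } }

rows = [[1.0, 3.0], [Math::E, 2.0]]
single = fold.call(rows) # 10 + log(1)*3 + log(e)*2

# Partial results from separate batches are merged with combine.
# Each partial starts from reset, so in this sketch the constant
# is counted once per partial.
merged = combine.call(fold.call(rows[0, 1]), fold.call(rows[1, 1]))

puts single
puts merged
```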
filtered aggregation
A filtered aggregator wraps any given aggregator, but only aggregates the values for which the given dimension filter matches.
Druid::Query::Builder.new.filtered_aggregation(:aggregate1, :aggregate_1_name, :longSum) do
dimension1.neq(1) & dimension2.neq(2)
end
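In Druid's JSON, a filtered aggregator nests the wrapped aggregator next to its filter. The hash below is a hand-written sketch of that nesting for the example above; the exact output of the builder may differ.

```ruby
require 'json'

# Helper for a "not equal" dimension filter (hypothetical, for illustration).
not_equal = lambda do |dimension, value|
  { type: 'not', field: { type: 'selector', dimension: dimension, value: value } }
end

filtered = {
  type: 'filtered',
  filter: {
    type: 'and',
    fields: [not_equal.call('dimension1', 1), not_equal.call('dimension2', 2)]
  },
  aggregator: { type: 'longSum', name: 'aggregate_1_name', fieldName: 'aggregate1' }
}

puts JSON.pretty_generate(filtered)
```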
Post Aggregations
Post aggregations using +, -, / and * can be written with a simple syntax:
query = Druid::Query::Builder.new.long_sum([:aggregate1, :aggregate2])
query.postagg { (aggregate1 + aggregate2).as output_field_name }
Required fields for the postaggregation are fetched automatically by the library.
JavaScript post aggregations are also supported:
query.postagg { js('function(aggregate1, aggregate2) { return aggregate1 + aggregate2; }').as result }
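One way such an operator syntax can work is through operator overloading: each field is an object whose +, -, * and / return an arithmetic node, and the node knows which fields it depends on. The classes below are hypothetical and written for illustration only; the library's own implementation may differ.

```ruby
# Hypothetical classes for illustration; not the library's code.
module Operators
  %w[+ - * /].each do |op|
    define_method(op) { |other| Arithmetic.new(op, self, other) }
  end
end

class FieldAccess
  include Operators

  def initialize(name)
    @name = name.to_s
  end

  def to_h
    { type: 'fieldAccess', name: @name, fieldName: @name }
  end

  # Aggregations this expression depends on.
  def required_fields
    [@name]
  end
end

class Arithmetic
  include Operators

  def initialize(fn, left, right)
    @fn = fn
    @left = left
    @right = right
    @name = nil
  end

  # Name the output field, mirroring the `.as` call in the DSL.
  def as(name)
    @name = name.to_s
    self
  end

  def to_h
    { type: 'arithmetic', name: @name, fn: @fn, fields: [@left.to_h, @right.to_h] }
  end

  def required_fields
    (@left.required_fields + @right.required_fields).uniq
  end
end

postagg = (FieldAccess.new(:aggregate1) + FieldAccess.new(:aggregate2)).as(:output_field_name)
p postagg.to_h
p postagg.required_fields
```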
Query Interval
The interval for the query takes a string with date and time or objects that provide an iso8601 method.
query = Druid::Query::Builder.new.long_sum(:aggregate1)
query.interval("2013-01-01T00", Time.now)
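Druid intervals are ISO 8601 start/end pairs separated by a slash. A small sketch of how mixed string and Time arguments could be turned into such a string; to_interval is a hypothetical helper, not part of the library.

```ruby
require 'time'

# Hypothetical helper: pass strings through and call iso8601 on anything that provides it.
def to_interval(from, to)
  [from, to].map { |t| t.respond_to?(:iso8601) ? t.iso8601 : t.to_s }.join('/')
end

interval = to_interval('2013-01-01T00', Time.utc(2013, 1, 2, 12, 30))
puts interval # 2013-01-01T00/2013-01-02T12:30:00Z
```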
Result Granularity
The granularity can be :all, :none, :minute, :fifteen_minute, :thirty_minute, :hour or :day.
It can also be a period granularity as described in the Druid documentation.
The period 'day' or :day will be interpreted as 'P1D'.
If a period granularity is specified, the (optional) second parameter is a time zone. It defaults to the machine's local time zone. For example,
query = Druid::Query::Builder.new.long_sum(:aggregate1)
query.granularity(:day)
is (on my box) the same as
query = Druid::Query::Builder.new.long_sum(:aggregate1)
query.granularity('P1D', 'Europe/Berlin')
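In Druid's JSON, a period granularity is an object with a period and a time zone. The helper below sketches that mapping; period_granularity and the PERIODS table are assumptions for illustration, and only the day to 'P1D' case is stated in this README.

```ruby
require 'json'

# Assumed mapping for illustration; only 'day' => 'P1D' is documented above.
PERIODS = { 'day' => 'P1D' }.freeze

# Hypothetical helper that builds a Druid period granularity object.
def period_granularity(period, time_zone)
  { type: 'period', period: PERIODS.fetch(period.to_s, period.to_s), timeZone: time_zone }
end

puts JSON.generate(period_granularity(:day, 'Europe/Berlin'))
```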
Having filters
# equality
Druid::Query::Builder.new.having { metric == 10 }
# inequality
Druid::Query::Builder.new.having { metric != 10 }
# greater, less
Druid::Query::Builder.new.having { metric > 10 }
Druid::Query::Builder.new.having { metric < 10 }
Compound having filters
Having filters can be combined with boolean logic.
# and
Druid::Query::Builder.new.having { (metric != 1) & (metric2 != 2) }
# or
Druid::Query::Builder.new.having { (metric == 1) | (metric2 == 2) }
# not
Druid::Query::Builder.new.having{ !metric.eq(1) }
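For reference, Druid expresses having clauses as nested havingSpec objects. The hash below is a hand-written sketch of what the "and" example above corresponds to; it is not output captured from the library.

```ruby
require 'json'

# Hand-written equivalent of: having { (metric != 1) & (metric2 != 2) }
equal_to = ->(aggregation, value) { { type: 'equalTo', aggregation: aggregation, value: value } }
negate   = ->(spec) { { type: 'not', havingSpec: spec } }

having = {
  type: 'and',
  havingSpecs: [
    negate.call(equal_to.call('metric', 1)),
    negate.call(equal_to.call('metric2', 2))
  ]
}

puts JSON.pretty_generate(having)
```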
Filters
Filters are set by the filter method. It takes a block or a hash as parameter.
Filters can be chained: filter{...}.filter{...}
Base Filters
# equality
Druid::Query::Builder.new.filter{dimension.eq 1}
Druid::Query::Builder.new.filter{dimension == 1}
# inequality
Druid::Query::Builder.new.filter{dimension.neq 1}
Druid::Query::Builder.new.filter{dimension != 1}
# greater, less
Druid::Query::Builder.new.filter{dimension > 1}
Druid::Query::Builder.new.filter{dimension >= 1}
Druid::Query::Builder.new.filter{dimension < 1}
Druid::Query::Builder.new.filter{dimension <= 1}
# JavaScript
Druid::Query::Builder.new.filter{a.javascript('dimension >= 1 && dimension < 5')}
Compound Filters
Filters can be combined with boolean logic.
# and
Druid::Query::Builder.new.filter{dimension.neq(1) & dimension2.neq(2)}
# or
Druid::Query::Builder.new.filter{dimension.neq(1) | dimension2.neq(2)}
# not
Druid::Query::Builder.new.filter{!dimension.eq(1)}
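A block DSL like this can be built with instance_eval and method_missing: bare words in the block become dimension objects, and &, | and ! combine the resulting filter nodes into Druid's selector/and/or/not structure. The mini-DSL below is a hypothetical sketch for illustration only, not the library's code.

```ruby
# Hypothetical mini-DSL for illustration only; not the library's code.
class FilterNode
  attr_reader :spec

  def initialize(spec)
    @spec = spec
  end

  def &(other)
    FilterNode.new({ type: 'and', fields: [spec, other.spec] })
  end

  def |(other)
    FilterNode.new({ type: 'or', fields: [spec, other.spec] })
  end

  def !
    FilterNode.new({ type: 'not', field: spec })
  end
end

class Dimension
  def initialize(name)
    @name = name.to_s
  end

  def eq(value)
    FilterNode.new({ type: 'selector', dimension: @name, value: value })
  end

  def neq(value)
    !eq(value)
  end
end

class FilterDsl
  # Any unknown bare word inside the block is treated as a dimension name.
  def method_missing(name, *args)
    args.empty? ? Dimension.new(name) : super
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

def build_filter(&block)
  FilterDsl.new.instance_eval(&block).spec
end

filter = build_filter { dimension.neq(1) & !dimension2.eq(2) }
p filter
```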
Inclusion Filter
This filter creates a set of equals filters in an or filter.
Druid::Query::Builder.new.filter{dimension.in(1,2,3)}
Geographic filter
These filters have to be combined with time_series and only work when coordinates is a spatial dimension (see GeographicQueries in the Druid documentation).
Druid::Query::Builder.new.time_series().long_sum([:aggregate1]).filter{coordinates.in_rec [[50.0,13.0],[54.0,15.0]]}
Druid::Query::Builder.new.time_series().long_sum([:aggregate1]).filter{coordinates.in_circ [[53.0,13.0], 5.0]}
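In Druid's JSON, spatial filters carry a bound that is either rectangular (minimum and maximum coordinates) or a radius around a point. The hashes below are hand-written sketches of what the two examples above presumably translate to; treat the mapping as an assumption.

```ruby
require 'json'

# Assumed JSON for in_rec [[50.0,13.0],[54.0,15.0]]
rectangular = {
  type: 'spatial',
  dimension: 'coordinates',
  bound: { type: 'rectangular', minCoords: [50.0, 13.0], maxCoords: [54.0, 15.0] }
}

# Assumed JSON for in_circ [[53.0,13.0], 5.0]
circular = {
  type: 'spatial',
  dimension: 'coordinates',
  bound: { type: 'radius', coords: [53.0, 13.0], radius: 5.0 }
}

puts JSON.generate(rectangular)
puts JSON.generate(circular)
```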
Exclusion Filter
This filter creates a set of not-equals filters in an and filter.
Druid::Query::Builder.new.filter{dimension.nin(1,2,3)}
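The expansion described for the inclusion and exclusion filters can be sketched as plain hashes. The helper names below are hypothetical and only mirror the description, not the library's internals.

```ruby
# Hypothetical helpers mirroring the described expansion of in and nin.
def selector(dimension, value)
  { type: 'selector', dimension: dimension.to_s, value: value }
end

# in(1,2,3): equals filters joined by an or filter.
def in_filter(dimension, *values)
  { type: 'or', fields: values.map { |v| selector(dimension, v) } }
end

# nin(1,2,3): not-equals filters joined by an and filter.
def nin_filter(dimension, *values)
  { type: 'and', fields: values.map { |v| { type: 'not', field: selector(dimension, v) } } }
end

p in_filter(:dimension, 1, 2, 3)
p nin_filter(:dimension, 1, 2, 3)
```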
Hash syntax
Sometimes it is useful to use a hash syntax for filtering, for example if the conditions already come from a list or a parameter hash.
Druid::Query::Builder.new.filter(dimension: 1, dimension1: 2, dimension2: 3)
# which is equivalent to
Druid::Query::Builder.new.filter{dimension.eq(1) & dimension1.eq(2) & dimension2.eq(3)}
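The equivalence stated above amounts to turning each key/value pair into an equality filter and joining them with an and filter. A minimal sketch with a hypothetical helper, useful for seeing what a parameter hash would expand to:

```ruby
# Hypothetical helper: one selector per key/value pair, joined with "and".
def filter_from_hash(conditions)
  fields = conditions.map do |dimension, value|
    { type: 'selector', dimension: dimension.to_s, value: value }
  end
  fields.length == 1 ? fields.first : { type: 'and', fields: fields }
end

# String keys, as they might arrive in a request parameter hash.
params = { 'dimension' => 1, 'dimension1' => 2, 'dimension2' => 3 }
p filter_from_hash(params)
```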
Instrumentation
Provides a single event post.druid. Payload:
- data_source
- query
Contributing
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request