Ruby + PostgreSQL + Liquibase + Rake
This small Ruby gem helps you integrate PostgreSQL with your Ruby web app, through Liquibase. It also adds a simple connection pool and query processor, to make SQL manipulation simpler.
First of all, on top of Ruby and Bundler, you need to have PostgreSQL, Java 8+, and Maven 3.2+ installed. On Ubuntu 16+ this should be enough:
sudo apt-get install -y postgresql-10 postgresql-client-10
sudo apt-get install -y default-jre maven
Then, add this to your Gemfile:
gem 'pgtk'
Then, add this to your Rakefile:
require 'pgtk/pgsql_task'
Pgtk::PgsqlTask.new :pgsql do |t|
  # Temp directory with PostgreSQL files:
  t.dir = 'target/pgsql'
  # To delete the directory on every start:
  t.fresh_start = true
  t.user = 'test'
  t.password = 'test'
  t.dbname = 'test'
  # YAML file to be created with connection details:
  t.yaml = 'target/pgsql-config.yml'
  # List of contexts or empty if all:
  t.contexts = '!test'
  # List of PostgreSQL configuration options:
  t.config = {
    log_min_messages: 'ERROR',
    log_filename: 'target/pg.log'
  }
end
And this too (org.postgresql:postgresql and org.liquibase:liquibase-maven-plugin are used inside):
require 'pgtk/liquibase_task'
Pgtk::LiquibaseTask.new liquibase: :pgsql do |t|
  # Master XML file path:
  t.master = 'liquibase/master.xml'
  # YAML files with connection details:
  t.yaml = ['target/pgsql-config.yml', 'config.yml']
  # Reduce the amount of log messages (TRUE by default):
  t.quiet = false
  # Overriding the default version of the PostgreSQL JDBC driver:
  t.postgresql_version = '42.7.0'
  # Overriding the default version of Liquibase:
  t.liquibase_version = '3.2.2'
end
The config.yml file should be in this format:
pgsql:
  url: jdbc:postgresql://<host>:<port>/<dbname>?user=<user>
  host: ...
  port: ...
  dbname: ...
  user: ...
  password: ...
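To check such a file before handing it to the pool, a small loader along these lines may help. It is only a sketch using Ruby's standard YAML library: the pgsql_section helper and the sample values are hypothetical, and it assumes that all six keys shown above are expected to be present.

```ruby
require 'yaml'

# Hypothetical sample that follows the format above; the values are made up.
SAMPLE_CONFIG = <<~CONFIG
  pgsql:
    url: jdbc:postgresql://localhost:5432/test?user=test
    host: localhost
    port: 5432
    dbname: test
    user: test
    password: secret
CONFIG

# Hypothetical helper: returns the "pgsql" section and fails loudly
# when one of the expected keys is absent.
def pgsql_section(text)
  section = YAML.safe_load(text).fetch('pgsql')
  missing = %w[url host port dbname user password] - section.keys
  raise ArgumentError, "Missing keys: #{missing.join(', ')}" unless missing.empty?
  section
end

config = pgsql_section(SAMPLE_CONFIG)
puts "#{config['user']}@#{config['host']}:#{config['port']}/#{config['dbname']}"
```

In a real project you would read the text with File.read('config.yml') instead of using an inline sample.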
You should create that liquibase/master.xml file in your repository, and a number of other XML files with Liquibase changes. This example will help you understand them.
Now, you can do this:
bundle exec rake pgsql liquibase
A temporary PostgreSQL server will be started and the entire set of Liquibase SQL changes will be applied. You will be able to connect to it from your application, using the file target/pgsql-config.yml.
From inside your app you may find this class useful:
require 'pgtk/pool'
pgsql = Pgtk::Pool.new(Pgtk::Wire::Yaml.new('config.yml'))
pgsql.start(5) # Start it with five simultaneous connections
You can also let it pick the connection parameters from the environment variable DATABASE_URL, formatted like postgres://user:password@host:5432/dbname:
pgsql = Pgtk::Pool.new(Pgtk::Wire::Env.new)
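To see which parts of such a URL map to which connection parameters, here is a rough sketch based on Ruby's standard URI library. The parse_database_url helper is hypothetical and is not how Pgtk::Wire::Env is necessarily implemented; the fallback to port 5432 is also an assumption.

```ruby
require 'uri'

# Hypothetical helper: splits a postgres:// URL into connection parameters.
def parse_database_url(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    # Assumption: fall back to the usual PostgreSQL port when none is given.
    port: uri.port || 5432,
    # The path component is "/dbname", so the leading slash is dropped.
    dbname: uri.path.delete_prefix('/'),
    user: uri.user,
    password: uri.password
  }
end

params = parse_database_url('postgres://user:password@host:5432/dbname')
puts params
```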
Now you can fetch some data from the DB:
name = pgsql.exec('SELECT name FROM users WHERE id = $1', [id])[0]['name']
You may also use it when you need to run a transaction:
pgsql.transaction do |t|
  t.exec('DELETE FROM users WHERE id = $1', [id])
  t.exec('INSERT INTO users (name, phone) VALUES ($1, $2)', [name, phone])
end
To make your PostgreSQL database visible in your unit tests, I would recommend you create a method test_pgsql in your test__helper.rb file (which is required in all unit tests) and implement it like this:
require 'yaml'
require 'minitest/autorun'
require 'pgtk/pool'
module Minitest
  class Test
    def test_pgsql
      @@test_pgsql ||= Pgtk::Pool.new(
        Pgtk::Wire::Yaml.new('target/pgsql-config.yml')
      ).start
    end
  end
end
Logging with Pgtk::Spy
You can also track all SQL queries sent through the pool, with the help of Pgtk::Spy:
require 'pgtk/spy'
pool = Pgtk::Spy.new(pool) do |sql|
  # here, save this "sql" somewhere
end
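The decorator idea behind such a spy can be illustrated without a database. The sketch below is not the Pgtk::Spy source; FakePool and QuerySpy are hypothetical classes that only show how a wrapper can report each query to a block and then delegate to the wrapped pool.

```ruby
# Hypothetical stand-in for a pool; it only echoes the query back.
class FakePool
  def exec(sql, args = [])
    [{ 'sql' => sql, 'args' => args }]
  end
end

# Illustrative decorator: reports each query to the block, then delegates.
class QuerySpy
  def initialize(pool, &block)
    @pool = pool
    @block = block
  end

  def exec(sql, args = [])
    @block.call(sql)
    @pool.exec(sql, args)
  end
end

seen = []
spy = QuerySpy.new(FakePool.new) { |sql| seen << sql }
spy.exec('SELECT 1')
spy.exec('SELECT name FROM users WHERE id = $1', [42])
puts seen
```

Because the wrapper exposes the same exec method as the pool, the rest of the application does not need to know whether it talks to the pool directly or through the spy.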
Query Timeouts with Pgtk::Impatient
To prevent queries from running indefinitely, use Pgtk::Impatient to enforce timeouts on database operations:
require 'pgtk/impatient'
# Wrap the pool with a 5-second timeout for all queries
impatient = Pgtk::Impatient.new(pool, 5)
The impatient decorator ensures queries don't hang your application:
begin
  # This query will be terminated if it takes longer than 5 seconds
  impatient.exec('SELECT * FROM large_table WHERE complex_condition')
rescue Pgtk::Impatient::TooSlow => e
  puts "Query timed out: #{e.message}"
end
You can exclude specific queries from timeout enforcement using regex patterns:
# Don't timeout any SELECT queries or specific maintenance operations
impatient = Pgtk::Impatient.new(pool, 2, /^SELECT/, /^VACUUM/)
Key features:
- Configurable timeout in seconds for each query
- Raises Pgtk::Impatient::TooSlow exception when timeout is exceeded
- Can exclude queries matching specific patterns from timeout checks
- Also sets PostgreSQL's statement_timeout for transactions
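The first three points can be sketched with Ruby's standard Timeout module. This is an illustration only, not the Pgtk::Impatient implementation: SlowPool and TimeLimited are hypothetical names, the slow query is simulated with sleep, and the sketch does not set statement_timeout on the server side.

```ruby
require 'timeout'

# Hypothetical pool that simulates a slow query with sleep.
class SlowPool
  def exec(sql, _args = [])
    sleep(0.5) if sql.include?('pg_sleep')
    :done
  end
end

# Illustrative decorator: enforces a time limit unless the query
# matches one of the exemption patterns.
class TimeLimited
  class TooSlow < StandardError; end

  def initialize(pool, seconds, *exempt)
    @pool = pool
    @seconds = seconds
    @exempt = exempt
  end

  def exec(sql, args = [])
    return @pool.exec(sql, args) if @exempt.any? { |re| re.match?(sql) }
    Timeout.timeout(@seconds) { @pool.exec(sql, args) }
  rescue Timeout::Error
    raise TooSlow, "Query exceeded #{@seconds}s: #{sql}"
  end
end

limited = TimeLimited.new(SlowPool.new, 0.1, /^VACUUM/)
limited.exec('SELECT 1')
begin
  limited.exec('SELECT pg_sleep(1)')
rescue TimeLimited::TooSlow => e
  puts "Query timed out: #{e.message}"
end
```

A client-side timer like this only stops waiting for the answer, which is presumably why the real decorator also relies on statement_timeout inside transactions.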
Query Caching with Pgtk::Stash
For applications with frequent read queries, you can use Pgtk::Stash to add a caching layer:
require 'pgtk/stash'
stash = Pgtk::Stash.new(pgsql)
Stash automatically caches read queries and invalidates the cache when tables are modified:
# First execution runs the query against the database
result1 = stash.exec('SELECT * FROM users WHERE id = $1', [123])
# Second execution with the same query and parameters returns cached result
result2 = stash.exec('SELECT * FROM users WHERE id = $1', [123])
# This modifies the 'users' table, invalidating any cached queries for that table
stash.exec('UPDATE users SET name = $1 WHERE id = $2', ['John', 123])
# This will execute against the database again since cache was invalidated
result3 = stash.exec('SELECT * FROM users WHERE id = $1', [123])
Note that the caching implementation is basic and only suitable for simple queries:
- Queries must reference tables (using FROM or JOIN)
- Cache is invalidated by table, not by specific rows
- Write operations (INSERT, UPDATE, DELETE) bypass the cache and invalidate all cached queries for affected tables
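The table-level invalidation described in this list can be sketched as follows. This is a simplified illustration under stated assumptions, not the Pgtk::Stash source: CountingPool and TableCache are hypothetical classes, and table names are extracted with a naive regular expression that would not survive complex SQL.

```ruby
# Hypothetical pool that counts how many queries actually reach it.
class CountingPool
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def exec(sql, args = [])
    @calls += 1
    [{ 'sql' => sql, 'args' => args }]
  end
end

# Illustrative cache: reads are stored per query and parameters,
# writes drop every cached read that touches the same table.
class TableCache
  TABLES = /\b(?:FROM|JOIN|INTO|UPDATE)\s+([a-z_][a-z0-9_]*)/i.freeze

  def initialize(pool)
    @pool = pool
    @cache = {}
  end

  def exec(sql, args = [])
    tables = sql.scan(TABLES).flatten.map(&:downcase)
    if sql.match?(/\A\s*SELECT\b/i)
      # Queries that reference no table are never cached.
      return @pool.exec(sql, args) if tables.empty?
      key = [sql, args]
      @cache[key] ||= { tables: tables, rows: @pool.exec(sql, args) }
      @cache[key][:rows]
    else
      @cache.delete_if { |_, entry| (entry[:tables] & tables).any? }
      @pool.exec(sql, args)
    end
  end
end

pool = CountingPool.new
cache = TableCache.new(pool)
cache.exec('SELECT * FROM users WHERE id = $1', [123])
cache.exec('SELECT * FROM users WHERE id = $1', [123])
cache.exec('UPDATE users SET name = $1 WHERE id = $2', ['John', 123])
cache.exec('SELECT * FROM users WHERE id = $1', [123])
puts "Queries that reached the database: #{pool.calls}"
```

The second SELECT is served from memory, while the UPDATE evicts it, so the last SELECT goes to the database again.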
Automatic Retries with Pgtk::Retry
For resilient database operations, Pgtk::Retry provides automatic retry functionality for failed SELECT queries:
require 'pgtk/retry'
# Wrap the pool with retry functionality (default: 3 attempts)
retry_pool = Pgtk::Retry.new(pgsql)
# Or specify custom number of attempts
retry_pool = Pgtk::Retry.new(pgsql, attempts: 5)
The retry decorator automatically retries SELECT queries that fail due to transient errors (network issues, connection problems, etc.):
# This SELECT will be retried up to 3 times if it fails
users = retry_pool.exec('SELECT * FROM users WHERE active = true')
# Non-SELECT queries are NOT retried to prevent duplicate writes
retry_pool.exec('INSERT INTO logs (message) VALUES ($1)', ['User logged in'])
Key features:
- Only SELECT queries are retried (to prevent duplicate data modifications)
- Retries happen immediately without delay
- The original error is raised after all retry attempts are exhausted
- Works seamlessly with other decorators like Pgtk::Spy and Pgtk::Impatient
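The first three points can be illustrated with a small decorator. Again, this is a sketch and not the Pgtk::Retry source: FlakyPool and SelectRetry are hypothetical classes, and the sketch retries on any StandardError, which may be broader than what the gem treats as a transient error.

```ruby
# Hypothetical pool that fails a fixed number of times before succeeding.
class FlakyPool
  attr_reader :calls

  def initialize(failures)
    @failures = failures
    @calls = 0
  end

  def exec(_sql, _args = [])
    @calls += 1
    raise IOError, 'connection lost' if @calls <= @failures
    :ok
  end
end

# Illustrative decorator: retries SELECT queries immediately and
# passes everything else through untouched.
class SelectRetry
  def initialize(pool, attempts: 3)
    @pool = pool
    @attempts = attempts
  end

  def exec(sql, args = [])
    return @pool.exec(sql, args) unless sql.match?(/\A\s*SELECT\b/i)
    attempt = 0
    begin
      attempt += 1
      @pool.exec(sql, args)
    rescue StandardError
      retry if attempt < @attempts
      # All attempts are used up: re-raise the original error.
      raise
    end
  end
end

flaky = FlakyPool.new(2)
result = SelectRetry.new(flaky).exec('SELECT * FROM users WHERE active = true')
puts "Result #{result} after #{flaky.calls} attempts"
```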
Some Examples
This library works in netbout.com, wts.zold.io, mailanes.com, and 0rsk.com.
They are all open source, so you can see how they use pgtk.
How to contribute
Read these guidelines. Make sure your build is green before you contribute your pull request. You will need to have Ruby 2.3+ and Bundler installed. Then:
bundle update
bundle exec rake
If it's clean and you don't see any error messages, submit your pull request.
To run a single test, do this:
bundle exec ruby test/test_pool.rb -n test_basic