AtomicArrays
AtomicArrays is a lightweight gem that aims to assist ActiveRecord with PostgreSQL array operations by offering a couple simple methods to update arrays in the database and the instance that it is called on. These methods are atomic in nature because they update the arrays in the database without relying on the current object's instantiated arrays.
Installation
Add this line to your application's Gemfile:
gem 'atomic_arrays'
And then execute:
$ bundle
Or install it yourself as:
$ gem install atomic_arrays
Usage
This gem is very simple to use. After bundling the gem, include it in your ActiveRecord-descended class. Example:
class User < ActiveRecord::Base
include AtomicArrays
end
Make sure that you have specified the array field in your migrations. Example:
class CreateUsers < ActiveRecord::Migration
def change
create_table :users, force: true do |t|
t.string :name
t.text :hobbies, array: true, default: [] # This is an array of strings
t.integer :comment_ids, array: true, default: [] # This is an array of ints
end
end
end
This will give you a couple of instance methods used in updating and relating arrays. The first argument of each method is the targeted array column and the second is the value/values.
Appending
Method atomic_append(array_column, value)
will take a single value to append it on to the end of the specified PG array. Example:
user = User.find(1)
# => <#User id: 1, hobbies: ["Basketball", "Racing"]>
user.atomic_append(:hobbies, "Eating")
# => <#User id: 1, hobbies: ["Basketball", "Racing", "Eating"]> # "Eating" was appended to the array in the db.
Removing
Method atomic_remove(array_column, value)
will remove a single value from the specified PG array. It should be noted that the PG array "remove" function removes ALL occurences of that value, therefore this method does as well. Example:
user = User.find(2)
# => <#User id: 2, friend_ids: [12, 34, 89]>
user.atomic_remove(:friend_ids, 12)
# => <#User id: 2, friend_ids: [34, 89]> # 12 was removed from the array in the db.
Concatenation
Method atomic_cat(array_column, value_array)
will concatenate an array of values with the specified PG array. Example:
user = User.find(2)
# => <#User id: 2, friend_ids: [34, 89]>
user.atomic_cat(:friend_ids, [34, 30, 56, 90])
# => <#User id: 2, friend_ids: [34, 89, 34, 30, 56, 90]> # All four values were concatenated with the array in the db.
Relating
Method atomic_relate(array_column, related_class=nil, limit=100)
is a little odd and unorthodox with a relational db. It assists with querying a denormalized database that uses arrays. Let's say your users
table has an array column called blog_ids
and you also have a blogs
table with each row having an id, like normal. Every time a User
creates a blog, you could append that blog's id to your user's blog_ids
column. When relating your user to his/her blogs (one->many
), rather than scanning the blogs
.user_id
column for your user's id, you can use this method to grab all of his/her blogs in a single query, without scanning a table. First, make sure AtomicArrays
is included in both classes, then it'll be ready to go! This method will automatically parse the symbol you pass in the first argument into a class (:blog_ids -> Blog
), unless you pass a class in the second argument atomic_relate(:friend_ids, User) #=> looks in the user table
. Example:
user = User.find(2)
# => <#User id: 2, blog_ids: [4, 16, 74]>
user.atomic_relate(:blog_ids)
# => [
# <#Blog id: 4, body: "This is my blog!">,
# <#Blog id: 16, body: "This is my other blog!">,
# <#Blog id: 74, body: "This is my third blog!">
# ]
This method is extremely performant, especially with large tables because it uses a subquery to grab all of the user's blog_ids
then immediately unnests
the ids to JOIN
them with the primary id key of the blogs
table. The subquery that this method employs has nearly zero overhead on performance. The power of this method really reveals itself with (many->many
) relationships. For instance, let's say each Blog
has many authors and each User
authors many blogs. Instead of having a blog_users
join table, you can potentially just store all of the blogs' user_ids
in one of its columns and the users' blog_ids
on one of their columns. Then you could relate them by using atomic_relate
.
While denormalizing using arrays may sound like an excellent performance prospect, there are a couple downsides. For instance, with the aformentioned (many->many
) relationship, you will not be able to store any other columns normally associated with a join table, such as an updated_at
timestamp. Another downside is that arrays are much harder to query than a join table, even with a GIN index. It should also be noted that PostgreSQL still lacks many features involving arrays, including foreign ids. Arrays should NOT be seen as a direct replacement for (x->many
) tables/keys, but rather a very performant solution if your database NEEDS to be denormalized.
Expound on this gem's assistance with atomicity.
So be it! All methods in this gem share the same first argument. When you pass the array column name as the first argument, such as user.atomic_append(:sports, "Golf")
, it doesn't call the instance's attribute with that name, but rather ignores it, updates the array in the database, then updates the instance's array with the returned columns. What does this mean?
Here's an example of nonatomic arrays. Pretend the code on the left and right are happening at the same time:
user = User.find(2) | user = User.find(2)
# => <#User id: 2, blog_ids: [4, 16]> | # => <#User id: 2, blog_ids: [4, 16]>
user.update({blog_ids: user.blog_ids+=[20]}) | ...
# => <#User id: 2, blog_ids: [4, 16, 20]> | ...
... | user.update({blog_ids: user.blog_ids+=[35]})
... | # => <#User id: 2, blog_ids: [4, 16, 35]>
The same user was being updated on both the left and right, and because the instance on the right side was updated last, it over-wrote the left side's added blog_id
of 20
with its own blog_id
update of 35
.
Here's how this gem works in the same situation.
user = User.find(2) | user = User.find(2)
# => <#User id: 2, blog_ids: [4, 16]> | # => <#User id: 2, blog_ids: [4, 16]>
user.atomic_append(:blog_ids, 20) | ...
# => <#User id: 2, blog_ids: [4, 16, 20]> | ...
... | user.atomic_append(:blog_ids, 35)
... | # => <#User id: 2, blog_ids: [4, 16, 20, 35]>
The user's blog_ids
will now include both 20
and 35
because this gem's methods append the value to the raw data array in the db first, then return the rows and re-hydrate the instance.
Releases
1.0.0
- Initial release.
1.1.0
- Replaced IN
with JOIN
clause for atomic_relate
, providing much better performance with large arrays.
1.1.2
- Made second argument in method atomic_relate
optional.
Etcetera
Apologies for any syntax highlighting or grammar issues above.
There is also a class method that this gem uses internally called execute_and_wrap
. It was heavily influenced by find_by_sql
in ActiveRecord, so thank you to the Rails guys.
This gem is focused on being both lightweight and performance-oriented. The entire gem is only about fifty lines of actual code. I tried to make the API as simple and predictable as possible. It was tested against Ruby-2.1.0, ActiveRecord 4.0.x, and Postgres 9.3. If you are looking to use the JRuby-AR adapter, this gem is very easy to replicate and modify to fit with the JRuby-AR adapter. I tried it with an earlier iteration of this gem and had no problems adapting it, but I have not tested this version of the gem with JRuby.
This gem is especially powerful if your favorite animal is either a Unicorn or a Puma.
If you find any issues or have any suggestions to improve this gem, open an issue!
Contributing
- Fork it ( https://github.com/twincharged/atomic_arrays/fork )
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create a new Pull Request