Switching between multiple databases in Rails without breaking transactions
This is a tricky problem, because of tight coupling inside ActiveRecord
, but I've managed to create some proof of concept that works. Or at least it looks like it works.
Some background
ActiveRecord
uses a ActiveRecord::ConnectionAdapters::ConnectionHandler
class that is responsible for storing connection pools per model. By default there is only one connection pool for all models, because usual Rails app is connected to one database.
After executing establish_connection
for different database in particular model, new connection pool is created for that model. And also for all models that may inherit from it.
Before executing any query, ActiveRecord
first retrieves connection pool for relevant model and then retrieves the connection from the pool.
Note that above explanation may not be 100% accurate, but it should be close.
Solution
So the idea is to replace the default connection handler with custom one that will return connection pool based on provided shard description.
This can be implemented in many different ways. I did it by creating the proxy object that is passing shard names as disguised ActiveRecord
classes. Connection handler is expecting to get AR model and looks at name
property and also at superclass
to walk the hierarchy chain of model. I've implemented DatabaseModel
class that is basically shard name, but it is behaving like AR model.
Implementation
Here is example implementation. I've used sqlite database for simplicity, you can just run this file without any setup. You can also take a look at this gist
# Define some required dependenciesrequire "bundler/inline"gemfile(false) do source "https://rubygems.org" gem "activerecord", "~> 4.2.8" gem "sqlite3"endrequire "active_record"class User < ActiveRecord::BaseendDatabaseModel = Struct.new(:name) do def superclass ActiveRecord::Base endend# Setup database connections and create databases if not presentconnection_handler = ActiveRecord::ConnectionAdapters::ConnectionHandler.newresolver = ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new({ "users_shard_1" => { adapter: "sqlite3", database: "users_shard_1.sqlite3" }, "users_shard_2" => { adapter: "sqlite3", database: "users_shard_2.sqlite3" }})databases = %w{users_shard_1 users_shard_2}databases.each do |database| filename = "#{database}.sqlite3" ActiveRecord::Base.establish_connection({ adapter: "sqlite3", database: filename }) spec = resolver.spec(database.to_sym) connection_handler.establish_connection(DatabaseModel.new(database), spec) next if File.exists?(filename) ActiveRecord::Schema.define(version: 1) do create_table :users do |t| t.string :name t.string :email end endend# Create custom connection handlerclass ShardHandler def initialize(original_handler) @original_handler = original_handler end def use_database(name) @model= DatabaseModel.new(name) end def retrieve_connection_pool(klass) @original_handler.retrieve_connection_pool(@model) end def retrieve_connection(klass) pool = retrieve_connection_pool(klass) raise ConnectionNotEstablished, "No connection pool for #{klass}" unless pool conn = pool.connection raise ConnectionNotEstablished, "No connection for #{klass} in connection pool" unless conn puts "Using database \"#{conn.instance_variable_get("@config")[:database]}\" (##{conn.object_id})" conn endendUser.connection_handler = ShardHandler.new(connection_handler)User.connection_handler.use_database("users_shard_1")User.create(name: "John Doe", email: "john.doe@example.org")puts User.countUser.connection_handler.use_database("users_shard_2")User.create(name: "Jane Doe", email: "jane.doe@example.org")puts User.countUser.connection_handler.use_database("users_shard_1")puts User.count
I think this should give an idea how to implement production ready solution. I hope I didn't miss anything obvious here. I can suggest couple of different approaches:
- Subclass
ActiveRecord::ConnectionAdapters::ConnectionHandler
and overwrite those methods responsible for retrieving connection pools - Create completely new class implementing the same api as
ConnectionHandler
- I guess it is also possible to just overwrite
retrieve_connection
method. I don't remember where it is defined, but I think it is inActiveRecord::Core
.
I think approaches 1 and 2 are the way to go and should cover all cases when working with databases.