Design a SaaS application on Rails 6.1 with Horizontal Sharding

Date: 2021-01-16

Rails, the web framework built on Ruby, has just released its latest version, 6.1, which brings many features and enhancements. You can read the official announcement for more details.

I will focus on the Multi-DB improvements: what changed, and how we can leverage Rails' native multi-DB handling to build scalable multi-tenant applications.

Rails 6.0 was the first official Rails version to support multiple databases. From the release notes:

The new multiple database support makes it easy for a single application to connect to, well, multiple databases at the same time! You can either do this because you want to segment certain records into their own databases for scaling or isolation, or because you’re doing read/write splitting with replica databases for performance. Either way, there’s a new, simple API for making that happen without reaching inside the bowels of Active Record. The foundational work for multiple-database support was done by Eileen Uchitelle and Aaron Patterson.

This allowed application developers to define multiple database connections for a single application. Before this, developers had to use one of the many third-party gems for any kind of multi-DB support in Rails. Even though the Ruby/Rails community is very vibrant, third-party gems often come with maintenance overhead: upgrades, breaking changes, bugs, performance issues, and so on.

With Rails 6.0, you could define your database.yml in such a way:

    # config/database.yml
    default: &default
      adapter: sqlite3
      pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
      timeout: 5000

    development:
      primary:
        <<: *default
        database: primary_db
      primary_replica:
        <<: *default
        database: primary_db_replica
        replica: true
      animals:
        <<: *default
        database: animals_db
      animals_replica:
        <<: *default
        database: animals_db_replica
        replica: true

Then define ActiveRecord abstract classes that could connect to these databases:

    # app/models/application_record.rb
    # frozen_string_literal: true

    class ApplicationRecord < ActiveRecord::Base
      # connects_to may only be called on ActiveRecord::Base or an abstract class
      self.abstract_class = true

      connects_to database: { writing: :primary, reading: :primary_replica }
    end

    # app/models/animals_base.rb
    # frozen_string_literal: true

    class AnimalsBase < ApplicationRecord
      self.abstract_class = true

      connects_to database: { writing: :animals, reading: :animals_replica }
    end

    # app/models/user.rb
    # frozen_string_literal: true

    class User < ApplicationRecord
    end

The abstract classes and the models inheriting from them would both now have access to the connected_to method, which can be used to switch to any of the configured database connections:

    # some_controller.rb
    # frozen_string_literal: true

    ApplicationRecord.connected_to(role: :reading) do
      User.do_something_thats_slow
    end

This approach worked great for primary-replica setups, or setups where models had a clear separation, i.e. each model always queried a single database. However, with modern multi-tenant SaaS applications, horizontal sharding is almost a basic necessity. Depending on the tenant that's accessing the application, the application should be able to select which database it queries. How the application shards horizontally is domain-specific and varies from case to case, but how it connects to the underlying databases is something the framework should handle. And with 6.1, it does.

With the Multi-DB improvements released in 6.1, you can now define shard connections for your abstract classes as well. The example above changes as follows:

    # config/database.yml
    default: &default
      adapter: sqlite3
      pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
      timeout: 5000

    development:
      primary:
        <<: *default
        database: primary_db
      primary_replica:
        <<: *default
        database: primary_db_replica
        replica: true
      animals:
        <<: *default
        database: animals_db
      animals_replica:
        <<: *default
        database: animals_db_replica
        replica: true
      animals_shard1:
        <<: *default
        database: animals_db1
      animals_shard1_replica:
        <<: *default
        database: animals_db1_replica
        replica: true

    # app/models/application_record.rb
    # frozen_string_literal: true

    class ApplicationRecord < ActiveRecord::Base
      # connects_to may only be called on ActiveRecord::Base or an abstract class
      self.abstract_class = true

      connects_to database: { writing: :primary, reading: :primary_replica }
    end

    # app/models/animals_base.rb
    # frozen_string_literal: true

    class AnimalsBase < ApplicationRecord
      self.abstract_class = true

      connects_to shards: {
        default: { writing: :animals, reading: :animals_replica },
        shard1: { writing: :animals_shard1, reading: :animals_shard1_replica }
      }
    end

    # app/models/cat.rb
    # frozen_string_literal: true

    class Cat < AnimalsBase
    end

Similar to 6.0, we can then leverage the connected_to method for switching (or establishing) connections to the configured databases:

    # some_controller.rb
    # frozen_string_literal: true

    AnimalsBase.connected_to(shard: :shard1, role: :reading) do
      Cat.all # reads all cats from animals_shard1_replica
    end
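To build intuition for what "switching for the duration of a block" means, here is a toy model in plain Ruby. It is not how Rails implements connected_to; ConnectionContext and its methods are hypothetical names used only for illustration. The sketch keeps a per-thread stack of role/shard pairs and restores the previous pair when the block exits, even if the block raises.

```ruby
# Toy model of block-scoped connection switching.
# NOT Rails' implementation; ConnectionContext is a hypothetical name.
module ConnectionContext
  DEFAULT = { role: :writing, shard: :default }.freeze

  # Each thread gets its own stack, seeded with the default context.
  def self.stack
    Thread.current[:connection_context_stack] ||= [DEFAULT]
  end

  # The context that queries would currently run against.
  def self.current
    stack.last
  end

  # Push a context for the duration of the block, then restore the
  # previous one (the ensure clause runs even when the block raises).
  def self.connected_to(role:, shard:)
    stack.push({ role: role, shard: shard }.freeze)
    yield
  ensure
    stack.pop
  end
end

ConnectionContext.connected_to(role: :reading, shard: :shard1) do
  ConnectionContext.current # the shard1/reading pair is active here
end
ConnectionContext.current   # back to the default/writing pair
```

The point of the stack is that nested blocks unwind in order, so an inner switch never leaks into the code that runs after it.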

One of the most common design patterns for multi-tenant architectures is to associate every tenant with a unique subdomain on your root domain. For example, if your application runs on example.com, a tenant named marvel would access the system at marvel.example.com, and so on.
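As a rough sketch of the subdomain convention described above, the helper below pulls the tenant part out of a request host. The name tenant_subdomain and the ROOT_DOMAIN constant are assumptions made for this example, not part of Rails or of the application described here.

```ruby
# Hypothetical helper: extract the tenant subdomain from a request host.
# ROOT_DOMAIN is an assumed value for illustration.
ROOT_DOMAIN = "example.com"

def tenant_subdomain(host, root_domain: ROOT_DOMAIN)
  # Normalise case and drop an optional port ("marvel.example.com:3000").
  hostname = host.to_s.downcase.sub(/:\d+\z/, "")
  suffix = ".#{root_domain}"
  return nil unless hostname.end_with?(suffix)

  subdomain = hostname.delete_suffix(suffix)
  # Reject empty or nested subdomains such as "a.b.example.com".
  subdomain.empty? || subdomain.include?(".") ? nil : subdomain
end

tenant_subdomain("marvel.example.com") # => "marvel"
tenant_subdomain("example.com")        # => nil
```

Matching on the ".example.com" suffix rather than on "example.com" keeps look-alike hosts such as evil-example.com from being treated as tenants.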

This pattern has its own advantages (easier and faster DNS resolution when running on a multi-pod setup) and disadvantages (DNS updates for every tenant creation). Instead of debating those, we will delve into how to implement this architecture in a Rails application using the multi-database and horizontal sharding setup provided by Rails 6.0/6.1.

To begin with, we will need a Tenant model. Since your tenants will be identified by subdomains, it makes sense to have a subdomain column in the table along with other application-required attributes. Each tenant belongs to a Shard, and all data of that tenant would reside on that shard. So we will need a Shard model as well.

We can begin by setting up the required database configurations first:

    # config/database.yml
    default: &default
      adapter: sqlite3
      pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
      timeout: 5000

    development:
      default:
        <<: *default
        database: primary_db
      default_replica:
        <<: *default
        database: primary_db_replica
        replica: true
      shard1:
        <<: *default
        database: shard1_db
      shard1_replica:
        <<: *default
        database: shard1_db_replica
        replica: true

We will then define the required models accordingly:

    # app/models/application_record.rb
    # frozen_string_literal: true

    class ApplicationRecord < ActiveRecord::Base
      self.abstract_class = true

      db_keys = Rails.application.config.database_configuration[Rails.env].keys

      # Group the configuration keys by shard:
      #   key = default,         db_key = default, role = writing
      #   key = default_replica, db_key = default, role = reading
      db_configs = db_keys.each_with_object({}) do |key, configs|
        db_key = key.gsub('_replica', '')
        role = key.eql?(db_key) ? :writing : :reading
        db_key = db_key.to_sym
        configs[db_key] ||= {}
        configs[db_key][role] = key.to_sym
      end

      # connects_to shards: {
      #   default: { writing: :default, reading: :default_replica },
      #   shard1: { writing: :shard1, reading: :shard1_replica }
      # }
      connects_to shards: db_configs
    end

    # app/models/global_record.rb
    # frozen_string_literal: true

    class GlobalRecord < ActiveRecord::Base
      self.abstract_class = true

      connects_to database: { writing: :default, reading: :default_replica }
    end

    # app/models/tenant.rb
    # frozen_string_literal: true

    class Tenant < ApplicationRecord
      include ActsAsCurrent

      validates :subdomain, format: { with: DOMAIN_REGEX }
      # other DSL

      after_commit :set_shard, on: :create

      private

      # Register the tenant's shard mapping in the global database.
      def set_shard
        Shard.create!(tenant_id: id, domain: subdomain)
      end
    end

    # app/models/shard.rb
    # frozen_string_literal: true

    class Shard < GlobalRecord
      include ActsAsCurrent

      validates :domain, format: { with: DOMAIN_REGEX }
      validates :tenant_id, presence: true

      before_create :set_current_shard

      private

      # New tenants are placed on the shard currently configured for sign-ups.
      def set_current_shard
        self.shard = APP_CONFIGS[:current_shard] # shard1
      end
    end
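The key-grouping step inside ApplicationRecord is easy to check outside Rails. Below is a standalone sketch of the same idea; build_shard_map is a hypothetical name, and it uses delete_suffix instead of gsub so that only a trailing "_replica" is stripped. It assumes, as the configuration above does, that every replica entry is named after its primary with a "_replica" suffix.

```ruby
# Standalone version of the grouping logic; build_shard_map is a
# hypothetical name. Assumes replicas are named "<primary>_replica".
def build_shard_map(config_keys)
  config_keys.each_with_object({}) do |key, shards|
    shard_name = key.to_s.delete_suffix("_replica")
    # A key that had no suffix to strip is the primary (writing) entry.
    role = key.to_s == shard_name ? :writing : :reading
    (shards[shard_name.to_sym] ||= {})[role] = key.to_sym
  end
end

build_shard_map(%w[default default_replica shard1 shard1_replica])
# => { default: { writing: :default, reading: :default_replica },
#      shard1:  { writing: :shard1,  reading: :shard1_replica } }
```

Deriving the map from the configuration keys means that adding a shard becomes a database.yml change rather than a model change.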

With multi-tenant architectures, there will always be a global context and a tenant-specific context. We isolate such models through the abstract classes ApplicationRecord and GlobalRecord. They also take care of abstracting database connections and setting up the required isolation.

We can also leverage the BelongsToTenant pattern for all models that belong to a tenant and inherit from ApplicationRecord.

All ActiveRecord inherited models connect by default to a default shard and a writing role unless connected_to another connection. Hence, when connecting to GlobalRecord inherited models, we will not require any explicit connection handling.

We can also define a proxy class to abstract out all application specific connection handling logic:

    # app/proxies/database_proxy.rb
    # frozen_string_literal: true

    class DatabaseProxy
      class << self
        def on_shard(shard:, &block)
          _connect_to_(role: :writing, shard: shard, &block)
        end

        def on_replica(shard:, &block)
          _connect_to_(role: :reading, shard: shard, &block)
        end

        def on_global_replica(&block)
          _connect_to_(klass: GlobalRecord, role: :reading, &block)
        end

        # for regular executions, since Global only connects to default shard,
        # no explicit connection switching is required.
        # def on_global(&block)
        #   _connect_to_(klass: GlobalRecord, role: :writing, &block)
        # end

        private

        def _connect_to_(klass: ApplicationRecord, role: :writing, shard: :default, &block)
          klass.connected_to(role: role, shard: shard) do
            block.call
          end
        end
      end
    end

With this setup in place, we can now write both application and background middleware that handle shard selection and tenant isolation on a per-request or per-job basis.

    # lib/middlewares/multitenancy.rb
    # frozen_string_literal: true

    module Middlewares
      # Selects the tenant based on the request's subdomain.
      # DatabaseProxy lives under app/ and is autoloaded when first referenced,
      # so it does not need an explicit require here.
      class Multitenancy
        def initialize(app)
          @app = app
        end

        def call(env)
          domain = env['HTTP_HOST']
          shard = Shard.find_by(domain: domain)
          # Unknown host: continue without a tenant context.
          return @app.call(env) unless shard

          shard.make_current
          DatabaseProxy.on_shard(shard: shard.shard) do
            tenant = Tenant.find_by(subdomain: domain)
            tenant&.make_current
            @app.call(env)
          end
        end
      end
    end

    # config/application.rb
    require_relative '../lib/middlewares/multitenancy'

    # inside the Application class:
    config.middleware.insert_after Rails::Rack::Logger, Middlewares::Multitenancy
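The background side can follow the same shape. The sketch below assumes a Sidekiq-style server middleware, which receives the worker, the job payload and the queue name and yields to run the job. The "shard" payload key, the ShardedJobMiddleware class and the injected switcher are all assumptions made for illustration; the payload key would have to be written when the job is enqueued.

```ruby
# Sketch of a Sidekiq-style server middleware that runs each job on the
# shard named in its payload. The "shard" key and the switcher are
# hypothetical; nothing here is provided by Rails or Sidekiq.
class ShardedJobMiddleware
  # switcher: any callable taking a shard name and a block, for example
  #   ->(shard, &blk) { DatabaseProxy.on_shard(shard: shard, &blk) }
  def initialize(switcher:)
    @switcher = switcher
  end

  # Server middleware is called with (worker, job, queue) and a block
  # that performs the job.
  def call(_worker, job, _queue, &block)
    shard = (job["shard"] || :default).to_sym
    @switcher.call(shard, &block)
  end
end

# Usage with a stand-in switcher that only records the chosen shard:
seen = []
middleware = ShardedJobMiddleware.new(
  switcher: ->(shard, &blk) { seen << shard; blk.call }
)
middleware.call(nil, { "shard" => "shard1" }, "default") { :job_ran }
seen # => [:shard1]
```

Injecting the switcher keeps the middleware testable without a database; in the application it would simply delegate to DatabaseProxy.on_shard.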

With more widespread adoption, the underlying framework should only get better from here. Native implementations also allow us better flexibility and control over code flows and application design.

Also published on: https://dev.to/ritikesh/multitenant-architecture-on-rails-6-1-27c7

