
db:schema:load vs db:migrate with capistrano


Why use db:schema:load

I find that my own migrations eventually do some shuffling of data (suppose I combine first_name and last_name columns into a full_name column, for instance). As soon as I do any of this, I start using ActiveRecord to sift through database records, and my models eventually make assumptions about certain columns. My "Person" table, for instance, was later given a "position" column by which people are sorted. Earlier migrations now fail to select data when run from scratch, because the "position" column doesn't exist yet at the point where they run.
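
To make the failure mode concrete, here is a toy simulation in plain Ruby, with no Rails involved. Everything in it is hypothetical: the "database" is just a hash of table names to column lists, `select_people` stands in for a model query that sorts by position, and `MIGRATIONS` and `FINAL_SCHEMA` are invented for illustration. It is only a sketch of the idea, not how ActiveRecord works internally.

```ruby
# Toy "database": a hash of table name => list of column names.

# Hypothetical stand-in for a model query in today's code, which
# assumes the people table already has a position column.
def select_people(db)
  columns = db.fetch(:people)
  raise KeyError, 'no such column: people.position' unless columns.include?(:position)
  columns
end

# Hypothetical migration history, oldest first.
MIGRATIONS = [
  ->(db) { db[:people] = [:id, :first_name, :last_name] },
  # Data-shuffling migration: it goes through the model, so it
  # depends on the current code rather than the code of its day.
  ->(db) { select_people(db); db[:people] = [:id, :full_name] },
  ->(db) { db[:people] << :position }
].freeze

# What a schema dump would describe: only the end state.
FINAL_SCHEMA = { people: [:id, :full_name, :position] }.freeze

# Analogous to db:migrate on an empty database.
def migrate_from_scratch
  db = {}
  MIGRATIONS.each { |migration| migration.call(db) }
  db
end

# Analogous to db:schema:load: jump straight to the end state.
def load_schema
  FINAL_SCHEMA.transform_values(&:dup)
end
```

In this sketch `migrate_from_scratch` raises on the second migration, because the position column is only added by the third, while `load_schema` never replays the history and so never hits the problem.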

How to change the default behavior in Capistrano

In conclusion, I believe deploy:cold should use db:schema:load instead of db:migrate. I solved this problem by changing the middle step that Capistrano performs on a cold deploy. For Capistrano v2.5.9, the default task in the library code looks like this:

    namespace :deploy do
      ...
      task :cold do
        update
        migrate  # This step performs `rake db:migrate`.
        start
      end
      ...
    end

I overrode the task in my deploy.rb as follows.

    namespace :deploy do
      task :cold do       # Overriding the default deploy:cold
        update
        load_schema       # My own step, replacing migrations.
        start
      end

      task :load_schema, :roles => :app do
        run "cd #{current_path}; rake db:schema:load"
      end
    end


Climbing up on the shoulders of Andres Jaan Tack, Adam Spiers, and Kamiel Wanrooij, I've built the following task to override deploy:cold.

    task :cold do
      transaction do
        update
        setup_db  # replacing migrate in original
        start
      end
    end

    task :setup_db, :roles => :app do
      prompt = "About to `rake db:setup`. Are you sure to wipe the entire database (anything other than 'yes' aborts):"
      raise RuntimeError.new('db:setup aborted!') unless Capistrano::CLI.ui.ask(prompt) == 'yes'
      run "cd #{current_path}; bundle exec rake db:setup RAILS_ENV=#{rails_env}"
    end

My enhancements here are:

  • wrapping it in transaction do, so that Capistrano will do a proper rollback after aborting.
  • running db:setup instead of db:schema:load, so that if the database doesn't already exist, it will be created before loading the schema.
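
The "anything other than 'yes' aborts" guard can also be tried outside Capistrano. Below is a minimal sketch in plain Ruby; `confirm_destructive!` is a hypothetical helper name, not part of Capistrano or HighLine, and it reads from an injectable input stream instead of calling Capistrano::CLI.ui.ask. It strips the trailing newline before comparing, which is an assumption of this sketch.

```ruby
require 'stringio'

# Hypothetical helper, not part of Capistrano: prints the prompt, reads
# one line, and raises unless the reply is exactly "yes".
def confirm_destructive!(prompt, input: $stdin, output: $stdout)
  output.print(prompt, ' ')
  answer = input.gets.to_s.strip   # nil (end of input) becomes ""
  raise 'db:setup aborted!' unless answer == 'yes'
  true
end

# Usage with canned input, so the example runs without a terminal:
confirm_destructive!('Wipe the entire database?',
                     input: StringIO.new("yes\n"),
                     output: StringIO.new)
```

Treating an empty or abbreviated reply ("y", "Yes", end of input) as a refusal keeps the destructive path opt-in, which matches the intent of the task above.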


That's a great answer from Andres Jaan Tack. I just wanted to add a few comments.

Firstly, here's an improved version of Andres' deploy:load_schema task which includes a warning, and more importantly uses bundle exec and RAILS_ENV to ensure that the environment is set up correctly:

    namespace :deploy do
      desc 'Load DB schema - CAUTION: rewrites database!'
      task :load_schema, :roles => :app do
        run "cd #{current_path}; bundle exec rake db:schema:load RAILS_ENV=#{rails_env}"
      end
    end
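
If the same command string is needed in more than one task, it could be built by a small helper. The following is only a sketch: `rake_command` is a hypothetical name, and two choices in it differ from the task above and are my own assumptions, namely joining with `&&` (so rake does not run if the cd fails) and escaping each value with Ruby's standard Shellwords module.

```ruby
require 'shellwords'

# Hypothetical helper: builds the remote shell command for a rake task.
# Uses && rather than ; so that rake is skipped when the cd fails.
def rake_command(app_path, rails_env, task)
  "cd #{Shellwords.escape(app_path)} && " \
    "bundle exec rake #{Shellwords.escape(task)} " \
    "RAILS_ENV=#{Shellwords.escape(rails_env)}"
end

puts rake_command('/var/www/app/current', 'production', 'db:schema:load')
```

Inside a Capistrano 2 task this would presumably be called as run rake_command(current_path, rails_env, 'db:schema:load').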

I have submitted a feature request to have deploy:load_schema implemented in Capistrano. In that request, I noted that the 'db:schema:load vs. db:migrate' debate has already been covered in the Capistrano discussion group. There was some reluctance to switch the deploy:cold task from db:migrate to db:schema:load, since if run unintentionally, the former nukes the entire database whereas the latter would probably complain and bail harmlessly. Nevertheless, db:schema:load is technically the better approach, so if the risk of accidental data loss could be mitigated, it would be worth switching.