Paperclip, Delayed Job, S3, Heroku - design for delayed processing of sensitive uploaded files: db or s3?
Heroku has a timeout of 30 seconds on any server request (learnt the hard way), so storing files on S3 is a must.
Try carrierwave (carrierwave railscasts) instead of paperclip. I prefer the added helpers that come onboard, and there are a number of great plugins that integrate nicely with carrierwave, like carrierwave_direct for uploading large files to S3.
Delayed_job (railscasts - delayed_job) will work nicely for deleting files from s3 and any other background processing that may be required.
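As a rough sketch of what a delayed S3 delete could look like as its own job object: delayed_job can queue any object that responds to perform. The DeleteUploadJob and FakeStorage names are made up for illustration, and the storage accessor is an assumed stand-in for the real S3 client (AWS::S3::S3Object in this answer), so the example runs without credentials.

```ruby
# A plain Ruby job object: delayed_job only requires that it respond to #perform.
# Holding just the key and bucket (both strings) keeps the job easy to serialize.
class DeleteUploadJob < Struct.new(:key, :bucket)
  class << self
    # Assumed stand-in for the S3 client. In the app this would be something
    # that responds to delete(key, bucket), such as AWS::S3::S3Object.
    attr_accessor :storage
  end

  def perform
    self.class.storage.delete(key, bucket)
  end
end

# Hypothetical in-memory storage so the sketch runs without AWS.
class FakeStorage
  attr_reader :objects

  def initialize(objects)
    @objects = objects
  end

  # Returns true if the object existed and was removed, false otherwise.
  def delete(key, bucket)
    !@objects.delete([bucket, key]).nil?
  end
end

DeleteUploadJob.storage = FakeStorage.new([["bucket_name", "42report.pdf"]])
job = DeleteUploadJob.new("42report.pdf", "bucket_name")

# With delayed_job loaded, the job would be queued rather than run inline.
# Recent delayed_job versions accept an options hash, for example:
#   Delayed::Job.enqueue(job, :run_at => 24.hours.from_now)
job.perform
```

Keeping the delete in a job object, rather than in a controller helper, means nothing in the background work depends on the request (no redirect_to, no params).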
My gem file includes the following:
    gem 'delayed_job'
    gem "aws-s3", :require => 'aws/s3'
    gem 'fog'
    gem 'carrierwave'
    gem 'carrierwave_direct'

The fog gem is a nice way to keep all your account info in a single place, and it sets everything up quite nicely. For the AWS gem how-to, good resource.
Here is a sample controller for submitting an upload form (there are definitely better ways of doing this, but it serves for illustrative purposes):
    def create
      @asset = Asset.new(:description => params[:description],
                         :user_id => session[:id],
                         :question_id => @question.id)
      if @asset.save && @asset.update_attributes(:file_name => sanitize_filename(params[:uploadfile].original_filename, @asset.id))
        # Store the upload privately on S3
        AWS::S3::S3Object.store(sanitize_filename(params[:uploadfile].original_filename, @asset.id),
                                params[:uploadfile].read,
                                'bucket_name',
                                :access => :private,
                                :content_type => params[:uploadfile].content_type)
        # NOTE: `object` is never assigned in this snippet; it is presumably the S3 object stored above
        if object.content_length.to_i < @question.emailatt.to_i.megabytes && object.content_length.to_i < 5.megabytes
          url = AWS::S3::S3Object.url_for(sanitize_filename(params[:uploadfile].original_filename, @asset.id), 'bucket_name')
          if @asset.update_attributes(:download_link => 1)
            # Send the notification in 5 minutes, then schedule the S3 delete for 24 hours later
            if Usermailer.delay({:run_at => 5.minutes.from_now}).attachment_user_mailer_download_notification(@asset, @question)
              process_attachment_user_mailer_download(params[:uploadfile], @asset.id, 24.hours.from_now, @question.id)
              flash[:notice] = "Thank you for the upload, we will notify this posts author"
            end
          end
        end
      else
        @asset.destroy
        flash[:notice] = "There was an error in processing your upload, please try again"
        redirect_to(:controller => "questions", :action => "show", :id => @question.id)
      end
    end

    private

    # Prefixes the record id and replaces unsafe characters with underscores
    def sanitize_filename(file_name, id)
      # gsub, with the result kept (the original called sub and discarded the result)
      just_filename = File.basename(file_name).gsub(/[^\w\.\-]/, '_')
      "#{id}#{just_filename}"
    end

    def delete_process(uploadfile, asset_id, time, question_id)
      asset = Asset.find(:first, :conditions => ["id = ?", asset_id])
      if delete_file(uploadfile, asset_id, time) && asset.destroy
        redirect_to(:controller => "questions", :action => "show", :id => question_id)
      end
    end

    def process_attachment_user_mailer_download(uploadfile, asset_id, time, question_id)
      asset = Asset.find(:first, :conditions => ["id = ?", asset_id])
      # Schedule the S3 delete and clear the download flag at the same time
      if delete_file(uploadfile, asset_id, time) && asset.delay({:run_at => time}).update_attributes(:download_link => 0)
        redirect_to(:controller => "questions", :action => "show", :id => question_id)
      end
    end

    # S3 METHODS FOR CREATE ACTION
    # Deletes the uploaded file from S3 at the given time
    def delete_file(uploadfile, asset_id, time)
      AWS::S3::S3Object.delay({:run_at => time}).delete(sanitize_filename(uploadfile.original_filename, asset_id), 'bucket_name')
    end
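The size check in the create action (under the question's limit and under 5 MB) can be pulled into a small helper and run against the upload's byte size. This is only a sketch: the emailable? name and the constants are made up for illustration, and it uses plain arithmetic instead of ActiveSupport's megabytes so it runs on its own.

```ruby
BYTES_PER_MEGABYTE = 1024 * 1024 # same value as ActiveSupport's 1.megabyte
HARD_CAP_MB = 5                  # the 5 MB ceiling used in the controller

# Returns true when the upload is under both the per-question limit (in MB)
# and the hard cap. Inputs go through to_i, mirroring the controller code.
def emailable?(size_in_bytes, question_limit_mb, hard_cap_mb = HARD_CAP_MB)
  limit_bytes = [question_limit_mb.to_i, hard_cap_mb].min * BYTES_PER_MEGABYTE
  size_in_bytes.to_i < limit_bytes
end

puts emailable?(1 * BYTES_PER_MEGABYTE, 2)  # under both limits
puts emailable?(6 * BYTES_PER_MEGABYTE, 10) # over the hard cap
```

In the controller this could be called with the uploaded file's size (assuming the upload object responds to size, as Rails uploaded files do), which would avoid relying on the unassigned object variable.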
Lots of unnecessary code, I know (wrote this when I was starting with Rails). Hopefully it will give some idea of the processes involved in writing this type of app. Hope it helps.
For my part I'm using:
- Delayed Job
- Paperclip
- Delayed Paperclip, which uploads the original file to S3 and creates a delayed job for the custom post-processing. It can add a column to your model stating that the file is being processed.
Only a few lines to set up. And you can do a lot with paperclip interpolations and generators.