Ruby - net/http - following redirects
To follow redirects, you can do something like this (taken from ruby-doc)
Following Redirection
    require 'net/http'
    require 'uri'

    def fetch(uri_str, limit = 10)
      # You should choose a better exception.
      raise ArgumentError, 'HTTP redirect too deep' if limit == 0

      url = URI.parse(uri_str)
      # request_uri keeps the query string and is never empty, unlike path.
      req = Net::HTTP::Get.new(url.request_uri, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
      # Enable TLS only for https URLs; forcing it on breaks plain http requests.
      response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http| http.request(req) }

      case response
      when Net::HTTPSuccess then response
      when Net::HTTPRedirection then fetch(response['location'], limit - 1)
      else
        response.error!
      end
    end

    print fetch('http://www.ruby-lang.org/')
Given a URL that redirects:

    url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fhttpbin.org%2Fredirect-to%3Furl%3Dhttp%3A%2F%2Fexample.org'
A. Net::HTTP
    require 'net/http'

    begin
      response = Net::HTTP.get_response(URI.parse(url))
      # Note: url becomes nil once a response carries no Location header.
      url = response['location']
    end while response.is_a?(Net::HTTPRedirection)
Make sure that you handle the case when there are too many redirects.
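One way to handle that case is to cap the number of requests and raise once the cap is hit. The following is only a sketch: `TooManyRedirects`, `get_following_redirects`, and the `fetcher:` argument are names made up for this example (they are not part of net/http), and the default limit of 5 is arbitrary.

```ruby
require 'net/http'
require 'uri'

# Hypothetical error class for this sketch; net/http does not define it.
class TooManyRedirects < StandardError; end

# Issues at most +limit+ GET requests, following Location headers.
# +fetcher+ is an assumption added so the loop can be tried without a
# network; by default it is Net::HTTP.get_response.
def get_following_redirects(url, limit: 5, fetcher: Net::HTTP.method(:get_response))
  uri = URI.parse(url)

  limit.times do
    response = fetcher.call(uri)
    return response unless response.is_a?(Net::HTTPRedirection)

    location = response['location']
    # Some 3xx responses (e.g. 304 Not Modified) carry no Location header.
    return response if location.nil?

    # Location may be relative, so resolve it against the current URI.
    uri = uri + location
  end

  raise TooManyRedirects, "gave up after #{limit} requests"
end
```

Calling `get_following_redirects(url)` with the `url` from above would return the final response, or raise `TooManyRedirects` if the chain never settles.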
B. OpenURI
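    require 'open-uri'

    # On Ruby 3.0 and later, a bare open(url) no longer fetches URLs; use URI.open.
    URI.open(url).read

OpenURI::OpenRead#open follows redirects by default, but it doesn't limit the number of redirects.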
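If you want to cap the chain yourself with OpenURI, one option is to turn off automatic redirection with the `redirect: false` option and follow each hop by hand; open-uri then raises `OpenURI::HTTPRedirect`, whose `uri` is the redirect target. A minimal sketch, where `open_with_redirect_limit` and `max_redirects:` are hypothetical names chosen for this example:

```ruby
require 'open-uri'

# Hypothetical helper: opens +url+, following at most +max_redirects+
# redirects by hand instead of letting open-uri follow them automatically.
def open_with_redirect_limit(url, max_redirects: 5)
  uri = URI.parse(url)

  # One initial request plus one request per allowed redirect.
  (max_redirects + 1).times do
    begin
      # With redirect: false, open-uri raises instead of following a redirect.
      return uri.open(redirect: false)
    rescue OpenURI::HTTPRedirect => e
      # e.uri is the redirect target taken from the Location header.
      uri = e.uri
    end
  end

  raise "Too many redirects (more than #{max_redirects})"
end
```

With the `url` from above, `open_with_redirect_limit(url).read` would return the body of the final page.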
I wrote another class for this, based on the examples given here. Thank you very much, everybody. I added cookies, parameters, and exceptions, and finally got what I needed: https://gist.github.com/sekrett/7dd4177d6c87cf8265cd
    require 'uri'
    require 'net/http'
    require 'openssl'

    class UrlResolver
      def self.resolve(uri_str, agent = 'curl/7.43.0', max_attempts = 10, timeout = 10)
        attempts = 0
        cookie = nil

        until attempts >= max_attempts
          attempts += 1

          url = URI.parse(uri_str)
          http = Net::HTTP.new(url.host, url.port)
          http.open_timeout = timeout
          http.read_timeout = timeout
          path = url.path
          path = '/' if path == ''
          path += '?' + url.query unless url.query.nil?

          params = { 'User-Agent' => agent, 'Accept' => '*/*' }
          params['Cookie'] = cookie unless cookie.nil?
          request = Net::HTTP::Get.new(path, params)

          if url.instance_of?(URI::HTTPS)
            http.use_ssl = true
            # Caution: VERIFY_NONE disables certificate verification.
            http.verify_mode = OpenSSL::SSL::VERIFY_NONE
          end
          response = http.request(request)

          case response
          when Net::HTTPSuccess
            # Return here so that a success on the last attempt is not
            # reported as too many redirects.
            return uri_str # or response.body
          when Net::HTTPRedirection
            location = response['Location']
            cookie = response['Set-Cookie']
            new_uri = URI.parse(location)
            # Keep uri_str a String for the next URI.parse call.
            uri_str = new_uri.relative? ? (url + location).to_s : new_uri.to_s
          else
            raise 'Unexpected response: ' + response.inspect
          end
        end

        raise 'Too many http redirects'
      end
    end

    puts UrlResolver.resolve('http://www.ruby-lang.org')
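The relative-Location branch in the class above is the part that is easiest to get wrong, so here is a small standalone illustration of how a Location value resolves against the URL that was just requested. `resolve_location` is a hypothetical helper written for this example; it relies only on `URI.join` from the standard library.

```ruby
require 'uri'

# Hypothetical helper: resolves a Location header value against the
# URL of the request that produced it and returns a String.
def resolve_location(base, location)
  URI.join(base, location).to_s
end

# An absolute path replaces the whole path of the base URL.
resolve_location('http://example.org/a/b', '/c')
# => "http://example.org/c"

# A relative path replaces only the last path segment.
resolve_location('http://example.org/a/b', 'c')
# => "http://example.org/a/c"

# An absolute URL is used as-is.
resolve_location('http://example.org/a/b', 'https://example.com/x')
# => "https://example.com/x"
```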