How would you parse a url in Ruby to get the main domain?

ruby-on-rails ruby parsing url dns

Please note there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered.

This is the reason why the Public Suffix List exists.

I'm the author of PublicSuffix, a Ruby library that decomposes a domain into the different parts.

Here's an example

require 'uri/http'uri = URI.parse("http://toolbar.google.com")domain = PublicSuffix.parse(uri.host)# => "toolbar.google.com"domain.domain# => "google.com"uri = URI.parse("http://www.google.co.uk")domain = PublicSuffix.parse(uri.host)# => "www.google.co.uk"domain.domain# => "google.co.uk"

ruby-on-rails ruby parsing url dns

This should work with pretty much any URL:

# URL always gets parsed twicedef get_host_without_www(url)  url = "http://#{url}" if URI.parse(url).scheme.nil?  host = URI.parse(url).host.downcase  host.start_with?('www.') ? host[4..-1] : hostend

Or:

# Only parses twice if url doesn't start with a schemedef get_host_without_www(url)  uri = URI.parse(url)  uri = URI.parse("http://#{url}") if uri.scheme.nil?  host = uri.host.downcase  host.start_with?('www.') ? host[4..-1] : hostend

You may have to require 'uri'.

ruby-on-rails ruby parsing url dns

Just a short note: to overcome the second parsing of the url from Mischas second example, you could make a string comparison instead of URI.parse.

# Only parses oncedef get_host_without_www(url)  url = "http://#{url}" unless url.start_with?('http')  uri = URI.parse(url)  host = uri.host.downcase  host.start_with?('www.') ? host[4..-1] : hostend

The downside of this approach is, that it is limiting the url to http(s) based urls, which is widely the standard. But if you will use it more general (f.e. for ftp links) you have to adjust accordingly.

CodeHunter

How would you parse a url in Ruby to get the main domain?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last