Python check if website exists


You can use a HEAD request instead of GET. It downloads only the headers, not the content, and you can then check the response status from the headers.

For Python 2.7.x, you can use httplib:

    import httplib
    c = httplib.HTTPConnection('www.example.com')
    c.request("HEAD", '')
    if c.getresponse().status == 200:
        print('web site exists')

or urllib2:

    import urllib2
    try:
        urllib2.urlopen('http://www.example.com/some_page')
    except urllib2.HTTPError as e:
        print(e.code)
    except urllib2.URLError as e:
        print(e.args)

or, for 2.7 and 3.x, you can install requests:

    import requests
    response = requests.get('http://www.example.com')
    if response.status_code == 200:
        print('Web site exists')
    else:
        print('Web site does not exist')
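On Python 3, httplib was renamed to http.client, so the HEAD check from the start of this answer can be sketched roughly as below. The function name head_status and its defaults are illustrative choices, not part of any library, and it speaks plain HTTP only (http.client.HTTPSConnection would be needed for HTTPS).

```python
import http.client


def head_status(host, path="/", port=80, timeout=5.0):
    """Send a HEAD request and return the status code, or None on failure.

    head_status is a hypothetical helper name used for illustration.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed response
        return None
    finally:
        conn.close()
```

For example, head_status('www.example.com') == 200 would mirror the httplib check. Returning None for network failures keeps "no answer at all" distinct from "answered with an error status".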


It's better to check that the status code is < 400, as was done here. Here is what the status codes mean (taken from Wikipedia):

  • 1xx - informational
  • 2xx - success
  • 3xx - redirection
  • 4xx - client error
  • 5xx - server error
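The "< 400" rule follows directly from these classes. A small sketch (the helper names status_class and looks_reachable are made up for illustration):

```python
def status_class(code):
    """Map an HTTP status code to the class names from the list above."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # The first digit of the code selects the class
    return classes.get(code // 100, "unknown")


def looks_reachable(code):
    """Treat 1xx, 2xx and 3xx as 'the page exists'; 4xx and 5xx as not."""
    return 100 <= code < 400
```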

If you want to check whether a page exists without downloading the whole page, you should use a HEAD request:

    import httplib2
    h = httplib2.Http()
    resp = h.request("http://www.google.com", 'HEAD')
    assert int(resp[0]['status']) < 400

taken from this answer.

If you want to download the whole page, just make a normal request and check the status code. Example using requests:

    import requests
    response = requests.get('http://google.com')
    assert response.status_code < 400

Hope that helps.


    from urllib2 import Request, urlopen, HTTPError, URLError

    user_agent = 'Mozilla/20.0.1 (compatible; MSIE 5.5; Windows NT)'
    headers = {'User-Agent': user_agent}
    link = "http://www.abc.com/"
    req = Request(link, headers=headers)
    try:
        page_open = urlopen(req)
    except HTTPError as e:
        print e.code
    except URLError as e:
        print e.reason
    else:
        print 'ok'
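For Python 3, where urllib2 was split into urllib.request and urllib.error, roughly the same check might look like this. It is only a sketch: check_url, its default User-Agent string and the choice to send HEAD rather than GET are assumptions for illustration, not part of the snippet above.

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError


def check_url(url, user_agent="Mozilla/5.0 (compatible; example-checker)",
              timeout=5.0):
    """Return the HTTP status code for url, or the failure reason as a string.

    check_url and the default user_agent value are hypothetical.
    """
    req = Request(url, headers={"User-Agent": user_agent}, method="HEAD")
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as e:
        # The server answered, but with an error status
        return e.code
    except URLError as e:
        # No HTTP response at all: DNS failure, refused connection, etc.
        return str(e.reason)
```

HTTPError is a subclass of URLError, so it has to be caught first, just as in the urllib2 version.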

To answer the comment of unutbu:

Because the default handlers handle redirects (codes in the 300 range), and codes in the 100-299 range indicate success, you will usually only see error codes in the 400-599 range. Source
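One way to see this behaviour is to look at the URL you end up at. A rough Python 3 sketch, assuming the default urllib handlers are in place (the function name follow is hypothetical):

```python
from urllib.request import urlopen
from urllib.error import HTTPError


def follow(url, timeout=5.0):
    """Return (status, final_url) once urllib has followed any redirects.

    follow is a hypothetical helper name used for illustration.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            # Redirects were already followed by the default handlers
            return resp.status, resp.geturl()
    except HTTPError as e:
        # Typically a 4xx or 5xx status
        return e.code, None
```

A URL that answers with a 302 pointing at an existing page would come back with status 200 and the redirect target as the final URL, so the 3xx code itself never shows up.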