In Python, how do I use urllib to see if a website is 404 or 200?

In Python, how do I use urllib to see if a website is 404 or 200?


The getcode() method (added in Python 2.6) returns the HTTP status code that was sent with the response, or None if the URL is not an HTTP URL.

    >>> import urllib
    >>> a = urllib.urlopen('http://www.google.com/asdfsf')
    >>> a.getcode()
    404
    >>> a = urllib.urlopen('http://www.google.com/')
    >>> a.getcode()
    200


You can use urllib2 as well:

    import urllib2

    req = urllib2.Request('http://www.python.org/fish.html')
    try:
        resp = urllib2.urlopen(req)
    except urllib2.HTTPError as e:
        if e.code == 404:
            # do something...
            pass
        else:
            # ...
            pass
    except urllib2.URLError as e:
        # Not an HTTP-specific error (e.g. connection refused)
        # ...
        pass
    else:
        # 200
        body = resp.read()

Note that HTTPError is a subclass of URLError, so its except clause has to come before the URLError one. HTTPError stores the HTTP status code in its code attribute.
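To make the subclass relationship concrete, here is a minimal sketch that runs without any network access. It uses the Python 3 module name (urllib.error); in Python 2 the same two classes live in urllib2. The classify() helper and the example URL are made up for illustration and are not part of urllib.

```python
import urllib.error

# HTTPError derives from URLError, so a handler for URLError alone
# would also catch HTTP status errors such as 404.
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))  # True


def classify(exc):
    """Hypothetical helper: re-raise exc and report which handler caught it."""
    try:
        raise exc
    except urllib.error.HTTPError as e:
        # Must be listed first, otherwise the URLError clause would win.
        return 'http error {}'.format(e.code)
    except urllib.error.URLError as e:
        return 'url error: {}'.format(e.reason)


# Build the exceptions by hand instead of contacting a server.
not_found = urllib.error.HTTPError(
    'http://example.invalid/missing', 404, 'Not Found', None, None)
print(classify(not_found))
print(classify(urllib.error.URLError('connection refused')))
```

If the two except clauses were swapped, both calls would land in the URLError branch and the status code would never be inspected.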


For Python 3:

    import urllib.request, urllib.error

    url = 'http://www.google.com/asdfsf'
    try:
        conn = urllib.request.urlopen(url)
    except urllib.error.HTTPError as e:
        # Return code error (e.g. 404, 501, ...)
        # ...
        print('HTTPError: {}'.format(e.code))
    except urllib.error.URLError as e:
        # Not an HTTP-specific error (e.g. connection refused)
        # ...
        print('URLError: {}'.format(e.reason))
    else:
        # 200
        # ...
        print('good')
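If all you need is the number itself, the Python 3 pattern above can be folded into a small function. This is only a sketch: get_status(), its timeout default, and the opener parameter are names invented here for illustration (opener exists so the function can be exercised without a network connection), not part of urllib.

```python
import urllib.error
import urllib.request


def get_status(url, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if no HTTP response arrived.

    opener is a hypothetical hook for testing; by default the real
    urllib.request.urlopen is used.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            # Successful response (redirects, if any, were already followed).
            return resp.status
    except urllib.error.HTTPError as e:
        # The server answered with an error status such as 404 or 500.
        return e.code
    except urllib.error.URLError:
        # No HTTP response at all (DNS failure, connection refused, ...).
        return None
```

A call such as get_status('http://www.google.com/asdfsf') would then return the status code as an integer instead of raising, so the caller can compare it against 200 or 404 directly.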