Python 3.4 urllib.request error (http 403)


The site appears to reject the default User-Agent that Python 3.x urllib sends (Python-urllib/3.4 on Python 3.4).

Specifying a browser-like User-Agent header should solve your problem:

import urllib.request

req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
html = urllib.request.urlopen(req).read()
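To see what the header override changes, it can help to compare the default User-Agent with the overridden one without touching the network. This is a minimal sketch; the helper name `build_request` and the example URL are illustrative and not part of the original answer:

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# The default opener identifies itself as "Python-urllib/<version>".
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])

# Request stores header names capitalized, so look up "User-agent".
req = build_request("http://www.example.net")
print(req.get_header("User-agent"))
```

Some sites block the "Python-urllib" identifier outright, which is the likely cause of the 403 here.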

NOTE: The Python 2.x urllib version also receives a 403 status, but unlike Python 2.x urllib2 and Python 3.x urllib.request, it does not raise an exception.

You can confirm that with the following code in Python 2.x:

print(urllib.urlopen(url).getcode())  # => 403
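In Python 3.x the same check has to catch the exception, because urllib.request.urlopen raises urllib.error.HTTPError for a 403. A minimal sketch of that, assuming a hypothetical helper named `fetch_status` and a 10-second timeout chosen only for illustration:

```python
import urllib.error
import urllib.request


def fetch_status(url, headers=None):
    """Return the HTTP status code for url, even when urlopen raises."""
    req = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code of the failed response.
        return err.code
```

Against a site that blocks the default User-Agent, `fetch_status(url)` would be expected to return 403, while `fetch_status(url, {"User-Agent": "Mozilla/5.0"})` may return 200.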


Here are some notes I gathered on urllib when I was studying Python 3.
I kept them in case they come in handy or help someone else out.

How to import urllib.request and urllib.parse:

import urllib.request as urlRequest
import urllib.parse as urlParse

How to make a GET request:

url = "http://www.example.net"
# open the url
x = urlRequest.urlopen(url)
# get the source code
sourceCode = x.read()

How to make a POST request:

url = "https://www.example.com"
values = {"q": "python if"}
# encode values for the url
values = urlParse.urlencode(values)
# encode the values in UTF-8 format
values = values.encode("UTF-8")
# create the request (passing data makes it a POST)
targetUrl = urlRequest.Request(url, values)
# open the url
x = urlRequest.urlopen(targetUrl)
# get the source code
sourceCode = x.read()
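The two encoding steps are easy to mix up, so here is a small offline sketch of what each one produces. It only builds the request and never sends it; the variable names `encoded` and `body` are illustrative:

```python
import urllib.parse
import urllib.request

values = {"q": "python if"}

# urlencode builds a form-encoded string; spaces become "+".
encoded = urllib.parse.urlencode(values)
print(encoded)  # q=python+if

# urlopen needs bytes for the request body, hence the extra encode step.
body = encoded.encode("UTF-8")

# Passing data switches the request method from GET to POST.
req = urllib.request.Request("https://www.example.com", body)
print(req.get_method())  # POST
```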

How to make a POST request (403 forbidden responses):

url = "https://www.example.com"
values = {"q": "python urllib"}
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
# encode values for the url
values = urlParse.urlencode(values)
# encode the values in UTF-8 format
values = values.encode("UTF-8")
# create the request with the form data and the custom headers
targetUrl = urlRequest.Request(url=url, data=values, headers=headers)
# open the url
x = urlRequest.urlopen(targetUrl)
# get the source code
sourceCode = x.read()

How to make a GET request (403 forbidden responses):

url = "https://www.example.com"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers=headers)
# open the url
x = urlRequest.urlopen(req)
# get the source code
sourceCode = x.read()
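If many requests need the same header, one option is to set it once on an opener instead of on every Request. This is a sketch of that alternative, not something from the notes above; `make_opener` and the `USER_AGENT` value are assumptions made for illustration:

```python
import urllib.request

# Assumed value for illustration; any browser-like string can be used.
USER_AGENT = "Mozilla/5.0"


def make_opener(user_agent=USER_AGENT):
    """Return an opener that sends user_agent with every request."""
    opener = urllib.request.build_opener()
    # addheaders replaces the default "Python-urllib/<version>" entry.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


opener = make_opener()
# Optional: make plain urllib.request.urlopen() use this opener too.
urllib.request.install_opener(opener)
```

After this, `opener.open(url).read()` should behave like the urlopen calls above, with the header applied automatically.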