Python 2 vs. Python 3 - urllib formats
The code you posted was presumably mangled by a faulty cut-and-paste, because it is clearly wrong in both versions (f.read() fails because there is no barename f defined).
In Py3, ur = response.decode('utf8') works perfectly well for me, as does the json.loads(ur) that follows it. Maybe the faulty copy-and-paste also affected your 2-to-3 conversion attempts.
Depending on your Python version, you have to choose the correct library.
For Python 3.5:
import urllib.request
data = urllib.request.urlopen(url).read().decode('utf8')
For Python 2.7:
import urllib
url = serviceurl + urllib.urlencode({'sensor':'false', 'address': address})
uh = urllib.urlopen(url)
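If the same script has to run under both versions, the two snippets above can be folded into one import block. This is only a minimal sketch: build_url and fetch_text are hypothetical helper names, and it assumes the service returns UTF-8 unless told otherwise.

```python
try:
    # Python 3: urlopen and urlencode live in separate submodules
    from urllib.request import urlopen
    from urllib.parse import urlencode
except ImportError:
    # Python 2: both live directly in the urllib module
    from urllib import urlopen, urlencode


def build_url(serviceurl, address):
    """Append the query string used in the snippets above (hypothetical helper)."""
    return serviceurl + urlencode({'sensor': 'false', 'address': address})


def fetch_text(url, encoding='utf8'):
    """Download url and decode the raw bytes into text.

    The utf8 default is an assumption, not something the server promised.
    """
    return urlopen(url).read().decode(encoding)
```

The try/except on ImportError picks whichever layout the running interpreter provides, so the rest of the script can call urlopen and urlencode without caring about the version.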
Please see that answer in another Unicode-related question.
Now: the Python 3 str type (which was the Python 2 unicode type) is an idealised object, in the sense that it deals with “characters”, not “bytes”. To be written to or read from disk or the network, these characters need to be encoded into, or decoded from, bytes by a “conversion table”, a.k.a. an encoding, a.k.a. a codepage. Because operating systems vary, Python has historically avoided guessing what that encoding should be; this has been changing over the years, but the principle of “In the face of ambiguity, refuse the temptation to guess.” still applies.
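To make the characters-versus-bytes distinction concrete, here is a small Python 3 sketch. The sample word and the variable names are purely illustrative.

```python
text = 'caf\u00e9'                      # a str holding four characters

utf8_bytes = text.encode('utf-8')       # the last character becomes two bytes
latin1_bytes = text.encode('latin-1')   # the same character becomes one byte

# Decoding with the encoding that produced the bytes restores the text
round_trip = utf8_bytes.decode('utf-8')

# Decoding with a different encoding raises no error here,
# but it silently yields different characters
wrong_guess = utf8_bytes.decode('latin-1')

print(len(text), len(utf8_bytes), len(latin1_bytes))
print(round_trip == text, wrong_guess == text)
```

The same four characters map to different byte sequences under different encodings, which is why the bytes alone are not enough to recover the text.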
Thankfully, a web server makes your work easier. Your response above should give you all the extra information needed:
>>> response.headers['content-type']
'application/json; charset=UTF-8'
So, every time you issue a request to a web server, check the Content-Type header of the response for a charset value, and decode the response's data into Unicode (Python 3: bytes.decode(charset) → str) by using that charset.
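That advice could be sketched roughly as follows for Python 3. In Python 3 the headers object of a urlopen response is an http.client.HTTPMessage, a subclass of email.message.Message, so get_content_charset() is available on it. decode_body is a hypothetical helper name, and falling back to UTF-8 when the server declares no charset is an assumption, not a rule. A hand-built Message stands in for response.headers so the example runs offline.

```python
import json
from email.message import Message


def decode_body(raw, headers, default='utf-8'):
    """Decode raw bytes using the charset declared in Content-Type.

    Falls back to `default` (an assumption) when no charset is declared.
    """
    charset = headers.get_content_charset() or default
    return raw.decode(charset)


# Offline stand-in for response.headers and response.read()
headers = Message()
headers['Content-Type'] = 'application/json; charset=UTF-8'
body = '{"city": "Z\u00fcrich"}'.encode('utf-8')

data = json.loads(decode_body(body, headers))
print(data['city'])
```

With a live response, the call would be decode_body(response.read(), response.headers).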