Download Returned Zip file from URL
As far as I can tell, the proper way to do this is:
    import requests, zipfile, StringIO

    r = requests.get(zip_file_url, stream=True)
    z = zipfile.ZipFile(StringIO.StringIO(r.content))
    z.extractall()
Of course, you'd want to check that the GET was successful with r.ok.
For Python 3+, substitute the io module for the StringIO module and use BytesIO instead of StringIO; the release notes mention this change.
    import requests, zipfile, io

    r = requests.get(zip_file_url)
    z = zipfile.ZipFile(io.BytesIO(r.content))
    z.extractall("/path/to/destination_directory")
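As a hedged sketch of the same in-memory approach, here it is wrapped in a function (the names fetch_and_extract and dest_dir are mine, not from the snippet above) with a status check added so a failed request raises instead of extracting garbage:

```python
import io
import zipfile

import requests


def fetch_and_extract(zip_file_url, dest_dir):
    # Download the archive into memory, then unpack it without
    # ever writing the .zip itself to disk.
    r = requests.get(zip_file_url)
    r.raise_for_status()  # surface HTTP errors before touching the data
    with zipfile.ZipFile(io.BytesIO(r.content)) as z:
        z.extractall(dest_dir)
        return z.namelist()  # the member paths that were extracted
```

The return value is just a convenience so the caller can see what landed in dest_dir.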
Most people recommend using requests if it is available, and the requests documentation recommends this for downloading and saving raw data from a URL:
    import requests

    def download_url(url, save_path, chunk_size=128):
        r = requests.get(url, stream=True)
        with open(save_path, 'wb') as fd:
            for chunk in r.iter_content(chunk_size=chunk_size):
                fd.write(chunk)
Since the question asks about downloading and saving the zip file, I haven't gone into details regarding reading the zip file. See one of the many answers below for possibilities.
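For completeness, here is a hedged sketch of one such possibility: pairing download_url with the standard-library zipfile module to unpack the saved archive (the URL and paths in the __main__ guard are placeholders, not real endpoints):

```python
import zipfile

import requests


def download_url(url, save_path, chunk_size=128):
    # Stream the response to disk in chunks, as in the snippet above.
    r = requests.get(url, stream=True)
    with open(save_path, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=chunk_size):
            fd.write(chunk)


def extract_zip(zip_path, dest_dir):
    # Open the saved archive and unpack every member into dest_dir.
    with zipfile.ZipFile(zip_path) as z:
        z.extractall(dest_dir)


if __name__ == '__main__':
    # Placeholder URL and paths for illustration only.
    download_url('https://example.com/archive.zip', 'archive.zip')
    extract_zip('archive.zip', 'extracted')
```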
If for some reason you don't have access to requests, you can use urllib.request instead. It may not be quite as robust as the above.
    import urllib.request

    def download_url(url, save_path):
        with urllib.request.urlopen(url) as dl_file:
            with open(save_path, 'wb') as out_file:
                out_file.write(dl_file.read())
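One caveat with this version is that dl_file.read() buffers the whole download in memory before writing it out. If that's a concern for large files, a sketch of a chunked variant using the standard-library shutil.copyfileobj looks like this:

```python
import shutil
import urllib.request


def download_url(url, save_path):
    # copyfileobj streams the response to disk in fixed-size chunks
    # instead of buffering the entire body in memory at once.
    with urllib.request.urlopen(url) as dl_file:
        with open(save_path, 'wb') as out_file:
            shutil.copyfileobj(dl_file, out_file)
```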
Finally, if you are still using Python 2, you can use urllib2.urlopen.
    import urllib2
    from contextlib import closing

    def download_url(url, save_path):
        with closing(urllib2.urlopen(url)) as dl_file:
            with open(save_path, 'wb') as out_file:
                out_file.write(dl_file.read())
With the help of this blog post, I've got it working with just requests. The point of the weird stream thing is so we don't need to call content on large requests, which would require it to all be processed at once, clogging the memory. The stream avoids this by iterating through the data one chunk at a time.
    import requests

    url = 'https://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_02_tract_500k.zip'
    target_path = 'alaska.zip'

    response = requests.get(url, stream=True)
    handle = open(target_path, "wb")
    for chunk in response.iter_content(chunk_size=512):
        if chunk:  # filter out keep-alive new chunks
            handle.write(chunk)
    handle.close()
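The same streaming idea can be written with context managers so the connection and file handle are closed automatically even on errors; this sketch (the function name stream_download is mine) also checks the HTTP status before writing anything:

```python
import requests


def stream_download(url, target_path, chunk_size=512):
    # stream=True defers the body; iter_content yields it chunk by chunk,
    # so the whole file is never held in memory at once.
    with requests.get(url, stream=True) as response:
        response.raise_for_status()  # fail loudly on HTTP errors
        with open(target_path, 'wb') as handle:
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # filter out keep-alive new chunks
                    handle.write(chunk)
```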