
Download Returned Zip file from URL


As far as I can tell, the proper way to do this is:

import requests, zipfile, StringIO

r = requests.get(zip_file_url, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()

Of course, you'd want to check that the GET was successful with r.ok.
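For example (a minimal sketch, assuming zip_file_url is defined as in the snippet above), the check could look like this:

import requests

r = requests.get(zip_file_url, stream=True)
if not r.ok:
    raise RuntimeError("Download failed with status %s" % r.status_code)
# Alternatively, let requests raise an HTTPError for 4xx/5xx responses:
r.raise_for_status()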

For Python 3+, substitute the io module for the StringIO module and use BytesIO instead of StringIO. Here are release notes that mention this change.

import requests, zipfile, io

r = requests.get(zip_file_url)
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall("/path/to/destination_directory")


Most people recommend using requests if it is available, and the requests documentation recommends this for downloading and saving raw data from a URL:

import requests

def download_url(url, save_path, chunk_size=128):
    r = requests.get(url, stream=True)
    with open(save_path, 'wb') as fd:
        for chunk in r.iter_content(chunk_size=chunk_size):
            fd.write(chunk)

Since the question asks about downloading and saving the zip file, I haven't gone into details regarding reading the zip file. See one of the many answers below for possibilities.
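For completeness, here is a minimal sketch (the file and directory names are just placeholders) of reading a saved archive with the standard zipfile module:

import zipfile

# Open a previously downloaded zip file, list its contents, and extract it.
with zipfile.ZipFile("downloaded.zip") as archive:
    print(archive.namelist())               # names of the archived files
    archive.extractall("destination_dir")   # extract everything to a directory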

If for some reason you don't have access to requests, you can use urllib.request instead. It may not be quite as robust as the above.

import urllib.request

def download_url(url, save_path):
    with urllib.request.urlopen(url) as dl_file:
        with open(save_path, 'wb') as out_file:
            out_file.write(dl_file.read())

Finally, if you are still using Python 2, you can use urllib2.urlopen.

import urllib2
from contextlib import closing

def download_url(url, save_path):
    with closing(urllib2.urlopen(url)) as dl_file:
        with open(save_path, 'wb') as out_file:
            out_file.write(dl_file.read())


With the help of this blog post, I've got it working with just requests. The point of stream=True is that we don't have to read the whole response into memory at once via .content, which would clog memory on large downloads. Streaming avoids this by iterating through the data one chunk at a time.

import requests

url = 'https://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_02_tract_500k.zip'
target_path = 'alaska.zip'

response = requests.get(url, stream=True)
handle = open(target_path, "wb")
for chunk in response.iter_content(chunk_size=512):
    if chunk:  # filter out keep-alive new chunks
        handle.write(chunk)
handle.close()
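As a follow-up sketch (not the original code above), the same streamed download can be written with with blocks so the file is closed even if an error occurs; recent versions of requests also allow the response object itself to be used as a context manager:

import requests

url = 'https://www2.census.gov/geo/tiger/GENZ2017/shp/cb_2017_02_tract_500k.zip'
target_path = 'alaska.zip'

with requests.get(url, stream=True) as response:
    response.raise_for_status()  # fail early on HTTP errors
    with open(target_path, 'wb') as handle:
        for chunk in response.iter_content(chunk_size=512):
            if chunk:  # filter out keep-alive chunks
                handle.write(chunk)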