How to download a file over HTTP?


One more, using urlretrieve:

    import urllib
    urllib.urlretrieve("http://www.example.com/songs/mp3.mp3", "mp3.mp3")

(For Python 3, use import urllib.request and call urllib.request.urlretrieve instead.)
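For Python 3, the same call might look like the sketch below. The helper name `download` is only illustrative and not part of any library; the URL and file name in the commented-out call are the ones from the example above. Note that the Python documentation lists `urlretrieve` as a legacy interface that may be deprecated at some point.

```python
import urllib.request


def download(url, file_name):
    """Save the resource at `url` to `file_name` and return the local path."""
    # urlretrieve returns a (local_path, headers) tuple.
    local_path, headers = urllib.request.urlretrieve(url, file_name)
    return local_path


# Example call (performs a real network request, so it is left commented out):
# download("http://www.example.com/songs/mp3.mp3", "mp3.mp3")
```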

Yet another one, with a "progress bar":

    import urllib2

    url = "http://download.thinkbroadband.com/10MB.zip"
    file_name = url.split('/')[-1]
    u = urllib2.urlopen(url)
    f = open(file_name, 'wb')
    meta = u.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading: %s Bytes: %s" % (file_name, file_size)

    file_size_dl = 0
    block_sz = 8192
    while True:
        buffer = u.read(block_sz)
        if not buffer:
            break

        file_size_dl += len(buffer)
        f.write(buffer)
        status = r"%10d  [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
        status = status + chr(8)*(len(status)+1)
        print status,

    f.close()
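The snippet above is Python 2 (urllib2 and the print statement). A rough Python 3 equivalent of the same block-by-block loop could look like this. It is a sketch, not a drop-in replacement: the function name `download_with_progress` and its parameters are made up for illustration, and it assumes the server may omit Content-Length, in which case it only prints the byte count.

```python
import sys
import urllib.request


def download_with_progress(url, file_name, block_size=8192):
    """Stream `url` to `file_name` in blocks, printing progress.

    Returns the number of bytes written.
    """
    with urllib.request.urlopen(url) as response, open(file_name, "wb") as out:
        # Content-Length is optional, so fall back to a plain byte count.
        length = response.headers.get("Content-Length")
        total = int(length) if length is not None else None
        downloaded = 0
        while True:
            block = response.read(block_size)
            if not block:
                break
            out.write(block)
            downloaded += len(block)
            if total:
                status = "%10d  [%6.2f%%]" % (downloaded, downloaded * 100.0 / total)
            else:
                status = "%10d bytes" % downloaded
            # "\r" moves back to the start of the line so the status overwrites itself.
            sys.stdout.write("\r" + status)
            sys.stdout.flush()
    sys.stdout.write("\n")
    return downloaded
```

Calling `download_with_progress(url, url.split('/')[-1])` would mirror the original, which derives the file name from the last path segment of the URL.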


Use urllib.request.urlopen():

    import urllib.request
    with urllib.request.urlopen('http://www.example.com/') as f:
        html = f.read().decode('utf-8')

This is the most basic way to use the library, minus any error handling. You can also do more complex stuff such as changing headers.
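To make the "error handling" and "changing headers" part concrete, here is a minimal sketch using only the standard library. The function name `fetch`, the User-Agent string, and the timeout value are assumptions chosen for illustration; swallowing errors and returning None is one possible policy, not the only sensible one.

```python
import urllib.error
import urllib.request


def fetch(url, user_agent="example-downloader/0.1", timeout=10):
    """Return the response body as bytes, or None if the request fails."""
    # A Request object lets you attach custom headers before opening the URL.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        print("HTTP error %s for %s" % (err.code, url))
    except urllib.error.URLError as err:
        # The request never got a proper answer (DNS failure, refused connection, ...).
        print("Failed to reach %s: %s" % (url, err.reason))
    return None
```

HTTPError is a subclass of URLError, so it has to be caught first.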

On Python 2, the method is in urllib2:

    import urllib2
    response = urllib2.urlopen('http://www.example.com/')
    html = response.read()


In 2012, use the Python requests library:

    >>> import requests
    >>>
    >>> url = "http://download.thinkbroadband.com/10MB.zip"
    >>> r = requests.get(url)
    >>> print len(r.content)
    10485760

You can run pip install requests to get it.

Requests has many advantages over the alternatives because the API is much simpler. This is especially true if you have to do authentication. urllib and urllib2 are pretty unintuitive and painful in this case.
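As an illustration of the authentication point, a download behind HTTP Basic auth might be sketched like this with requests (Python 3). The function name `download_with_auth`, the chunk size, and the timeout are assumptions for the example; `stream=True` is used so the body is written to disk in chunks instead of being held in memory.

```python
import requests


def download_with_auth(url, file_name, username, password, chunk_size=8192):
    """Download `url` using HTTP Basic auth and save it to `file_name`."""
    # Passing a (username, password) tuple makes requests send Basic credentials.
    with requests.get(url, auth=(username, password), stream=True, timeout=30) as response:
        # Raise requests.HTTPError for 4xx/5xx responses instead of saving an error page.
        response.raise_for_status()
        with open(file_name, "wb") as handle:
            for chunk in response.iter_content(chunk_size=chunk_size):
                handle.write(chunk)
    return file_name
```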


2015-12-30

People have expressed admiration for the progress bar. It's cool, sure. There are several off-the-shelf solutions now, including tqdm:

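    from tqdm import tqdm
    import requests

    url = "http://download.thinkbroadband.com/10MB.zip"
    response = requests.get(url, stream=True)

    with open("10MB", "wb") as handle:
        # iter_content() defaults to 1-byte chunks, which is very slow;
        # ask for larger blocks (the bar then counts chunks, not bytes).
        for data in tqdm(response.iter_content(chunk_size=8192)):
            handle.write(data)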

This is essentially the implementation @kvance described 30 months ago.