Python progress bar and downloads


I've just written a super simple (slightly hacky) approach to this for scraping PDFs off a certain site. Note that it only works correctly on Unix-based systems (Linux, macOS), as PowerShell does not handle "\r":

    import sys
    import requests

    link = "http://indy/abcde1245"
    file_name = "download.data"

    with open(file_name, "wb") as f:
        print("Downloading %s" % file_name)
        response = requests.get(link, stream=True)
        total_length = response.headers.get('content-length')

        if total_length is None:  # no content length header
            f.write(response.content)
        else:
            dl = 0  # bytes downloaded so far
            total_length = int(total_length)
            for data in response.iter_content(chunk_size=4096):
                dl += len(data)
                f.write(data)
                done = int(50 * dl / total_length)
                # "\r" moves the cursor back to the start of the line,
                # so each write redraws the bar in place
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()

It uses the requests library so you'll need to install that. This outputs something like the following into your console:

>Downloading download.data

>[=============                            ]

The progress bar is 52 characters wide in the script (2 characters are simply the [] so 50 characters of progress). Each = represents 2% of the download.
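The bar arithmetic described above can be pulled out into a small helper, which makes it easier to check on its own. This is only a sketch: the names `render_bar` and `show_progress` are made up for illustration, and the clamping step is an addition that the original script does not have.

```python
import sys


def render_bar(downloaded, total, width=50):
    """Return a text bar such as '[=====     ]' for downloaded/total bytes."""
    done = int(width * downloaded / total)
    # Clamp so the bar never overflows if more bytes arrive than announced
    # (an assumption added here; the original script does not clamp).
    done = max(0, min(width, done))
    return "[%s%s]" % ("=" * done, " " * (width - done))


def show_progress(downloaded, total, stream=sys.stdout):
    """Redraw the bar in place by returning to the start of the line."""
    stream.write("\r" + render_bar(downloaded, total))
    stream.flush()


if __name__ == "__main__":
    # Simulate a 1000-byte download arriving in 250-byte steps.
    for received in range(0, 1001, 250):
        show_progress(received, 1000)
    sys.stdout.write("\n")
```

With the default width of 50, every `=` still stands for 2% of the download, and the returned string is always 52 characters long including the brackets.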


You can use the 'clint' package (written by the same author as 'requests') to add a simple progress bar to your downloads like this:

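    import requests
    from clint.textui import progress

    # `url` is the address of the file to download (define it before running this)
    r = requests.get(url, stream=True)
    path = '/some/path/for/file.txt'
    with open(path, 'wb') as f:
        # Assumes the server sends a Content-Length header
        total_length = int(r.headers.get('content-length'))
        # Integer division keeps expected_size a whole number on Python 3
        for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length // 1024) + 1):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                f.flush()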

This gives you dynamic output that looks like this:

[################################] 5210/5210 - 00:00:01

It should work on multiple platforms as well! You can also change the bar to dots or a spinner with .dots and .mill instead of .bar.

Enjoy!


Python 3 with TQDM

This is the suggested technique from the TQDM docs.

    import urllib.request
    from tqdm import tqdm


    class DownloadProgressBar(tqdm):
        def update_to(self, b=1, bsize=1, tsize=None):
            # b: blocks transferred so far, bsize: block size in bytes,
            # tsize: total size in bytes
            if tsize is not None:
                self.total = tsize
            # urlretrieve reports a running total, tqdm wants an increment
            self.update(b * bsize - self.n)


    def download_url(url, output_path):
        with DownloadProgressBar(unit='B', unit_scale=True,
                                 miniters=1, desc=url.split('/')[-1]) as t:
            urllib.request.urlretrieve(url, filename=output_path, reporthook=t.update_to)
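If the `b * bsize - self.n` line looks odd, here is a rough sketch of what it does, using only the standard library. `CumulativeCounter` is a hypothetical stand-in for the tqdm bar (it just keeps `n` and `total`), and the block counts and sizes below are made-up values fed in the way `urlretrieve` calls its `reporthook`: blocks so far, block size, total size.

```python
class CumulativeCounter:
    """Hypothetical stand-in for a tqdm bar: it only tracks n and total."""

    def __init__(self):
        self.n = 0          # bytes counted so far
        self.total = None   # expected size, once known

    def update(self, delta):
        # Like tqdm's update(), this takes an increment, not a running total.
        self.n += delta

    def update_to(self, b=1, bsize=1, tsize=None):
        if tsize is not None:
            self.total = tsize
        # b * bsize is the running total reported by the hook;
        # subtracting self.n turns it into the increment since the last call.
        self.update(b * bsize - self.n)


counter = CumulativeCounter()
seen = []
# Simulated hook calls for a 20000-byte file read in 8192-byte blocks.
for blocks in range(4):
    counter.update_to(blocks, 8192, 20000)
    seen.append(counter.n)

print(seen, counter.total)
```

Because the hook reports whole blocks, the running count can end slightly above the total on the last block, so don't be surprised if the bar finishes a little past 100%.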