How to get pdf filename with Python requests?

The filename is specified in the HTTP Content-Disposition header. To extract the name, you would do:

    import re

    # r is the Response object returned by requests.get(...)
    d = r.headers['content-disposition']
    fname = re.findall("filename=(.+)", d)[0]

The name is extracted from the header string with a regular expression (the re module).
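One caveat with the regular expression above: servers often send the value quoted, as in attachment; filename="sample.pdf", and then the captured name keeps the quotes. As a rough sketch of an alternative, the standard library's email.message.Message can parse the header parameters and strip the quoting. The helper name filename_from_content_disposition is made up for this illustration and is not part of requests.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value, or None.

    Hypothetical helper for illustration; it reuses the standard library's
    header parser instead of a hand-written regular expression.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() reads the filename parameter and removes surrounding quotes.
    return msg.get_filename()


# Usage with a requests response r (assumed to exist already):
#     d = r.headers.get("content-disposition")
#     fname = filename_from_content_disposition(d) if d else None
```

The value is passed in as a plain string, so the helper can be tried without making a network request.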

python pdf python-requests filenames

Building on some of the other answers, here's how I do it. If there isn't a Content-Disposition header, I take the filename from the download URL instead:

    import re
    import requests
    from requests.exceptions import RequestException

    url = 'http://www.example.com/downloads/sample.pdf'

    try:
        with requests.get(url) as r:
            fname = ''
            if "Content-Disposition" in r.headers.keys():
                # Prefer the name the server supplies in the header
                fname = re.findall("filename=(.+)", r.headers["Content-Disposition"])[0]
            else:
                # Otherwise fall back to the last segment of the URL
                fname = url.split("/")[-1]
            print(fname)
    except RequestException as e:
        print(e)

There are arguably better ways of parsing the URL string, but for simplicity I didn't want to involve any more libraries.
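For what it's worth, one of those better ways needs nothing outside the standard library. The sketch below uses urllib.parse so that a query string or fragment does not end up in the name, which url.split("/")[-1] would keep, and so that percent-escapes are decoded. The function name filename_from_url is only an illustrative choice.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, without query or fragment.

    Hypothetical helper for illustration only.
    """
    # urlsplit separates the path from any ?query and #fragment parts.
    path = urlsplit(url).path
    # basename takes the final segment; unquote decodes escapes such as %20.
    return unquote(posixpath.basename(path))


print(filename_from_url("http://www.example.com/downloads/sample.pdf"))
```

If the URL path ends in a slash, the result is an empty string, so a caller would probably want to substitute a default name in that case.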


Apparently, for this particular resource it is in:

r.headers['content-disposition']

I don't know whether that is always the case, though.
