Requests - get content-type/size without fetching the whole page/content
Yes.
You can use the Session.head method to create HEAD requests:
    response = session.head(url, timeout=self.pageOpenTimeout, headers=customHeaders)
    contentType = response.headers['content-type']
A HEAD request is similar to a GET request, except that the server does not send the message body.
Here is a quote from Wikipedia:
HEAD Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
Use requests.head() for this. It will not return the message body. You should use the head method if you are interested only in the headers. Check this link for detail.
    h = requests.head(some_link)
    header = h.headers
    content_type = header.get('content-type')
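Since the question asks for both the type and the size, here is a slightly fuller sketch of the same idea. The helper names describe_headers and head_info and the 10 second timeout are my own choices, not part of requests. One thing worth knowing: requests.head() does not follow redirects by default, so the sketch passes allow_redirects=True. Some servers also omit Content-Length on HEAD responses, so the size may come back as None.

```python
import requests


def describe_headers(headers):
    """Return (content_type, size_in_bytes) from a mapping of response headers.

    size_in_bytes is None when Content-Length is absent or not a number.
    """
    content_type = headers.get("content-type")
    raw_length = headers.get("content-length")
    try:
        size = int(raw_length) if raw_length is not None else None
    except ValueError:
        size = None
    return content_type, size


def head_info(url, timeout=10):
    """Send a HEAD request and report the content type and size of `url`.

    Hypothetical helper: the name and the default timeout are assumptions.
    """
    # requests.head() leaves redirects unfollowed unless asked otherwise.
    response = requests.head(url, allow_redirects=True, timeout=timeout)
    response.raise_for_status()
    # response.headers is case-insensitive, so lowercase keys work here.
    return describe_headers(response.headers)
```

Usage would look like content_type, size = head_info(some_link), with some_link being whatever URL you want to inspect.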
Sorry, my mistake, I should have read the documentation better. Here is the answer: http://docs.python-requests.org/en/latest/user/advanced/#advanced (Body Content Workflow)
    tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
    r = requests.get(tarball_url, stream=True)
    if int(r.headers['content-length']) > TOO_LONG:
        r.connection.close()
        # log request too long
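A more defensive variant of the streaming approach might look like the sketch below. It assumes that Content-Length can be missing (for example with chunked transfer encoding), so it also enforces the limit while reading. MAX_BYTES, declared_size and fetch_if_small are hypothetical names, and the 5 MiB limit is arbitrary.

```python
import requests

MAX_BYTES = 5 * 1024 * 1024  # hypothetical limit, pick your own


def declared_size(headers):
    """Return Content-Length as an int, or None if missing or malformed."""
    value = headers.get("content-length")
    if value is None or not value.strip().isdigit():
        return None
    return int(value)


def fetch_if_small(url, max_bytes=MAX_BYTES, timeout=10):
    """Download `url` only if it fits in `max_bytes`; return bytes or None."""
    # stream=True defers the body download until it is actually read.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        size = declared_size(response.headers)
        if size is not None and size > max_bytes:
            return None  # leaving the with-block closes the response
        # No trustworthy size up front, so cap the download while reading.
        body = bytearray()
        for chunk in response.iter_content(chunk_size=8192):
            body.extend(chunk)
            if len(body) > max_bytes:
                return None
        return bytes(body)
```

Using the response as a context manager means the connection is released on every exit path, so there is no need to close it by hand.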