Does python urllib2 automatically uncompress gzip data fetched from webpage?



  1. How can I tell if the data at a URL is gzipped?

This checks if the content is gzipped and decompresses it:

from StringIO import StringIO
import gzip
import urllib2

request = urllib2.Request('http://example.com/')
request.add_header('Accept-Encoding', 'gzip')
response = urllib2.urlopen(request)
if response.info().get('Content-Encoding') == 'gzip':
    buf = StringIO(response.read())
    f = gzip.GzipFile(fileobj=buf)
    data = f.read()
  2. Does urllib2 automatically uncompress the data if it is gzipped? Will the data always be a string?

No, urllib2 doesn't automatically uncompress the data. The 'Accept-Encoding' header is not set by urllib2 but by you, using request.add_header('Accept-Encoding', 'gzip, deflate'), and you then have to decompress the response body yourself, as sketched below. As for the second part of the question: in Python 2, response.read() always returns a str (a byte string), whether the body was compressed on the wire or not.
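A minimal sketch of that manual decompression step, using zlib to handle both encodings named in the header (the helper name decompress_response is made up here; some servers send a raw deflate stream rather than the zlib-wrapped form, hence the fallback):

import zlib

def decompress_response(response):
    data = response.read()
    encoding = response.info().get('Content-Encoding')
    if encoding == 'gzip':
        # 16 + MAX_WBITS tells zlib to expect a gzip wrapper
        data = zlib.decompress(data, 16 + zlib.MAX_WBITS)
    elif encoding == 'deflate':
        try:
            data = zlib.decompress(data)                   # zlib-wrapped deflate
        except zlib.error:
            data = zlib.decompress(data, -zlib.MAX_WBITS)  # raw deflate stream
    return data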


If you are talking about a simple .gz file, then no, urllib2 will not decode it; you will get the unchanged .gz file as output (you can recognise it by its magic bytes, as sketched below).
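In that case you can tell whether the downloaded bytes are gzipped by looking at the first two bytes rather than at any HTTP header: every gzip stream starts with 0x1f 0x8b. A small sketch (the URL is a placeholder):

import urllib2

data = urllib2.urlopen('http://example.com/archive.gz').read()
if data[:2] == '\x1f\x8b':  # gzip magic bytes
    print 'looks like a gzip file'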

If you are talking about automatic HTTP-level compression using Content-Encoding: gzip or deflate, then that has to be deliberately requested by the client using an Accept-Encoding header.

urllib2 doesn't set this header, so the response it gets back will not be compressed. You can safely fetch the resource without having to worry about compression (though, since compression isn't used, the transfer may take longer). If you do want transparent compression with urllib2, you can build it yourself; see the sketch below.
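One way to do that, assuming Python 2's urllib2, is a custom handler that adds 'Accept-Encoding: gzip' on outgoing requests and decompresses gzipped responses before handing them back. This is a sketch, not a definitive implementation; the class name TransparentGzipHandler and the example URL are made up for illustration:

import gzip
import urllib2
from StringIO import StringIO

class TransparentGzipHandler(urllib2.BaseHandler):
    def http_request(self, req):
        # Ask the server for gzip-compressed responses.
        req.add_header('Accept-Encoding', 'gzip')
        return req
    https_request = http_request

    def http_response(self, req, resp):
        if resp.info().get('Content-Encoding') == 'gzip':
            body = gzip.GzipFile(fileobj=StringIO(resp.read())).read()
            # Rewrap the decompressed body as a response object.
            # Note: the original headers are reused unchanged.
            new_resp = urllib2.addinfourl(StringIO(body), resp.info(),
                                          resp.geturl(), resp.code)
            new_resp.msg = resp.msg
            return new_resp
        return resp
    https_response = http_response

opener = urllib2.build_opener(TransparentGzipHandler())
html = opener.open('http://example.com/').read()  # already decompressed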


Your question has been answered, but for a more comprehensive implementation, take a look at Mark Pilgrim's implementation of this: it covers gzip, deflate, safe URL parsing and much, much more. It was written for a widely-used RSS parser, but it is nevertheless a useful reference.