Python: Inflate and Deflate implementations


You can still use the zlib module to inflate/deflate data. The gzip module uses it internally, but adds a file header (and a CRC-32/length trailer) to turn the result into a gzip file. Looking at the gzip.py file, something like this could work:

import zlib

def deflate(data, compresslevel=9):
    compress = zlib.compressobj(
        compresslevel,        # level: 0-9
        zlib.DEFLATED,        # method: must be DEFLATED
        -zlib.MAX_WBITS,      # window size in bits:
                              #   -15..-9: negate, suppress header
                              #   9..15: normal (zlib header)
                              #   25..31: subtract 16, gzip header
        zlib.DEF_MEM_LEVEL,   # mem level: 1..9
        0                     # strategy:
                              #   0 = Z_DEFAULT_STRATEGY
                              #   1 = Z_FILTERED
                              #   2 = Z_HUFFMAN_ONLY
                              #   3 = Z_RLE
                              #   4 = Z_FIXED
    )
    deflated = compress.compress(data)
    deflated += compress.flush()
    return deflated

def inflate(data):
    decompress = zlib.decompressobj(
        -zlib.MAX_WBITS  # see above
    )
    inflated = decompress.decompress(data)
    inflated += decompress.flush()
    return inflated

I don't know if this corresponds exactly to whatever your server requires, but those two functions are able to round-trip any data I tried.
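For what it's worth, the round-trip claim is easy to check yourself. The following is only a minimal sketch: it redefines compact versions of the two functions so it runs on its own, and the sample inputs are arbitrary choices of mine, not anything your server sends.

```python
import os
import zlib

def deflate(data, level=9):
    # Negative wbits asks zlib for a raw deflate stream (no header, no checksum).
    c = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return c.compress(data) + c.flush()

def inflate(data):
    d = zlib.decompressobj(-zlib.MAX_WBITS)
    return d.decompress(data) + d.flush()

# Arbitrary test inputs: empty, tiny, repetitive, and incompressible data.
samples = [b"", b"a", b"hello world " * 100, os.urandom(4096)]
results = [inflate(deflate(s)) == s for s in samples]
print(results)
```

A passing round trip only shows that the two functions agree with each other; it says nothing about whether the other end expects raw deflate or a zlib-wrapped stream.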

The parameters map directly to what is passed to the zlib library functions.

Python                              C
zlib.compressobj(...)               deflateInit(...)
compressobj.compress(...)           deflate(...)
zlib.decompressobj(...)             inflateInit(...)
decompressobj.decompress(...)       inflate(...)

The constructors create the structure, populate it with default values, and pass it along to the init functions. The compress/decompress methods update the structure and pass it to deflate/inflate.


This is an add-on to MizardX's answer, giving some explanation and background.

See http://www.chiramattel.com/george/blog/2007/09/09/deflatestream-block-length-does-not-match.html

According to RFC 1950, a zlib stream constructed in the default manner is composed of:

  • a 2-byte header (e.g. 0x78 0x9C)
  • a deflate stream -- see RFC 1951
  • an Adler-32 checksum of the uncompressed data (4 bytes)
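The three parts listed above can be picked apart in Python. This is a sketch under the assumption that the stream comes from zlib.compress with default settings (so no preset dictionary, which would lengthen the header); the payload is just an example value.

```python
import struct
import zlib

payload = b"hello hello hello hello"
stream = zlib.compress(payload)           # default zlib (RFC 1950) container

header, body, trailer = stream[:2], stream[2:-4], stream[-4:]

# CMF/FLG: the low nibble of CMF is the method (8 = deflate), and the two
# bytes read as a big-endian 16-bit number must be a multiple of 31.
cmf, flg = bytearray(header)
assert cmf & 0x0F == 8
assert (cmf * 256 + flg) % 31 == 0

# The trailer is the Adler-32 of the *uncompressed* data, big-endian.
(stored_adler,) = struct.unpack(">I", trailer)
assert stored_adler == zlib.adler32(payload) & 0xFFFFFFFF

# What is left in the middle is a raw deflate stream (RFC 1951).
assert zlib.decompress(body, -zlib.MAX_WBITS) == payload
```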

The C# DeflateStream works on (you guessed it) a deflate stream. MizardX's code is telling the zlib module that the data is a raw deflate stream.

Observations: (1) One hopes the C# "deflation" method producing a longer string happens only with short input. (2) Using the raw deflate stream without the Adler-32 checksum? Bit risky, unless replaced with something better.

Updates

Error message: "Block length does not match with its complement"

If you are trying to inflate some compressed data with the C# DeflateStream and you get that message, then it is quite possible that you are giving it a zlib stream, not a deflate stream.

See the question "How do you use a DeflateStream on part of a file?"

Also copy/paste the error message into a Google search and you will get numerous hits (including the one up the front of this answer) saying much the same thing.
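If you are not sure which of the two formats you have been handed, the RFC 1950 header rules give a cheap test. The sketch below is a heuristic, not a guarantee: looks_like_zlib and inflate_any are hypothetical helper names of mine, and a raw deflate stream could by chance start with two bytes that pass the header check, which is why the code falls back to raw inflation on error.

```python
import zlib

def looks_like_zlib(data):
    """Heuristic: does data start with a plausible RFC 1950 header?"""
    if len(data) < 6:                     # 2-byte header + 4-byte Adler-32
        return False
    cmf, flg = bytearray(data[:2])
    return (cmf & 0x0F) == 8 and (cmf >> 4) <= 7 and (cmf * 256 + flg) % 31 == 0

def inflate_any(data):
    """Inflate either a zlib stream or a raw deflate stream."""
    if looks_like_zlib(data):
        try:
            return zlib.decompress(data)
        except zlib.error:
            pass                          # header check was a false positive
    return zlib.decompress(data, -zlib.MAX_WBITS)

payload = b"block length does not match " * 20
wrapped = zlib.compress(payload)          # zlib stream
raw = wrapped[2:-4]                       # same data as a raw deflate stream

print(looks_like_zlib(wrapped))
print(inflate_any(wrapped) == payload, inflate_any(raw) == payload)
```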

The Java Deflater ... used by "the website" ... C# DeflateStream "is pretty straightforward and has been tested against the Java implementation". Which of the following possible Java Deflater constructors is the website using?

public Deflater(int level, boolean nowrap)

Creates a new compressor using the specified compression level. If 'nowrap' is true then the ZLIB header and checksum fields will not be used in order to support the compression format used in both GZIP and PKZIP.

public Deflater(int level)

Creates a new compressor using the specified compression level. Compressed data will be generated in ZLIB format.

public Deflater()

Creates a new compressor with the default compression level. Compressed data will be generated in ZLIB format.
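In Python terms, the nowrap flag corresponds to the sign of the wbits argument. Here is a rough sketch of that mapping; deflate_like_java is a hypothetical helper name, and I am only claiming the output is in the same format as the Java constructors describe, not that it is byte-for-byte what any particular website produces.

```python
import zlib

def deflate_like_java(data, level=-1, nowrap=False):
    """Hypothetical helper mirroring Deflater(level, nowrap).

    level=-1 is zlib's default compression level.
    nowrap=True  -> raw deflate (no ZLIB header, no checksum)
    nowrap=False -> ZLIB format (2-byte header + Adler-32 trailer)
    """
    wbits = -zlib.MAX_WBITS if nowrap else zlib.MAX_WBITS
    c = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()

payload = b"some text to send to the website " * 10
wrapped = deflate_like_java(payload)               # ZLIB format
raw = deflate_like_java(payload, nowrap=True)      # raw deflate

# Each variant is read back with the matching wbits sign.
assert zlib.decompress(wrapped) == payload
assert zlib.decompress(raw, -zlib.MAX_WBITS) == payload
```

If the website uses nowrap=true, the raw variant is what a C# DeflateStream on the other end would expect.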

A one-line deflater after throwing away the 2-byte zlib header and the 4-byte checksum:

uncompressed_string.encode('zlib')[2:-4] # does not work in Python 3.x

or

zlib.compress(uncompressed_string)[2:-4]
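In Python 3 the second form needs bytes rather than str, and the codec form is still reachable through the codecs module. A small sketch, assuming UTF-8 is the encoding you want for the text (that choice is mine, not something stated above):

```python
import codecs
import zlib

uncompressed_string = "some text to deflate, some text to deflate"
data = uncompressed_string.encode("utf-8")    # zlib works on bytes in Python 3

# Slice off the 2-byte zlib header and the 4-byte Adler-32 trailer.
deflated = zlib.compress(data)[2:-4]

# Python 3 spelling of the .encode('zlib') one-liner (bytes-to-bytes codec).
deflated_via_codec = codecs.encode(data, "zlib_codec")[2:-4]

# Either result inflates as a raw deflate stream.
assert zlib.decompress(deflated, -zlib.MAX_WBITS) == data
assert zlib.decompress(deflated_via_codec, -zlib.MAX_WBITS) == data
```

The slicing relies on the header being exactly 2 bytes, which holds for zlib.compress because it never uses a preset dictionary.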