The structure of Deflate compressed block The structure of Deflate compressed block unix unix

The structure of Deflate compressed block


First lets look at the hexadecimal representation of the compressed data as a series of bytes (instead of a series of 16-bit big-endian values as in your question):

4b e4 02 00

Now lets convert those hexadecimal bytes to binary:

01001011 11100100 00000010 000000000

According to the RFC, the bits are packed "starting with the least-significant bit of the byte". The least-significant bit of a byte is the right-most bit of the byte. So first bit of the first byte is this one:

01001011 11100100 00000010 000000000       ^       first bit

The second bit is this one:

01001011 11100100 00000010 000000000      ^      second bit

The third bit:

01001011 11100100 00000010 000000000     ^     third bit

And so on. Once you gone through all the bits in the first byte, you then start on the least-significant bit of the second byte. So the ninth bit is this one:

01001011 11100100 00000010 000000000                ^                ninth bit

And finally the last-bit, the thirty-second bit, is this one:

01001011 11100100 00000010 000000000                           ^                           last bit

The BFINAL value is the first bit in the compressed data, and so is contained in the single bit marked "first bit" above. It's value is 1, which indicates that this is last block in compressed data.

The BTYPE value is stored in the next two bits in data. These are the bits marked "second bit" and "third bit" above. The only question is which bit of the two is the least-significant bit and which is the most-significant bit. According to the RFC, "Data elements other than Huffman codes are packedstarting with the least-significant bit of the data element." That means the first of these two bits, the one marked "second bit", is the least-significant bit. This means the value of BTYPE is 01 in binary. and so indicates that the block is compressed using fixed Huffman codes.

And that's the easy part done. Decoding the rest of the compressed block is more difficult (and with a more realistic example, much more difficult). Properly explaining how do that would be make this answer far too long (and your question too broad) for this site. I'll given you a hint though, the next three elements in the data are the Huffman codes 10010001 ('a'), 00111010 ('\n') and 0000000 (end of stream). The remaining 6 bits are unused, and aren't part of the compressed data.

Note to understand how to decode deflate compressed data you're going to have to understand what Huffman codes are. The RFC you're following assumes that you do. You should also know how LZ77 compression works, though the document more or less explains what you need to know.