How is a CRC32 checksum calculated? How is a CRC32 checksum calculated? c c

How is a CRC32 checksum calculated?


The polynomial for CRC32 is:

x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x2 + x + 1

Or in hex and binary:

0x 01 04 C1 1D B7
1 0000 0100 1100 0001 0001 1101 1011 0111

The highest term (x32) is usually not explicitly written, so it can instead be represented in hex just as

0x 04 C1 1D B7

Feel free to count the 1s and 0s, but you'll find they match up with the polynomial, where 1 is bit 0 (or the first bit) and x is bit 1 (or the second bit).

Why this polynomial? Because there needs to be a standard given polynomial and the standard was set by IEEE 802.3. Also it is extremely difficult to find a polynomial that detects different bit errors effectively.

You can think of the CRC-32 as a series of "Binary Arithmetic with No Carries", or basically "XOR and shift operations". This is technically called Polynomial Arithmetic.

To better understand it, think of this multiplication:

(x^3 + x^2 + x^0)(x^3 + x^1 + x^0)= (x^6 + x^4 + x^3 + x^5 + x^3 + x^2 + x^3 + x^1 + x^0)= x^6 + x^5 + x^4 + 3*x^3 + x^2 + x^1 + x^0

If we assume x is base 2 then we get:

x^7 + x^3 + x^2 + x^1 + x^0

Why? Because 3x^3 is 11x^11 (but we need only 1 or 0 pre digit) so we carry over:

=1x^110 + 1x^101 + 1x^100          + 11x^11 + 1x^10 + 1x^1 + x^0=1x^110 + 1x^101 + 1x^100 + 1x^100 + 1x^11 + 1x^10 + 1x^1 + x^0=1x^110 + 1x^101 + 1x^101          + 1x^11 + 1x^10 + 1x^1 + x^0=1x^110 + 1x^110                   + 1x^11 + 1x^10 + 1x^1 + x^0=1x^111                            + 1x^11 + 1x^10 + 1x^1 + x^0

But mathematicians changed the rules so that it is mod 2. So basically any binary polynomial mod 2 is just addition without carry or XORs. So our original equation looks like:

=( 1x^110 + 1x^101 + 1x^100 + 11x^11 + 1x^10 + 1x^1 + x^0 ) MOD 2=( 1x^110 + 1x^101 + 1x^100 +  1x^11 + 1x^10 + 1x^1 + x^0 )= x^6 + x^5 + x^4 + 3*x^3 + x^2 + x^1 + x^0 (or that original number we had)

I know this is a leap of faith but this is beyond my capability as a line-programmer. If you are a hard-core CS-student or engineer I challenge to break this down. Everyone will benefit from this analysis.

So to work out a full example:

   Original message                : 1101011011   Polynomial of (W)idth 4         :      10011   Message after appending W zeros : 11010110110000

Now we divide the augmented Message by the Poly using CRC arithmetic. This is the same division as before:

            1100001010 = Quotient (nobody cares about the quotient)       _______________10011 ) 11010110110000 = Augmented message (1101011011 + 0000)=Poly   10011,,.,,....        -----,,.,,....         10011,.,,....         10011,.,,....         -----,.,,....          00001.,,....          00000.,,....          -----.,,....           00010,,....           00000,,....           -----,,....            00101,....            00000,....            -----,....             01011....             00000....             -----....              10110...              10011...              -----...               01010..               00000..               -----..                10100.                10011.                -----.                 01110                 00000                 -----                  1110 = Remainder = THE CHECKSUM!!!!

The division yields a quotient, which we throw away, and a remainder, which is the calculated checksum. This ends the calculation. Usually, the checksum is then appended to the message and the result transmitted. In this case the transmission would be: 11010110111110.

Only use a 32-bit number as your divisor and use your entire stream as your dividend. Throw out the quotient and keep the remainder. Tack the remainder on the end of your message and you have a CRC32.

Average guy review:

         QUOTIENT        ----------DIVISOR ) DIVIDEND                 = REMAINDER
  1. Take the first 32 bits.
  2. Shift bits
  3. If 32 bits are less than DIVISOR, go to step 2.
  4. XOR 32 bits by DIVISOR. Go to step 2.

(Note that the stream has to be dividable by 32 bits or it should be padded. For example, an 8-bit ANSI stream would have to be padded. Also at the end of the stream, the division is halted.)


For IEEE802.3, CRC-32. Think of the entire message as a serial bit stream, append 32 zeros to the end of the message. Next, you MUST reverse the bits of EVERY byte of the message and do a 1's complement the first 32 bits. Now divide by the CRC-32 polynomial, 0x104C11DB7. Finally, you must 1's complement the 32-bit remainder of this division bit-reverse each of the 4 bytes of the remainder. This becomes the 32-bit CRC that is appended to the end of the message.

The reason for this strange procedure is that the first Ethernet implementations would serialize the message one byte at a time and transmit the least significant bit of every byte first. The serial bit stream then went through a serial CRC-32 shift register computation, which was simply complemented and sent out on the wire after the message was completed. The reason for complementing the first 32 bits of the message is so that you don't get an all zero CRC even if the message was all zeros.


A CRC is pretty simple; you take a polynomial represented as bits and the data, and divide the polynomial into the data (or you represent the data as a polynomial and do the same thing). The remainder, which is between 0 and the polynomial is the CRC. Your code is a bit hard to understand, partly because it's incomplete: temp and testcrc are not declared, so it's unclear what's being indexed, and how much data is running through the algorithm.

The way to understand CRCs is to try to compute a few using a short piece of data (16 bits or so) with a short polynomial -- 4 bits, perhaps. If you practice this way, you'll really understand how you might go about coding it.

If you're doing it frequently, a CRC is quite slow to compute in software. Hardware computation is much more efficient, and requires just a few gates.