CRC Demystified: The Math That Keeps Ethernet Honest

Link Layer ยท Part 1

This is the opening post of a new series on the data link layer. We will walk up from raw bits, through framing, error detection, medium access, switching, and VLANs, one focused topic at a time. We start where every Ethernet frame ends: the 32-bit Frame Check Sequence.

Every Ethernet frame leaving your NIC carries a four-byte tail called the Frame Check Sequence. Wi-Fi frames have one. PPP, HDLC, USB, SATA, ZIP archives, PNG files, even the boot sector of an SD card. All of them carry a Cyclic Redundancy Check. CRC is the most widely deployed error-detection scheme in computing, and it is mathematically beautiful once you see it.

This post is the gentle, worked-by-hand version. We will derive CRC from first principles, do a full division on a small message, verify it on the receiver side, and end with a survey of the CRCs you will actually meet in the field.

What you will learn

  • Why the link layer needs error detection at all
  • How CRC differs from parity and checksums
  • Polynomial representation of bit strings and modulo-2 arithmetic
  • A full sender-side CRC calculation, step by step
  • How the receiver verifies a frame with one division
  • Which classes of errors CRC catches, and which it misses
  • Standard generator polynomials used in real protocols

1. Why the Link Layer Cares About Errors

The physical layer is messy. A copper pair picks up crosstalk and electromagnetic interference. A fibre run is mostly clean but still suffers from dispersion and connector loss. Wireless is the worst offender: multipath fading, hidden-node collisions, microwave ovens, your neighbour's baby monitor. Bits flip. They flip in bursts, especially when the noise event is short and intense.

The data link layer is the first place in the stack with enough structure to do something about it. It groups bits into frames, and it appends a small tag (the CRC) that lets the receiver decide whether the frame survived the trip. There are two basic strategies:

  • Error detection: the receiver can tell if the frame is corrupted, but cannot repair it. Corrupted frames are discarded; higher layers (TCP, application retries) handle retransmission. This is what CRC does.
  • Forward Error Correction (FEC): the receiver can repair some errors locally without retransmission. Heavier on bandwidth and compute. Used in optical, wireless PHY, satellite links, usually below the link layer.

Ethernet, Wi-Fi MAC, PPP, and most LAN protocols pick detection. It is cheap, fast, and on a healthy wired link the residual error rate is so low that retransmission is rarely needed.

2. Why Not Just Use Parity or a Checksum?

If all we want is "did anything flip?", simpler schemes exist. They are also much weaker.

Scheme Size Catches Misses
Single parity bit 1 bit Any odd number of flipped bits Any even number of flips (2, 4, 6, …)
2D parity O(√n) bits All single, double, triple-bit errors Specific 4-bit rectangle patterns
Internet checksum (one's complement sum) 16 bits Most random errors Reordered 16-bit words, certain byte swaps, all-zero / all-FF blocks
CRC-32 32 bits All 1-bit errors, all 2-bit errors within a useful range, all odd-bit errors, all burst errors ≤ 32 bits One in 232 random patterns; pathological collisions

The killer feature of CRC is burst-error detection. On a real link, errors do not arrive evenly spaced; a noise spike flips a run of adjacent bits. CRC is designed precisely for that case.

3. The Idea: Treat Bits as Polynomials

Take a bit string and read it as the coefficients of a polynomial over GF(2): the field with two elements, where addition is XOR and there are no carries. The bit string 1101 becomes:

1101  →  1·x³ + 1·x² + 0·x + 1  =  x³ + x² + 1

The CRC scheme picks a fixed generator polynomial G(x) of degree r, agreed on by sender and receiver. To encode a message M(x):

  1. Shift the message left by r bits. Algebraically: multiply by xr. Concretely: append r zero bits.
  2. Divide M(x) · xr by G(x) using modulo-2 long division. Call the remainder R(x). It has degree at most r−1, so it fits in r bits.
  3. Transmit T(x) = M(x) · xr + R(x). In bits: the original message followed by the r-bit remainder.

That remainder is the CRC. By construction, T(x) is exactly divisible by G(x), with remainder zero. The receiver re-runs the same division on the whole received frame. If the remainder is zero, the frame is accepted. If not, somewhere along the way, bits flipped.

Modulo-2 arithmetic in one paragraph

Every operation is bitwise XOR: no carries, no borrows. Addition and subtraction are the same operation. 1 + 1 = 0, 1 − 1 = 0, 1 + 0 = 1. This is what makes CRC trivial to implement in hardware: a handful of XOR gates and a shift register.

4. A Worked Example, Front to Back

Let's pick a small, friendly case so we can write every step on paper.

  • Message M = 1101011011 (10 bits)
  • Generator G = 10011, that is x⁴ + x + 1 (degree r = 4)

We are going to compute a 4-bit CRC and produce the transmitted frame.

1 Append r zeros to the message

Multiplying by x⁴ means shifting left by 4 bits:

M       =  1101011011
M · x⁴  =  11010110110000   (14 bits)

2 Divide by G using XOR (modulo-2 long division)

At every step, line up the generator under the leftmost remaining 1, then XOR. We stop when the working value is shorter than G.

      1 1 0 1 0 1 1 0 1 1 0 0 0 0       ← dividend (M with 4 zeros appended)
  XOR 1 0 0 1 1                          ← G aligned under leftmost 1
      ---------------
      0 1 0 0 1 1 1 0 1 1 0 0 0 0
        XOR 1 0 0 1 1
        ---------------
      0 0 0 0 0 0 1 0 1 1 0 0 0 0
              XOR 1 0 0 1 1
              ---------------
      0 0 0 0 0 0 0 0 1 0 1 0 0 0
                  XOR 1 0 0 1 1
                  ---------------
      0 0 0 0 0 0 0 0 0 0 1 1 1 0

remainder R = 1 1 1 0   (4 bits = degree of G)

Notice how each XOR clears the leftmost 1, then we slide right until we run out of bits. When the working value has fewer bits to the right of the alignment point than G has, we're done. The last 4 bits are the remainder.

CRC = 1110

3 Build the transmitted frame

Replace the trailing zeros with the CRC:

T = M followed by R
T = 1101011011 1110
T = 11010110111110

This is the 14-bit "frame" the sender puts on the wire.

5. The Receiver's Side

The receiver does not need a different algorithm. It just divides the entire received bit string by G and checks the remainder.

Case A: Clean reception

Suppose the frame arrives intact: 11010110111110. Divide by G = 10011:

      1 1 0 1 0 1 1 0 1 1 1 1 1 0
  XOR 1 0 0 1 1
      ---------------
      0 1 0 0 1 1 1 0 1 1 1 1 1 0
        XOR 1 0 0 1 1
        ---------------
      0 0 0 0 0 0 1 0 1 1 1 1 1 0
              XOR 1 0 0 1 1
              ---------------
      0 0 0 0 0 0 0 0 1 0 0 1 1 0
                  XOR 1 0 0 1 1
                  ---------------
      0 0 0 0 0 0 0 0 0 0 0 0 0 0

remainder = 0 0 0 0   ✓ frame accepted

Zero remainder ⇒ the frame is consistent with G. The receiver hands the payload up the stack.

Case B: A bit flips in transit

Now imagine the channel flips bit 5 (counting from the left, 0-indexed): position 5 was 1, arrives as 0. Received: 11010010111110.

      1 1 0 1 0 0 1 0 1 1 1 1 1 0
  XOR 1 0 0 1 1
      ---------------
      0 1 0 0 1 1 1 0 1 1 1 1 1 0    ← wait, that looks familiar
        XOR 1 0 0 1 1
        ---------------
      0 0 0 0 0 0 1 0 1 1 1 1 1 0    ← but the rest of the string differs
              XOR 1 0 0 1 1
              ---------------
      0 0 0 0 0 0 0 0 1 0 0 1 1 0
                  XOR 1 0 0 1 1
                  ---------------
      0 0 0 0 0 0 0 0 0 0 0 0 0 0

(Try this with a different flipped bit and you will get a non-zero remainder.)

Try it yourself

Flip a single bit in 11010110111110. For example, change the very first bit from 1 to 0 and redo the division by 10011. You will get a non-zero remainder. That is the receiver detecting the error. The point is: any single-bit flip is guaranteed to leave a non-zero remainder, as long as G(x) has more than one term. Our 10011 = x⁴ + x + 1 has three terms, so it qualifies.

6. What CRC Actually Guarantees

The strength of a CRC depends entirely on the generator polynomial. With a well-chosen G(x) of degree r, CRC catches:

  • All single-bit errors, provided G(x) has at least two non-zero terms.
  • All double-bit errors, provided G(x) has a factor that does not divide xk + 1 for any k up to the frame length.
  • All errors with an odd number of flipped bits, provided (x + 1) is a factor of G(x).
  • All burst errors of length ≤ r. A "burst of length L" is any error pattern confined to L consecutive bits; the bits in between can be flipped or not. This is the property that makes CRC so good against real-world line noise.
  • Most longer bursts. A burst of length exactly r + 1 escapes detection with probability 2−(r−1). Longer bursts escape with probability 2−r.

Intuition for the burst guarantee

An error pattern E(x) is undetected only when G(x) divides E(x). If E(x) represents a burst of length L, it factors as xi · B(x) where B(x) has degree L−1. For G(x) of degree r to divide xi · B(x), it must divide B(x), which is impossible if the degree of B(x) is less than r. So any burst shorter than r + 1 bits is caught with certainty.

7. The CRCs You Will Actually Meet

Name Polynomial (hex) Degree Used in
CRC-8 (ATM HEC) 0x07 8 ATM cell headers, 1-Wire bus
CRC-16-CCITT 0x1021 16 HDLC, X.25, Bluetooth, XMODEM, SD card commands
CRC-16-IBM 0x8005 16 Modbus, USB tokens
CRC-32 (IEEE 802.3) 0x04C11DB7 32 Ethernet FCS, PPP, MPEG-2, ZIP, PNG, gzip
CRC-32C (Castagnoli) 0x1EDC6F41 32 iSCSI, SCTP, Btrfs, ext4 metadata, hardware-accelerated on x86 (CRC32 instruction)
CRC-64 0x42F0E1EBA9EA3693 64 XZ archives, some storage systems

The polynomial for IEEE 802.3 Ethernet (CRC-32) reads, when expanded:

G(x) = x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹²
     + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1

Every Ethernet frame on every wire in the world is divisible by that polynomial.

8. Why Hardware Loves CRC

The naive picture (line up, XOR, slide) suggests an expensive operation. In hardware it isn't. CRC maps to a linear feedback shift register (LFSR): r flip-flops, a handful of XOR gates wired according to G(x), and the message bits clocked in one at a time. After the last bit, the register holds the remainder. That is the whole circuit. It runs at line rate, gate-by-gate, with no buffering required.

In software, the trick is table-driven CRC: precompute the effect of each 8-bit input byte on the register and XOR a 256-entry lookup table per byte processed. Most modern CPUs go a step further and provide a CRC32 instruction (CRC-32C on x86 SSE 4.2, generic CRC on ARMv8) that does several bytes per cycle.

9. What CRC Is Not

CRC is not a cryptographic hash

CRC is linear: CRC(A ⊕ B) = CRC(A) ⊕ CRC(B). An attacker who knows the message can flip bits in the payload and adjust the CRC field to keep it valid. CRC protects against random line noise, not against a deliberate adversary. For integrity against tampering, use HMAC, Poly1305, or a signed MAC, not CRC.

The other thing worth being clear about: CRC does not correct errors. It only detects them. On a wired LAN, a frame that fails its CRC is silently dropped; TCP eventually notices the missing segment and retransmits. On Wi-Fi, the 802.11 MAC layer adds its own ACK/retransmission on top because the air is much noisier than copper.

10. Where to Look Next

If you want to see CRC in the wild right now, you don't need a lab. Two quick exercises:

  • In Wireshark, capture any Ethernet frame on a switched LAN. By default the NIC strips the FCS before handing the frame to the OS, but Wireshark will show you the computed CRC under Frame Check Sequence if "Validate the FCS" is enabled. On most Linux NICs you can request FCS pass-through with ethtool -K eth0 rx-fcs on.
  • On Linux, cksum file.txt and crc32 file.txt both compute (different) CRCs over arbitrary data. They use 0x04C11DB7 and the standard IEEE polynomial respectively.

Takeaways

  • CRC is polynomial division over GF(2): XOR-based long division, no carries.
  • The sender appends the remainder; the receiver re-divides and accepts on remainder = 0.
  • The generator polynomial determines exactly which error patterns get caught.
  • A well-chosen CRC of degree r catches all bursts up to r bits and a huge fraction of everything else.
  • It is fast in hardware (LFSR), fast in software (table lookup or CPU instruction), and trivial to implement.
  • It is not a security primitive, only an error-detection primitive.

Next in the Link Layer Series

Part 2 leaves error detection behind and tackles the next big link-layer problem: sharing the medium. When many devices contend for one wire or one chunk of radio spectrum, who gets to talk, when, and what happens when two of them talk at once? We will cover the partitioning protocols (TDMA, FDMA, CDMA), the random-access ladder (ALOHA, CSMA, CSMA/CD on Ethernet, CSMA/CA on Wi-Fi), what really happens during a collision, and the difference between interference and jamming.

← Back to Blog