Thursday, July 4, 2024

32-bit CRCs might be insufficient for WiFi

A new paper suggests CRCs might not be enough for WiFi.

Is It Time to Upgrade From CRC-32?  /  Mohit Balany; Craig Partridge
(IEEE Xplore / paywall)

The authors set up a WiFi network with fault injection (aluminum foil helped here) and found that the types of bit faults they were injecting slipped past the HD of IEEE CRC-32.  Their data suggests, but did not test, that an alternate 32-bit CRC would not be sufficient. However, they did not seem to take into account that other CRCs have a better HD performance for sparse burst faults, which is suggested would be relevant by their Figure 5.

That having been said, CRCs not being enough for any radio transmission is not a huge surprise since it is a much more fault-prone transmission medium than copper or optical fiber.

Paper Abstract:

IEEE CRC-32 is widely used in data communications to detect transmission errors. The decision to use CRC-32 was made in the 1970s, based on studies that compared the performance of various 32-bit CRCs on relatively small (2 000 bit) packets subjected to errors described by a renewal process. In today’s networks we often see far larger packets (8 000 bits or more) and experiments have shown that errors do not fit a renewal process. The world has changed. To get a sense of that change, we captured 1 900 gigabit WIFI (802.11n) transmissions errors to better understand error distributions and patterns. Our data suggest that a considerable fraction of errors could slip past IEEE CRC-32. It may soon be time to consider a new generation of error codes.




Wednesday, March 13, 2024

Why To Avoid Hash Algorithms If What You Really Need Is A Checksum

Sometimes we hear that someone plans to use a hash function for fault detection instead of a checksum. This is probably not the best idea, because hashes provide weaker fault detection for random independent bit faults than a decent checksum.

A hash function is typically optimized for really good bit mixing, but not for ensuring that small numbers of bit faults are always detected. For a Pud effectiveness metric (random independent bit faults at a particular BER), this means that a hash function is likely to do worse than a good checksum.

To explore this, we evaluated the hash function Murmur3 (32 bits) on a Monte Carlo simulation of random independent bit faults and compared it with a variety of checksums:

Monte Carlo simulations of 32-bit checksum and hash functions, BER=1e-12

The results show that Murmur3 is an excellent approximation of a Simple HD=2 curve, which detects all 1-bit faults, and has a 1/(2**32) probability of undetected fault for all other numbers of bit faults. That is expected of a good hash function. Murmur3 turns out to have Hamming Distance=2, at least for the lengths we investigated.

In comparison, a 32-bit one's complement checksum does worse, but that is not a huge surprise.

All the other checksums did better.  Dual-sum and Koopman checksums provided HD=3, although with less mixing. Their curves are below Murmur3 but not all the way down at the Simple HD=3 curve.

DualX-32P and Koopman-32P both have curves above the Simple HD=4 curve. CRC curves were not plotted, but would be at least as good as the Koopman-32P curve, and potentially much better for short data word lengths where a higher HD can be provided.

Summarizing, Murmur3 does a better job of bit mixing than the checksum functions, but suffers in ability to detect random independent bit flips due to being limited to HD=2.

We expect that other good hash functions will trace out either an approximate Simple HD=2 curve, or possibly a Simple HD=1 curve depending on their specifics. A dual-sum checksum, a Koopman checksum, or a CRC will all provide significantly better fault detection.

These curves are for random independent bit faults. For memory arrays sometimes people are concerned with multi-bit single event upsets. The answer to how these will perform will depend on the specifics of the geometry of the memory array (for example, row size and whether columns are interleaved). Checksums and CRCs will generally be good at multi-bit faults in bits that are adjacent in the data word. And the 32-P checksums will detect all 1-, 2-, and 3-bit faults regardless of the bit position. Beyond that is difficult to know without analysis tailored to the specific situation of the memory array you care about as well as the comparative frequency and patterns of different numbers of bit faults.

Wednesday, February 28, 2024

Comparative speeds for different Checksum & CRC implementations

Comparative speeds for different Checksum & CRC implementations.

For more information see my new book:  Understanding Checksums and Cyclic Redundancy Checks -- which includes downloadable open source example implementations for the checksums mentioned below.  This is a preview of section 12.1 from that book.

Monday, February 19, 2024

Book: Understanding Checksums and Cyclic Redundancy Checks


Amazon e-book: https://www.amazon.com/dp/B0CVXWDZ99   (Also available from country-specific Amazon web sites for:  UK, DE, FR, ES, IT, NL, JP, BR, CA, MX, AU, IN).  

Updated, permanent post with more info here: https://checksumcrc.blogspot.com/p/book.html


Monday, January 1, 2024

Explanation of CRC Operation: Polynomial Division, Shift Register, and Comparison (video)

 Here is a video explaining the mechanics of Cyclic Redundancy Check computations, including:

  • An example of the polynomial division approach
  • An example of the XOR-feedback shift register approach
  • A step-by-step comparison of how the polynomial division and shift register approaches are actually doing exactly the same computation.
If you've ever wondered why it is that a shift register and polynomial division end up with the same result, this video explains it.

FULL VIDEO:   YouTube | Archive.org

The full video consists of these four parts: (YouTube Play List):
  • Part 1: Example CRC Division to compute check value (YouTube 4:25)
  • Part 2: Example CRC Validation via Polynomial Division (YouTube 3:30)
  • Part 3: Example CRC Computation via Hardware Shift Register (YouTube 8:10)
  • Part 4: Comparison of CRC Polynomial Division and Hardware Shift Register (YouTube 6:21)




Thursday, December 28, 2023

New, faster version of HDLEN

 A new, faster version of hdlen is now available (version 2.0.0).

This uses a hash table to dramatically speed up determining the maximum data word length for each Hamming Distance for any particular CRC polynomial.

More info here: https://users.ece.cmu.edu/~koopman/crc/hdlen.html



Wednesday, May 10, 2023

Koopman Checksum

An Improved Modular Addition Checksum Algorithm

Philip Koopman

This paper introduces a checksum algorithm that provides a new point in the performance/complexity/effectiveness checksum tradeoff space. It has better fault detection properties than single-sum and dual-sum modular addition checksums. It is also simpler to compute efficiently than a cyclic redundancy check (CRC) due to exploiting commonly available hardware and programming language support for unsigned integer division. The key idea is to compute a single running sum, but introduce a left shift by the size (in bits) of the modulus before performing the modular reduction after each addition step. This approach provides a Hamming Distance of 3 for longer data word lengths than dual-sum approaches such as the Fletcher checksum. Moreover, it provides this capability using a single running sum that is only twice the size of the final computed check value, while providing fault detection capabilities even better than large-block variants of dual-sum approaches that require larger division operations. A variant that includes a parity bit achieves Hamming Distance 4 for about half the data word length as the baseline version for the same size check value.

Full paper here: https://arxiv.org/abs/2304.13496




Below is a simplistic comparison of execution time for 32-bit checksums. Every system will have a different outcome, but this gives a general idea of what to expect. Koopman checksums have about the same run time as a Fletcher or Adler checksum, and likely have slightly faster run time for a machine such as an Intel server that has strong hardware support for division. (The DualSum32 implementation includes an optimization of delaying the SumB modulo operation until the end of the computation.) Nonetheless, Koopman checksums provide better Hamming Distance performance.  CRC_R32T is a CRC implementation that uses a 256-element lookup table to speed up the computation.








32-bit CRCs might be insufficient for WiFi

A new paper suggests CRCs might not be enough for WiFi. Is It Time to Upgrade From CRC-32?  /  Mohit Balany; Craig Partridge (IEEE Xplore / ...