Checksum and CRC Central

32-bit CRCs might be insufficient for WiFi

2024-07-04T03:32:00.000-07:00

A new paper suggests CRCs might not be enough for WiFi.

Is It Time to Upgrade From CRC-32? / Mohit Balany; Craig Partridge
(IEEE Xplore / paywall)

The authors set up a WiFi network with fault injection (aluminum foil helped here) and found that the types of bit faults they were injecting slipped past the HD of IEEE CRC-32. Their data suggests, but did not test, that an alternate 32-bit CRC would not be sufficient. However, they did not seem to take into account that other CRCs have a better HD performance for sparse burst faults, which is suggested would be relevant by their Figure 5.

That having been said, CRCs not being enough for any radio transmission is not a huge surprise since it is a much more fault-prone transmission medium than copper or optical fiber.

Paper Abstract:

IEEE CRC-32 is widely used in data communications to detect transmission errors. The decision to use CRC-32 was made in the 1970s, based on studies that compared the performance of various 32-bit CRCs on relatively small (2 000 bit) packets subjected to errors described by a renewal process. In today’s networks we often see far larger packets (8 000 bits or more) and experiments have shown that errors do not fit a renewal process. The world has changed. To get a sense of that change, we captured 1 900 gigabit WIFI (802.11n) transmissions errors to better understand error distributions and patterns. Our data suggest that a considerable fraction of errors could slip past IEEE CRC-32. It may soon be time to consider a new generation of error codes.

Why To Avoid Hash Algorithms If What You Really Need Is A Checksum

2024-03-13T13:18:00.000-07:00

Sometimes we hear that someone plans to use a hash function for fault detection instead of a checksum. This is probably not the best idea, because hashes provide weaker fault detection for random independent bit faults than a decent checksum.

A hash function is typically optimized for really good bit mixing, but not for ensuring that small numbers of bit faults are always detected. For a Pud effectiveness metric (random independent bit faults at a particular BER), this means that a hash function is likely to do worse than a good checksum.

To explore this, we evaluated the hash function Murmur3 (32 bits) on a Monte Carlo simulation of random independent bit faults and compared it with a variety of checksums:

Monte Carlo simulations of 32-bit checksum and hash functions, BER=1e-12

The results show that Murmur3 is an excellent approximation of a Simple HD=2 curve, which detects all 1-bit faults, and has a 1/(2**32) probability of undetected fault for all other numbers of bit faults. That is expected of a good hash function. Murmur3 turns out to have Hamming Distance=2, at least for the lengths we investigated.

In comparison, a 32-bit one's complement checksum does worse, but that is not a huge surprise.

All the other checksums did better. Dual-sum and Koopman checksums provided HD=3, although with less mixing. Their curves are below Murmur3 but not all the way down at the Simple HD=3 curve.

DualX-32P and Koopman-32P both have curves above the Simple HD=4 curve. CRC curves were not plotted, but would be at least as good as the Koopman-32P curve, and potentially much better for short data word lengths where a higher HD can be provided.

Summarizing, Murmur3 does a better job of bit mixing than the checksum functions, but suffers in ability to detect random independent bit flips due to being limited to HD=2.

We expect that other good hash functions will trace out either an approximate Simple HD=2 curve, or possibly a Simple HD=1 curve depending on their specifics. A dual-sum checksum, a Koopman checksum, or a CRC will all provide significantly better fault detection.

These curves are for random independent bit faults. For memory arrays sometimes people are concerned with multi-bit single event upsets. The answer to how these will perform will depend on the specifics of the geometry of the memory array (for example, row size and whether columns are interleaved). Checksums and CRCs will generally be good at multi-bit faults in bits that are adjacent in the data word. And the 32-P checksums will detect all 1-, 2-, and 3-bit faults regardless of the bit position. Beyond that is difficult to know without analysis tailored to the specific situation of the memory array you care about as well as the comparative frequency and patterns of different numbers of bit faults.

This post is supplemental to my book Understanding Checksums and Cyclic Redundancy Checks. That book has more information about Simple HD curves, Pud, DualX checksums, and so on.

Comparative speeds for different Checksum & CRC implementations

2024-02-28T09:45:00.000-08:00

Comparative speeds for different Checksum & CRC implementations.

For more information see my new book: Understanding Checksums and Cyclic Redundancy Checks -- which includes downloadable open source example implementations for the checksums mentioned below. This is a preview of section 12.1 from that book.

One of the main lessons from doing benchmarking is that the answer to “how fast/good is something” always ends up being “it depends.” We do not attempt to resolve the question of benchmarking algorithm speed in a comprehensive way.

Rather, we present a single relatively simple comparison (figure 12.1) to illustrate possible speed differences among checksum values – along with a strong caution that any particular application on any particular computing hardware with any particular compiler with any particular workload of computational tasks competing for memory hierarchy resources is almost certain to see a different outcome. Nonetheless, we hope this comparison and analysis provides useful insight.

To do a speed comparison, we created optimized implementations of 32-bit checksum algorithms intended to be representative of the type of code one might see in a specific application. This included:

· Implementation is in the C programming language with optimizations enabled (gcc compiler with -O2 optimization option).

· Sums use 32-bit integers if possible, but use a 64-bit integer sum if required due to carry-out or shifting within the algorithm. 64-bit unsigned integers are used by Add-32, DualX-32, DualX-32P, Koopman-32, and Koopman-32P.

· A constant value is used for the modulus.

· A constant value of zero is used for the initial seed sum initialization.

· There is no final XOR value applied.

· No check value adjustment is performed. The check value is not stored in memory as part of the code word, meaning that the time measurement is just of the checksum computation itself.

· The data word is stored as an array of 32-bit integers to avoid overhead for building 32-bit values from a series of bytes.

· The data word is initialized with pseudorandom data a single time before repeating the checksum operation many times in a row. This results in a de minimus amortized data word initialization cost. Care is required to prevent the compiler’s optimizer from eliminating repeated calls to the checksum function given that the answer will be the same for repeated calculations. We expect that the data word bytes live comfortably in cache memory for the vast majority of computation loops. This can be expected to accentuate differences between algorithms compared to applications that spend more time waiting for memory accesses to data word content.

· Bit-shift amounts are defined as constant values that will only work for 32-bit checksums (generally 8 bits or 16 bits depending on the checksum involved). This helps the compiler optimize shifting into byte movements if possible.

· Modulo operations are deferred until after the block addition loop to the maximum degree practical.

In other words, the implementations do quite a lot to give a favorable implementation for each one.

The algorithms tested had the following implementation and optimization features, yielding corresponding speed results.

· LRC-32: This performs a 32-bit XOR of 32-bit data elements. This is slightly faster than Add-32 because there is no final modulo operation needed. Also, it only needs a 32-bit sum because there is no carry-out possible from the XOR “sum” operation.

· Add-32: This performs a 32-bit ones’ complement integer addition using a 64-bit sum variable. A modulo 0xFFFFFFFF operation is performed one time – after all data word blocks have been processed – to achieve a ones’ complement result rather than doing a modulo for every addition. This was the baseline for timing, with other speed results relative to this algorithm’s execution time.

· DualSum-32: This uses a pair of 32-bit sum variables and 16-bit blocks. Two pairs of additions of 16-bit blocks are done for each 32-bit portion of the data word retrieved (sumA and sumB are first updated with the low 16 bits, then again with the high 16 bits.) A 16-bit modulus of 65533 is applied to sumA and sumB every time the value in sumB gets too large, which is a significant speedup compared to computing a modulo in every loop iteration. This is nonetheless much slower than Add-32, by a factor of about 3, largely because of the need to perform four sums of 16-bit blocks in each loop iteration instead of just one sum of a 32-bit block. Dual-sum checksum with parity can provide HD=4, but would be even slower and would provide only half the data word length support.

· DualX-32: DualSum-32 modified to have a block size of 32 bits. This uses a pair of 64-bit sums with a 16-bit modulus of 65519. This improves speed compared to DualSum-32 due to doing one pair of additions per 32-bit data word segment retrieved instead of four additions, and also provides twice the HD=3 data word length. It runs about twice as fast as DualSum-32.

· DualX-32P: This is DualX-32 with 15-bit modulus of 32749 and an additional parity computation. The parity computation boosts fault detection to HD=4 while still being faster than DualSum-32.

· Koopman-32: This uses a 64-bit sum variable with a modulus of 4294967291. It processes the data word one 32-bit block at a time. A single 64-bit modulus operation is performed inside the inner loop. This runs at approximately one eighth the speed of Add-32 due to the need to perform a modulo operation every loop instead of only occasionally.

· Koopman-32P: This uses a 64-bit sum variable with a modulus of 0x7FFFFFED, processing the data word one 32-bit block at a time. This is slightly slower than Koopman-32 due to an extra XOR operation with an additional parity register inside the inner loop that is used to compute parity.

· CRCRT-32: This is a 32-bit CRC using a right-shift 256-entry lookup table that processes the data word a byte at a time. Each inner loop processes a 32-bit data word block by doing four in-line table lookup, shift, and XOR operations, with each of those four operations processing 8 bits of a 32-bit block. Reorganizing the data word into an array of 8-bit blocks likely would not speed up the implementation, because the implementation used can fetch four data word bytes at a time from memory into a 32-bit register. The lookup table is precomputed, so there is zero overhead for table initialization. The modulus does not affect computing speed, but happened to be 0xEDB88320 (IEEE 802.3 CRC-32). This is slower than Koopman checksums largely because it needs to make four lookup table accesses to process four bytes of a 32-bit block inside each inner loop, and that turns out to be slower than performing a division operation. This table-based CRC implementation is just over twice as slow as a Koopman checksum.

· CRCR-32: This is a bit-at-a-time right-shift CRC implementation that accesses the data word as a sequence of 32-bit blocks. This algorithm is so slow that the graph’s executing time bar did not fit on a reasonable vertical axis scale. This speed issue is what motivates using CRC_R32T instead.

Relative times are given in the chart for the average of 10 runs. Each run was 100 million iterations of computing the check value for a 256-element array of 32-bit integers (1 KB) for the CRC implementations, and one billion iterations for other checksums.[1] The hardware used was one core of an 8-core Intel Xeon Silver CPU running at 3.2 GHz. The array size is small enough to fit comfortably into cache memory, so the run times shown were optimistic compared to a situation in which the checksum is being computed on data that is not resident in cache memory when the computation is started.

If memory access times dominate in an application, the computing speed differences among algorithms will be reduced. If integer division is slower on some processor (e.g., due to no hardware support), it is possible that DualSum-32 will have more speed advantage compared to a Koopman checksum.

Recapping the fault detection vs. speed tradeoff overall: LRC-32 and Add-32 give HD=2 but are fast; DualX-32 and DualX-32P are reasonably fast while giving HD=3 or HD=4 for shorter data word lengths; DualSum-32 is slower and gives HD=3 for a shorter data word length than DualX-32; Koopman-32 gives much longer HD=3 for an additional speed penalty; Koopman-32P gives longer HD=4 for a little more speed penalty; and a CRC can give HD=3, HD=4, or higher HD according to the polynomial selected but is at best half as fast as the other algorithms.

Recommendations based on these results are:

(1) use a ones’ complement addition checksum only if speed is the sole and overwhelming consideration;

(2) use a dual-sum checksum if it provides a long enough HD=3 or HD=4 length for the application, preferring DualX or DualXP rather than a conventional dual-sum checksum;

(3) move to a Koopman checksum if a longer HD=3 or HD=4 length is needed; and

(4) use a CRC (preferably with a 256-entry lookup table) if higher HD is needed than available with other checksum techniques.

[1] The spread between fastest and slowest of the ten runs ranged from 1.7% for CRCRT-32 to 12.8% for DualX-32P. The full-length simulation campaign produced results quite similar to a 1/10^th-size pilot campaign. While a much more thorough experimental campaign on a variety of systems might be attempted, it would not change the general trend of relative performance, which is the point for our purposes.

Book: Understanding Checksums and Cyclic Redundancy Checks

2024-02-19T11:50:00.000-08:00

Amazon e-book: https://www.amazon.com/dp/B0CVXWDZ99 (Also available from country-specific Amazon web sites for: UK, DE, FR, ES, IT, NL, JP, BR, CA, MX, AU, IN).

Updated, permanent post with more info here: https://checksumcrc.blogspot.com/p/book.html

Explanation of CRC Operation: Polynomial Division, Shift Register, and Comparison (video)

2024-01-01T11:33:00.000-08:00

Here is a video explaining the mechanics of Cyclic Redundancy Check computations, including:

An example of the polynomial division approach
An example of the XOR-feedback shift register approach
A step-by-step comparison of how the polynomial division and shift register approaches are actually doing exactly the same computation.

If you've ever wondered why it is that a shift register and polynomial division end up with the same result, this video explains it.

FULL VIDEO: YouTube | Archive.org

The full video consists of these four parts: (YouTube Play List):

Part 1: Example CRC Division to compute check value (YouTube 4:25)
Part 2: Example CRC Validation via Polynomial Division (YouTube 3:30)
Part 3: Example CRC Computation via Hardware Shift Register (YouTube 8:10)
Part 4: Comparison of CRC Polynomial Division and Hardware Shift Register (YouTube 6:21)

New, faster version of HDLEN

2023-12-28T05:39:00.000-08:00

A new, faster version of hdlen is now available (version 2.0.0).

This uses a hash table to dramatically speed up determining the maximum data word length for each Hamming Distance for any particular CRC polynomial.

More info here: https://users.ece.cmu.edu/~koopman/crc/hdlen.html

Koopman Checksum

2023-05-10T16:38:00.002-07:00

An Improved Modular Addition Checksum Algorithm

Philip Koopman

This paper introduces a checksum algorithm that provides a new point in the performance/complexity/effectiveness checksum tradeoff space. It has better fault detection properties than single-sum and dual-sum modular addition checksums. It is also simpler to compute efficiently than a cyclic redundancy check (CRC) due to exploiting commonly available hardware and programming language support for unsigned integer division. The key idea is to compute a single running sum, but introduce a left shift by the size (in bits) of the modulus before performing the modular reduction after each addition step. This approach provides a Hamming Distance of 3 for longer data word lengths than dual-sum approaches such as the Fletcher checksum. Moreover, it provides this capability using a single running sum that is only twice the size of the final computed check value, while providing fault detection capabilities even better than large-block variants of dual-sum approaches that require larger division operations. A variant that includes a parity bit achieves Hamming Distance 4 for about half the data word length as the baseline version for the same size check value.

Full paper here: https://arxiv.org/abs/2304.13496

Below is a simplistic comparison of execution time for 32-bit checksums. Every system will have a different outcome, but this gives a general idea of what to expect. Koopman checksums have about the same run time as a Fletcher or Adler checksum, and likely have slightly faster run time for a machine such as an Intel server that has strong hardware support for division. (The DualSum32 implementation includes an optimization of delaying the SumB modulo operation until the end of the computation.) Nonetheless, Koopman checksums provide better Hamming Distance performance. CRC_R32T is a CRC implementation that uses a 256-element lookup table to speed up the computation.

Large-Block Checksums

2023-05-06T16:32:00.003-07:00

Large-Block Modular Addition Checksum Algorithms

Philip Koopman

Checksum algorithms are widely employed due to their use of a simple algorithm with fast computational speed to provide a basic detection capability for corrupted data. This paper describes the benefits of adding the design parameter of increased data block size for modular addition checksums, combined with an empirical approach to modulus selection. A longer processing block size with the right modulus can provide significantly better fault detection performance with no change in the number of bytes used to store the check value. In particular, a large-block dual-sum approach provides Hamming Distance 3-class fault detection performance for many times the data word length capability of previously studied Fletcher and Adler checksums. Moduli of 253 and 65525 are identified as being particularly effective for general-purpose checksum use.

Download the full paper here: https://arxiv.org/abs/2302.13432

Why Life Critical Networks Tend To Provide HD=6

2023-04-04T15:24:00.008-07:00

I'm thinking about why a Hamming Distance of 6 for network checksums/CRCs is so often found. Certainly that is designed into many network protocols, but sometimes I like to ask the question as to whether folk wisdom makes sense. So I did a trial calculation:

100 bit messages
1000 messages/second
20 year operational life
10 million deployed units
1e-5 bit error ratio

I got 47.47 expected arrivals of 5-bit corrupted messages (random independent bisymmetric "bit flip" fault model at BER), and probability of only 1% of any 6-bit corrupted messages. So that seems to justify HD=6 as being reasonable for this example system, which looks to me like a CAN network with low-cost network driver circuits on a car fleet. (Even 10x fleet size still works out as more likely than not there will be no 6-bit faults.) The safety argument here would simply be that there will never be an undetected network message fault for this fault model.

Did I miss something? (Did I get my math wrong :) ?) Does anyone know of a place where HD=6 is justified in a different way other than by folklore? So much of checksums and CRCs risks being lost in folklore as the grey beards retire. I'm trying to capture some of this type of thinking for posterity before it is my turn to retire...

In my experience 1e-5 BER is a typical conservative number for an embedded network. That means about 1% of network messages are corrupted. More than that and it's going to be difficult to get a working system. Less than that and there is economic pressure to use cheaper cabling with worse BER. But this is a rule-of-thumb approximation. You should do the calculation for your own network. For IT networks BER is dramatically lower, but they tend to operate in a much more benign environment and have budget for better network hardware.

Wolfram Alpha computation for those who'd like to check my work (you might need to click on "approximate solution" to get the actual number):

https://www.wolframalpha.com/input?i=n%3D100%3Bf%3D5%3BB%3D1e-5%3B+++X%3D%28B**f%29%3B+Y%3D%28%281-B%29**%28n-f%29%29%3B+Z%3Dcombination%28n%2Cf%29%3B++X*Y*Z*20*365.25*24*3600*1000*10000000

Excel source: https://archive.org/details/hd-worksheet

Pending clean-up of papers

2020-08-11T08:11:00.003-07:00

I received the following comments with potential issues in some of my work. I'm delighted to receive such feedback since a major point of my work is to try to get a complex topic straight.

At the moment I'm fully engaged working on self-driving car safety so it will be a while before I can resolve these. However, in the spirit of transparency here are the points raised so you know to be careful on these points when reading my work.

(This is posted with permission conditioned on anonymity. Thanks to the person who sent the comments.)

----------------------------------------------

1) On page 5 of http://users.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc_poly_embedded.pdf you say that SMBus uses CRC-8, polynomial 0xEA in your notation and say that 0x83 would be better. I am not familiar with SMBus but three sources say that the polynomial chosen for SMBus was 0x83.

CRC Catalogue entry CRC-8/SMBUS lists poly=0x07 (they use what Wikipedia calls Normal form)

SMBus Spec page 27 clearly states that the polynomial is x8 + x2 + x1 + 1

Presentation 22 April 1999 USAR Systems has the same.

2) On same page of same document as 1), it says XMODEM also uses CRC-8 polynomial 0xEA but it appears that XMODEM uses a 16-bit CRC according to Wikipedia and the CRC Catalogue amongst others. There is reference to an earlier 8-bit FCS but that was called a checksum in the references I found, not a CRC. ref1, ref2, ref3

3) Finally, I am having trouble reproducing your Probability of Undetected Error graphs from your koopman04 paper. The maxino_ms.pdf, page 9 gives the equation for Pud as

Pud = HW * BER^x * (1-BER)^(n-x)

where n is code word length in bits, x is number of bit errors, HW is Hamming weight at the HD. For example, polynomial 0x9C at 9 bit data word length is shown as just above 1e-33 (I used a tool to read scaled values off log charts to get about 6e-33 since I can't find a numeric value anywhere), but the formula gives 3.4e-29 at same BER of 1e-6, since HD=5 and HW(5)=34.

I suspect this is because your Pud graphs are, as you say in section 4.1, "weighted by the percentage of errors caught per corresponding polynomial weights for each data word length" but I cannot find a formula or example of this weighting which gives the required correction. Although formula (2) in page 18 of the faa15_tc-14-49.pdf document is different from the above, as far as I can see the combin(c, HD) term cancels out because UndetectedFractionHD = HW/combin(c, HD). So please could you provide the weighting formula, preferably with one numerical example.

----------------------------------------------

Initial responses:

- I would not be at all surprised for #1 and #2 I was referring to documents that got the polynomials wrong, which was previously quite common but is getting better over time.

- For #3 the graphs and probability equations can be a bit tricky. I'll have to get all that stuff back in my head before being able to respond. I'd suggest going with the analysis here if you're in doubt, which had the math reviewed by researchers from Honeywell, so hopefully it's OK: http://www.tc.faa.gov/its/worldpac/techrpt/tc14-49.pdf

CRC Information Quick Start

2020-06-29T04:12:00.000-07:00

I'm not able to spend much if any time on this lately due to my work in self driving car safety. Here is a quick start for those looking for info.

A tutorial on using CRCs and Checksums (how to apply them, pitfalls, system considerations; not a deep crawl through the math):
https://betterembsw.blogspot.com/2013/11/crc-webinar.html

A report written for the FAA on CRCs and Checksums (goes with the tutorial)
https://betterembsw.blogspot.com/2015/04/faa-crc-and-checksum-report.html

A web site of "best" CRCs, including:

Description:
https://betterembsw.blogspot.com/2010/05/whats-best-crc-polynomial-to-use.html
Table of "best" CRCs for each size and length
https://users.ece.cmu.edu/~koopman/crc/index.html
A zoo that has performance of most published CRCs (there is a page for each polynomial size):
https://users.ece.cmu.edu/~koopman/crc/crc32.html
Software you can run yourself to confirm performance as a double-check on the tables (which were generated with a much more complicated search algorithm, then checked with this software)
https://users.ece.cmu.edu/~koopman/crc/hdlen.html

A summary of "Seven Deadly Sins of CRCs and Checksums":
https://betterembsw.blogspot.com/2013/06/seven-deadly-sins-of-crcs-and-checksums.html

Maximal length LFSR polynomials:
https://users.ece.cmu.edu/~koopman/lfsr/index.html

My papers on this topic are here:
https://users.ece.cmu.edu/~koopman/projects.html#crc

(There are other posts you might find of interest in the blog as well.)
https://betterembsw.blogspot.com/2015/04/faa-crc-and-checksum-report.html

Also, some FAQ:

Can I combine using two small CRCs to get better HD? Probably not.
It doesn't work because of error patterns that hit both CRC fields. Maybe it can be done, but every paper I've read has been flawed for this reason and the polynomial searches I ran came up empty handed due to this reason.
How does polynomial X perform?
Please look at the zoo and see if you can find it. If it is not there I'll add it when I have time if it is a published value. Be sure to consider there are different notations, which are accounted for in the Zoo.
Can you please provide me with free consulting help?
Sorry, too busy with self driving car safety these days in addition to teaching. I sometimes answer general interest questions on my blog, but I can only spend a minute or two and it has to be something others will likely find interesting.

Effect of Initial Seed Values on CRCs and Checksums

2019-08-16T04:22:00.002-07:00

Checksums and CRCs involve a repeated computation over the length of a dataword that
accumulates a result to form an FCS value. The accumulation value must be initialized to some
known value for this process to produce a predictable result for any given codeword. When the
initial value is unstated, it is typically assumed to be zero.

In some cases, a non-zero value is used instead, such as 0xFFFF for a 16-bit checksum or CRC.
This has the advantage of making an all zero codeword impossible. Perhaps more importantly
for Fletcher checksum, ATN-32, and CRCs, this has the additional advantage of causing leading
zeros of a message to affect the FCS value. To better understand, consider a CRC initialized
with an all zero seed value. Any number of all zero dataword chunks can be fed into the CRC
and it will stay at a zero value, throwing away information about how many zero chunks have
been processed. However, if the CRC computation is initialized to any non-zero value,
processing chunks with a value of zero will cycle patterns through the CRC, tracking the number
of zero-valued chunks that have been processed as a sort of counter. This can help with
situations in which additional leading-zero bytes have been erroneously added to a dataword.

In general, any non-zero seed value can be used. The value selected does not affect
error-detection properties beyond those previously discussed. In particular, CRC error detection,
in terms of the BER fault model, is independent of the seed value used.

The CRC and checksum values can be alternately or additionally modified to create an FCS
value. A common modification is to XOR the result of a checksum computation with a value,
such as 0xFFFF, which inverts bits, but does not lose any of the information content of the
checksum computation. In some cases, CRC result values must be bit-reversed to preserve burst
error properties across the dataword to FCS boundary, depending upon the computation method and message bit ordering in the dataword.

Note:
The "seed value" is sometimes also called the "initial value"

Source:

Koopman, Driscoll, Hall, "Selection of Cyclic Redundancy Code and Checksum Algorithms to Ensure Critical Data Integrity," DOT/FAA/TC-14/49 , March 2015

Video on Hamming Codes & Hamming Distance

2017-03-11T08:56:00.002-08:00

Nice video that talks about Hamming Codes and gives a hypercube explanation that helps understand Hamming Distance

Perfect Codes:

https://youtu.be/WPoQfKQlOjg
on computerphile.

On sabbatical

2017-01-22T11:53:00.001-08:00

I noticed that my e-mail has some CRC queries. I'm on sabbatical through summer 2017, so there might be significant delays before I'm able to respond to e-mails. If you notice something that seems to be a problem please do let me know, but please bear with me until I have time to respond.

(Contrary to what some might think "sabbatical" is usually the opposite of "vacation" for engineering faculty. It is a chance for us to concentrate on a large difficult problem that we can't do during normal semesters because of all the many demands made upon our time.)

Significantly updated CRC data

2015-07-28T05:23:00.002-07:00

Over the course of the summer I was able to get a little time to revise and update my CRC data web site. It is still work-in-progress, but you'll find a lot more data than was there previously.

New summary tables: http://users.ece.cmu.edu/~koopman/crc/

Recommendations for "good" polynomials with various HD values are completed through 25 bit CRCs, and I'll let the scripts compute away for a few months until eventually the tables are completed through 32 bits. Past 32 bits my search algorithm starts taking a long time to execute so probably will only have samples above that, but we'll see how it goes.

When you look at the data, yellow and blank mean things I'm working on (the color coding helps me keep straight what the scripts are currently working on). Green boxes and "under construction" mean preliminary results that should be OK but aren't double-checked yet. A significant addition is that I've added automatically generated examples of violations of the HD so you can confirm results for yourself. There is also some do-it-yourself HD computation help here.

If you see a polynomial missing that is published somewhere let me know and I'll add it.

I should mention that Prof. Berthold Gick from University of Applied Sciences, Koblenz, Germany has been independently working in this area and verified many of my results. More importantly, he's pointed out a few typos and other issues that I'm working to correct. I'd like to take this chance to thank him for his efforts. If anyone else notices problems please let me know.

Is HD=1 Useful?

2015-06-11T05:18:00.001-07:00

I received an e-mail with a question as to why a HD=1 checksum would ever be useful (referred to on slide 89 of my Sept 2014 slide set on CRCs). The point is that HD=1 means you are not guaranteeing detection of any error, not even 1-bit errors. Here's my answer, which I am posting since it might have general interest:

HD=1 (zero errors guaranteed to be detected) means that there is still a residual capability to randomly detect most errors (all errors except 1 in 2**k for k bits).

If you only need an error detection ratio of a few orders of magnitude this is fine if you have a large CRC.

If you are using a 256 bit cryptographic checksum (which must be assumed to have HD=1) then undetected error probability is 1 in 2**256, which is pretty good. However, a high-HD CRC will usually get you better non-malicious error detection for less computational cost.

Note that any "perfect" pseudo-random number generator will produce HD=1, including cryptographically secure random number generators.
Getting HD>1 means there is a predictable structure in the pseudo-random numbers generated by a CRC in terms of certain outputs being less probable. In fact, for small numbers of bits set in the inputs it is impossible to produce certain outputs with small numbers of bits set, because that would violate the HD. Thus, the outputs are less than perfectly random in a way that is predictable. For security this is bad, because it means the hash function leaks information, and you'd hope a good hash doesn't leak information. But for error detection that's good because it gives you a HD.

FAA CRC Report

2015-04-11T11:23:00.001-07:00

After an extensive editing and approval process, the FAA has released the report that I did with Kevin Driscoll & Brendan Hall from Honeywell. You can see it here:

Selection of Cyclic Redundancy Code and Checksum Algorithms to Ensure Critical Data Integrity
DOT/FAA/TC-14/49
http://www.tc.faa.gov/its/worldpac/techrpt/tc14-49.pdf

Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error Detection

2014-09-27T14:22:00.000-07:00

I just presented a session at an FAA sponsored conference on "Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error Detection" on Sept. 25. Here are the slides:

(Acrobat download: http://users.ece.cmu.edu/~koopman/pubs/koopman14_crc_faa_conference_presentation.pdf)

A recording of an older variant of this presentation can be viewed below:

Webinar download: https://cmu.box.com/s/nn5r9ogwpw100fqkp6g2
Accompanying slides for that older webinar can be found: here

Significant update to on-line CRC performance data

2014-08-14T07:17:00.000-07:00

I've managed to create a new set of pages with CRC performance data during my hobby time this summer.

"Best" CRC Polynomials:
http://users.ece.cmu.edu/~koopman/crc/index.html
This contains an extensive list of of good CRC polynomials for different CRC sizes and dataword lengths. I re-ran all the computations in light of some data that has become available since the initial publication a number of years back. The table is slightly different than the table I previously published. The old table is OK, but this one is marginally better than the old one for the data that was previously provided, so there is no pressing need to change systems based on the old version so long as you were running at shorter than 2Kbit dataword sizes. More importantly, this new table consideres unlimited dataword lengths and larger CRC sizes. (At this time the 32-bit computations are still running and they'll be updated as they are available.)

CRC Polynomial Zoo with Hamming Weight Data

http://users.ece.cmu.edu/~koopman/crc/hw_data.html

Contains error detection performance data for a large variety of CRCs, including numerous "standard" and published CRCs. If you see one missing let me know with a citation and I'll add it.

Please be sure to scroll down past the data to see the notes on how to interpret the summary page and the data files. The summary page is hand-created, and thus slightly prone to minor bugs, so please do double-check things there. The data files are 100% machine created and therefore less prone to bugs (and have no bugs as far as I know).

I hope that this data is useful to others. Some of the large polynomials have incomplete or shortened data files, and will be updating those over time as the computations complete.

CRC publication errata

2014-08-04T07:29:00.002-07:00

Thanks to Prof. Berthold Gick at University of Applied Sciences in Koblenz Germany for finding a few bugs in my published CRC values. (And also thanks for a very thorough second-check of results -- this is an essential part of scholarly publication that is not acknowledged as often as it should be!)

I have updated presentations and papers that I can find with the following corrections:

From my FAA presentation materials:
Polynomial HD=2 HD=3 HD=4
0x80000d | 16,777,212 | 5815 | 509 (length in bits at that HD)

From my 2002 paper:
Polynomial 0x992c1a4c has a maximal HD=6 payload of 32738 bits (originally published as 32737)

From many publications the table of 16-bit and smaller CRCs:
11-bit polynomial 0x5d7 has a maximal HD=5 payload of 26 bits (originally published as 25)

Some of the "best" polynomial recommendations have been updated. If you are unsure about a polynomial from a previous publication, you can look up its properties here:
http://users.ece.cmu.edu/~koopman/crc/
and in the accompanying polynomial zoo (click on the polynomial size header in each column).

If anyone else spots an error please do let me know. This work happens time-available, but I do strive to eventually fix any bugs I know about.

CRC and Checksum Tutorial Slides

2012-05-12T08:18:00.000-07:00

I recently presented an invitation-only webinar on the topic of CRC and Checksum good practices and error detection effectiveness. The talk was for an aviation audience, but it applies almost entirely to other applications as is. The recording didn't work out, but if there is enough demand I'll find time this summer to see if I can do an audio narration recording to go with the slides.

Meanwhile, you can find the slides here:
http://www.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf

Contents:
- Introduction
- Motivation -- why isn't this a solved problem?
- Parity computations as an example
- Error code construction and evaluation (without scary math)
- Example using parity codes
- Checksums
- What's a checksum?
- Commonly used checksums and their performance
- Cyclic redundancy codes (CRCs)
- What's a CRC
- Commonly used CRC approaches and their performance
- Don't blindly trust what you hear on this topic
- A good CRC is almost always much better than a good checksum
- Many published (and popular) approaches are suboptimal or just plain wrong
- There are some topics to be careful of for which we don't know the answers yet

We are also collecting information about how CRCs and checksums are used in real applications (which is not necessarily the same as what researchers study!) If you have anecdotes, examples, or other information about what you do or what you'd like to do, we'd love to hear about it. If you are in the aviation community the FAA is running a much more specific survey just for you; let me know and I'll get you in touch with the appropriate FAA contact.

(We can't necessarily provide immediate answers to specific questions such as "which polynomial should I use for this specific situation." But popular topics and questions that we think are broadly applicable will be put into the research bin and published here or in a report when results are available.)

Consolidated Wikipedia Pointers for Checksums and CRCs

2012-01-15T07:00:00.000-08:00

Here are some starting points within Wikipedia if you are interested in checksum and CRC background. (These articles also point to other places of interest.) Over time I hope to distill the practical application of all this in my blog, but for now these are reasonable starting points.

List of checksum and CRC algorithms: http://en.wikipedia.org/wiki/List_of_checksum_algorithms

Longitudinal Redundancy Check (XOR checksum): http://en.wikipedia.org/wiki/Longitudinal_redundancy_check

Overview of Checksums: http://en.wikipedia.org/wiki/Checksum
Fletcher Checksum: http://en.wikipedia.org/wiki/Fletcher%27s_checksum
Adler Checksum: http://en.wikipedia.org/wiki/Adler-32

Overview of CRC: http://en.wikipedia.org/wiki/Cyclic_redundancy_check
More detailed example of CRC computation: http://en.wikipedia.org/wiki/Computation_of_CRC
Mathematics of CRCs: http://en.wikipedia.org/wiki/Mathematics_of_CRC

Overview of error detection and correction codes: http://en.wikipedia.org/wiki/Redundancy_check
Discussion of parity: http://en.wikipedia.org/wiki/Parity_bit
Check digits (it's like parity for numbers handled by people): http://en.wikipedia.org/wiki/Check_digit

It is important to realize that Wikipedia is not necessarily an authoritative source. For this area it seems there is enough critical mass of authors, editors and reviewers to get the pages into reasonable shape.

CRC for Protecting A Single Value

2012-01-01T07:00:00.000-08:00

Some critical applications need to protect a single byte or word of data in memory against corruption. (For example, if you want to be extra-sure about protecting EEPROM values against corruption.) A previous post gives good CRC polynomials for a wide range of circumstances, but it is always nice to know if you can do better for a special case. Below are optimal CRC polynomials for a few special cases involving small data words that might be stored in memory (RAM, EEPROM, or otherwise).

Protecting an 8-bit data word:
8-bit CRC: HD=5; 0x9C = x^8 + x^5 + x^4 + x^3 + 1
16-bit CRC: HD=8; 0xE92F = x^16 + x^15 + x^14 + x^12 + x^9 + x^6 + x^4 + x^3 + x^2 + x + 1
32-bit CRC: HD=16; 0xC563942F = x^32 +x^31 +x^27 +x^25 +x^23 +x^22 +x^18 +x^17 +x^16 +x^13 +x^11 +x^6 +x^4 +x^3 +x^2 +x +1

Protecting a 16-bit data word:
8-bit CRC: HD=4; 0x93 = x^8 + x^5 + x^2 + x + 1
16-bit CRC: HD=7; 0x978A = x^16 + x^13 + x^11 + x^10 + x^9 + x^8 + x^4 + x^2 + 1
32-bit CRC: HD=14; 0xB396786D = x^32 +x^30 +x^29 +x^26 +x^25 +x^24 +x^21 +x^19 +x^18 +x^15 +x^14 +x^13 +x^12 +x^7 +x^6 +x^4 +x^3 +x +1

Protecting a 32-bit data word:
8-bit CRC: HD=4; 0x92 = x^8 + x^5 + x^2 + 1
16-bit CRC: HD=6; 0x8BFC = x^16 + x^12 + x^10 + x^9 + x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + 1
32-bit CRC: HD=12; 0xB527A43B = x^32 +x^30 +x^29 +x^27 +x^25 +x^22 +x^19 +x^18 +x^17 +x^16 +x^14 +x^11 +x^6 +x^5 +x^4 +x^2 +x +1

The hex number has an implicit +1. "HD" means Hamming Distance. Other information about this topic can be found at my previous post on good CRCs.

Note that for this sort of use it is a good idea to "seed" the CRC computation with a value of that is not zero. That avoids a data word value of zero giving a CRC computed value of zero -- which will result in an undetected error if memory is wiped to all zeros for some reason.

Pointers to other CRC discussions

2011-12-28T07:00:00.000-08:00

Over time I'll add links to postings on CRCs that I find interesting or entertaining:

Nigel Jones has an essay on how we don't always stop to think about whether what we're doing is actually good enough, even if it is what we've always done. In this case he's talking about CRC selection.
(link: Thoughts on BCC's, LRC's, CRC's and being experienced)

Lammert Bies has a Q&A forum about practical implementations of standard CRC functions.
(link: Error Correction and Detection Forum )

What is the best CRC up to 16 bits in size?

2011-12-26T07:00:00.000-08:00

Please see the posting at this link for this article:
http://betterembsw.blogspot.com/2010/05/whats-best-crc-polynomial-to-use.html