Checksum speed comparison

In this blog post we address the speed question for different checksums. One of the main lessons from benchmarking is that the answer to "how fast is it?" always ends up being "it depends." We do not attempt to resolve the question of benchmarking algorithm speed in a comprehensive way. Rather, we present a single, relatively simple comparison to illustrate expected speed differences among checksum algorithms, along with a strong caution that any particular application on any particular computing hardware is almost certain to see different speed differences. That having been said, we hope this comparison and analysis provides useful insight.

To do a speed comparison, we created optimized implementations of 32-bit checksum algorithms intended to be representative of the type of code one might see in a specific application. This included:

- Implementation in the C programming language with optimizations enabled (gcc compiler with -O2 optimization option).
- Sums use 3
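To make the setup concrete, here is a minimal sketch of the kind of timing harness described above. The function names, the use of `clock()`, and the single-sum 32-bit modular addition checksum standing in for the algorithms compared are all illustrative assumptions, not the post's actual benchmark code:

```c
#include <stdint.h>
#include <stddef.h>
#include <time.h>

/* Single-sum 32-bit modular addition checksum (mod 2^32 via unsigned
   wraparound) -- a stand-in for the algorithms being compared. */
uint32_t checksum32(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* Time `reps` passes over the buffer; returns elapsed CPU seconds.
   The volatile sink keeps -O2 from optimizing the loop away. */
double bench_checksum32(const uint8_t *data, size_t len, int reps) {
    volatile uint32_t sink = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        sink = checksum32(data, len);
    }
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Compiled with gcc -O2, dividing bytes processed by measured time gives a throughput figure; as cautioned above, the numbers will differ across compilers, hardware, and data sizes.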

Koopman Checksum

An Improved Modular Addition Checksum Algorithm Philip Koopman This paper introduces a checksum algorithm that provides a new point in the performance/complexity/effectiveness checksum tradeoff space. It has better fault detection properties than single-sum and dual-sum modular addition checksums. It is also simpler to compute efficiently than a cyclic redundancy check (CRC) due to exploiting commonly available hardware and programming language support for unsigned integer division. The key idea is to compute a single running sum, but introduce a left shift by the size (in bits) of the modulus before performing the modular reduction after each addition step. This approach provides a Hamming Distance of 3 for longer data word lengths than dual-sum approaches such as the Fletcher checksum. Moreover, it provides this capability using a single running sum that is only twice the size of the final computed check value, while providing fault detection capabilities even better than large-block
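The shift-then-reduce step described in the abstract can be sketched as follows for an 8-bit check value. The modulus 253 and the function name are illustrative assumptions only; the paper selects and evaluates its own moduli:

```c
#include <stdint.h>
#include <stddef.h>

#define MODULUS 253u   /* illustrative 8-bit modulus; see the paper */

/* Single running sum: left-shift by the modulus size in bits (8),
   add the next byte, then reduce.  The running sum is 16 bits --
   twice the size of the final 8-bit check value. */
uint8_t koopman_checksum8(const uint8_t *data, size_t len) {
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)((((uint32_t)sum << 8) + data[i]) % MODULUS);
    }
    return (uint8_t)sum;
}
```

Because the modulus fits in 8 bits, the shifted sum never overflows 16 bits: after reduction sum < 253, so (sum << 8) + data[i] < 253 * 256 + 255 < 2^16.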

Large-Block Checksums

Large-Block Modular Addition Checksum Algorithms Philip Koopman Checksum algorithms are widely employed due to their use of a simple algorithm with fast computational speed to provide a basic detection capability for corrupted data. This paper describes the benefits of adding the design parameter of increased data block size for modular addition checksums, combined with an empirical approach to modulus selection. A longer processing block size with the right modulus can provide significantly better fault detection performance with no change in the number of bytes used to store the check value. In particular, a large-block dual-sum approach provides Hamming Distance 3-class fault detection performance for many times the data word length capability of previously studied Fletcher and Adler checksums. Moduli of 253 and 65525 are identified as being particularly effective for general-purpose checksum use. Download the full paper here:
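For the dual-sum idea, a per-byte Fletcher-style sketch using modulus 253 looks like the following. This is an assumption-level illustration: the paper's large-block variant additionally accumulates a multi-byte block between reductions, which this simple per-byte version does not do.

```c
#include <stdint.h>
#include <stddef.h>

/* Dual-sum (Fletcher-style) checksum with modulus 253, one of the
   moduli the paper identifies as effective.  sumA accumulates the
   data; sumB accumulates the running values of sumA, which makes the
   result sensitive to byte order as well as byte values. */
uint16_t dual_sum253(const uint8_t *data, size_t len) {
    uint32_t sumA = 0, sumB = 0;
    for (size_t i = 0; i < len; i++) {
        sumA = (sumA + data[i]) % 253u;
        sumB = (sumB + sumA) % 253u;
    }
    return (uint16_t)((sumB << 8) | sumA);   /* 16-bit check value */
}
```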

Why Life Critical Networks Tend To Provide HD=6

I'm thinking about why a Hamming Distance of 6 for network checksums/CRCs is so often found. Certainly that is designed into many network protocols, but sometimes I like to ask whether the folk wisdom makes sense. So I did a trial calculation:

- 100 bit messages
- 1000 messages/second
- 20 year operational life
- 10 million deployed units
- 1e-5 bit error ratio

I got 47.47 expected arrivals of 5-bit corrupted messages (random independent bisymmetric "bit flip" fault model at that BER), and only about a 1% probability of any 6-bit corrupted messages. So that seems to justify HD=6 as being reasonable for this example system, which looks to me like a CAN network with low-cost network driver circuits on a car fleet. (Even a 10x fleet size still works out as more likely than not that there will be no 6-bit faults.) The safety argument here would simply be that there will never be an undetected network message fault for this fault model. Did I miss something? (Did I get my math wrong? :)

Explanation of CRC Operation: Polynomial Division, Shift Register, and Comparison (video)

Here is a video explaining the mechanics of Cyclic Redundancy Check computations, including:

- An example of the polynomial division approach
- An example of the XOR-feedback shift register approach
- A step-by-step comparison of how the polynomial division and shift register approaches are actually doing exactly the same computation

If you've ever wondered why it is that a shift register and polynomial division end up with the same result, this video explains it. Video links (27 minutes): YouTube |
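As a companion to the video, here is a minimal bit-serial CRC in the shift-register style, which computes exactly the remainder that polynomial long division produces. The polynomial 0x07 (x^8 + x^2 + x + 1 in normal form) and the zero initial value are illustrative choices, not taken from the video:

```c
#include <stdint.h>
#include <stddef.h>

/* Bit-serial CRC-8 as an XOR-feedback shift register.  Each inner
   iteration is one step of long division over GF(2): if the top bit
   of the register is set, the divisor "fits" and is subtracted (XOR);
   either way the register shifts left by one bit position. */
uint8_t crc8(const uint8_t *data, size_t len) {
    uint8_t crc = 0;                           /* shift register */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                        /* bring in next byte */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80) {
                crc = (uint8_t)((crc << 1) ^ 0x07);  /* shift + XOR */
            } else {
                crc = (uint8_t)(crc << 1);           /* shift only */
            }
        }
    }
    return crc;                                /* division remainder */
}
```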

Pending clean-up of papers

I received the following comments noting potential issues in some of my work. I'm delighted to receive such feedback, since a major point of my work is to try to get a complex topic straight. At the moment I'm fully engaged working on self-driving car safety, so it will be a while before I can resolve these. However, in the spirit of transparency, here are the points raised so you know to be careful on these points when reading my work. (This is posted with permission, conditioned on anonymity. Thanks to the person who sent the comments.)

----------------------------------------------

1) On page 5 you say that SMBus uses CRC-8, polynomial 0xEA in your notation, and say that 0x83 would be better. I am not familiar with SMBus, but three sources say that the polynomial chosen for SMBus was 0x83. CRC Catalogue entry CRC-8/SMBUS lists poly=0x07 (they use what Wikipedia calls Normal form). SMBus Spec page 27 cl

CRC Information Quick Start

I'm not able to spend much if any time on this lately due to my work in self-driving car safety. Here is a quick start for those looking for info:

- A tutorial on using CRCs and Checksums (how to apply them, pitfalls, system considerations; not a deep crawl through the math)
- A report written for the FAA on CRCs and Checksums (goes with the tutorial)
- A web site of "best" CRCs, including a table of "best" CRCs for each size and length
- A zoo that has performance of most published CRCs (there is a page for each polynomial size)
- Software you can run yourself to confirm performance as a double-check on the tables (which were generated wi