Thursday, December 28, 2023

New, faster version of HDLEN

A new, faster version of hdlen is now available (version 2.0.0).

It uses a hash table to dramatically speed up finding the maximum data word length at each Hamming Distance for any particular CRC polynomial.

More info here: https://users.ece.cmu.edu/~koopman/crc/hdlen.html
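Here is a rough sketch of how a hash table can help with this kind of search. This is illustration only, not the actual hdlen source code, and the table size and search limit are made up. For the HD=3 case, a weight-2 error with bits at positions i and j is undetected exactly when x^i and x^j leave the same remainder modulo the CRC polynomial, so the search amounts to walking the remainder sequence and asking whether the current remainder has been seen before -- a near-constant-time hash lookup instead of a comparison against every earlier position.

/* Sketch only: find where a weight-2 undetectable error first appears. */
#include <stdint.h>
#include <string.h>

#define LIMIT   (1u << 16)          /* how far to search (bit positions) */
#define TABSIZE (1u << 20)          /* power-of-two open-addressing hash table */

static uint32_t table[TABSIZE];
static uint8_t  used[TABSIZE];

/* Returns the first bit position whose remainder duplicates an earlier one,
   i.e. where a weight-2 undetectable error first becomes possible (so HD=3
   no longer holds for codewords that long). 'poly' is in conventional
   MSB-first notation with the x^width term implicit, e.g. 0x04C11DB7. */
static uint32_t first_repeated_remainder(uint32_t poly, int width)
{
    const uint32_t mask = (width < 32) ? ((1u << width) - 1u) : 0xFFFFFFFFu;
    const uint32_t top  = 1u << (width - 1);
    uint32_t r = 1;                                  /* remainder of x^0 */

    memset(used, 0, sizeof(used));
    for (uint32_t pos = 0; pos < LIMIT; pos++) {
        uint32_t h = (r * 2654435761u) & (TABSIZE - 1u);
        while (used[h]) {
            if (table[h] == r) return pos;           /* remainder seen before */
            h = (h + 1u) & (TABSIZE - 1u);
        }
        used[h] = 1u;
        table[h] = r;
        r = (r & top) ? (((r << 1) ^ poly) & mask)   /* advance to x^(pos+1) mod poly */
                      : ((r << 1) & mask);
    }
    return LIMIT;                                    /* no duplicate found in range */
}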



Wednesday, May 10, 2023

Koopman Checksum

An Improved Modular Addition Checksum Algorithm

Philip Koopman

This paper introduces a checksum algorithm that provides a new point in the performance/complexity/effectiveness checksum tradeoff space. It has better fault detection properties than single-sum and dual-sum modular addition checksums. It is also simpler to compute efficiently than a cyclic redundancy check (CRC) due to exploiting commonly available hardware and programming language support for unsigned integer division. The key idea is to compute a single running sum, but introduce a left shift by the size (in bits) of the modulus before performing the modular reduction after each addition step. This approach provides a Hamming Distance of 3 for longer data word lengths than dual-sum approaches such as the Fletcher checksum. Moreover, it provides this capability using a single running sum that is only twice the size of the final computed check value, while providing fault detection capabilities even better than large-block variants of dual-sum approaches that require larger division operations. A variant that includes a parity bit achieves Hamming Distance 4 for about half the data word length as the baseline version for the same size check value.

Full paper here: https://arxiv.org/abs/2304.13496
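For readers who want a concrete picture, here is a minimal sketch of the loop as I read the description above: an 8-bit check value computed over 8-bit data blocks, with modulus 253 and a zero seed as illustrative assumptions. The paper gives the recommended parameters, the parity-augmented variant, and the reference implementation, which may differ in details.

/* Sketch only: shift-then-reduce single running sum, 8-bit check value. */
#include <stdint.h>
#include <stddef.h>

uint8_t koopman_checksum8(const uint8_t *data, size_t len)
{
    const uint16_t modulus = 253u;   /* illustrative 8-bit-class modulus */
    uint16_t sum = 0;                /* single running sum, twice the check value size */

    for (size_t i = 0; i < len; i++) {
        /* left shift by the modulus size in bits, add the next data block,
           then perform the modular reduction */
        sum = (uint16_t)(((sum << 8) + data[i]) % modulus);
    }
    return (uint8_t)sum;
}

Because the running sum is reduced below the modulus after every step, the shifted sum plus the next 8-bit block always fits in 16 bits, which is why a running sum only twice the size of the check value suffices.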




Below is a simplistic comparison of execution time for 32-bit checksums. Every system will have a different outcome, but this gives a general idea of what to expect. Koopman checksums have about the same run time as a Fletcher or Adler checksum, and are likely to run slightly faster on a machine such as an Intel server with strong hardware support for division. (The DualSum32 implementation includes an optimization that delays the SumB modulo operation until the end of the computation.) Nonetheless, Koopman checksums provide better Hamming Distance performance. CRC_R32T is a CRC implementation that uses a 256-element lookup table to speed up the computation.
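As a side note on that DualSum32 optimization, here is a sketch of what delaying the SumB reduction looks like, assuming a Fletcher-style dual-sum over 16-bit blocks with modulus 65525 and the two 16-bit halves packed into a 32-bit check value. This is just an illustration of the optimization, not the code used for the timing comparison.

/* Sketch only: dual-sum checksum with the SumB modulo delayed to the end. */
#include <stdint.h>
#include <stddef.h>

uint32_t dualsum32_delayed(const uint16_t *blocks, size_t count)
{
    const uint32_t m = 65525u;
    uint32_t sumA = 0;   /* reduced every step, so it stays below the modulus */
    uint64_t sumB = 0;   /* modulo delayed until the end */

    for (size_t i = 0; i < count; i++) {
        sumA = (sumA + blocks[i]) % m;
        sumB += sumA;                   /* 64-bit accumulator cannot overflow here */
    }
    sumB %= m;
    return ((uint32_t)sumB << 16) | sumA;
}

Since modular addition preserves congruence, reducing SumB once at the end gives the same check value as reducing it after every block, while saving one division per block.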








Saturday, May 6, 2023

Large-Block Checksums

Large-Block Modular Addition Checksum Algorithms

Philip Koopman

Checksum algorithms are widely employed due to their use of a simple algorithm with fast computational speed to provide a basic detection capability for corrupted data. This paper describes the benefits of adding the design parameter of increased data block size for modular addition checksums, combined with an empirical approach to modulus selection. A longer processing block size with the right modulus can provide significantly better fault detection performance with no change in the number of bytes used to store the check value. In particular, a large-block dual-sum approach provides Hamming Distance 3-class fault detection performance for many times the data word length capability of previously studied Fletcher and Adler checksums. Moduli of 253 and 65525 are identified as being particularly effective for general-purpose checksum use.

Download the full paper here: https://arxiv.org/abs/2302.13432
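To make the large-block idea concrete, here is a sketch of a dual-sum checksum that absorbs a 32-bit data block per addition step even though each running sum is reduced to 16 bits with modulus 65525. The block size, the packing of the two halves, and the handling of any leftover bytes are illustrative assumptions; the paper covers the actual design space and evaluation.

/* Sketch only: dual-sum checksum with blocks larger than the check value halves. */
#include <stdint.h>
#include <stddef.h>

uint32_t largeblock_dualsum(const uint32_t *blocks, size_t count)
{
    const uint64_t m = 65525u;           /* one of the moduli identified in the paper */
    uint64_t sumA = 0, sumB = 0;         /* 64-bit math keeps the sums from overflowing */

    for (size_t i = 0; i < count; i++) {
        sumA = (sumA + blocks[i]) % m;   /* first sum absorbs a full 32-bit block per step */
        sumB = (sumB + sumA) % m;        /* second sum accumulates the running first sums */
    }
    return (uint32_t)((sumB << 16) | sumA);   /* two 16-bit halves packed into 32 bits */
}

The check value is still only 32 bits; all that changed relative to a conventional dual-sum checksum is how many data bytes are folded in per modular addition step.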





Tuesday, April 4, 2023

Why Life Critical Networks Tend To Provide HD=6

I've been thinking about why a Hamming Distance of 6 is so often specified for network checksums/CRCs. Certainly that is designed into many network protocols, but sometimes I like to ask whether the folk wisdom actually makes sense. So I did a trial calculation:

  • 100 bit messages
  • 1000 messages/second
  • 20 year operational life
  • 10 million deployed units
  • 1e-5 bit error ratio

I got 47.47 expected arrivals of 5-bit corrupted messages (random independent bisymmetric "bit flip" fault model at that BER), and only about a 1% probability of any 6-bit corrupted message at all. So that seems to justify HD=6 as being reasonable for this example system, which looks to me like a CAN network with low-cost network driver circuits on a car fleet. (Even at 10x the fleet size it is still more likely than not that there will be no 6-bit faults.) The safety argument here would simply be that there will never be an undetected network message fault for this fault model.
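For those who prefer code to a Wolfram Alpha query, here is a small C program that applies the same binomial model to the fleet numbers listed above (just a back-of-the-envelope check, not a general-purpose tool):

/* Expected fleet-lifetime arrivals of messages with exactly f corrupted bits. */
#include <stdio.h>
#include <math.h>

static double binom(int n, int k)            /* n choose k, as a double */
{
    double c = 1.0;
    for (int i = 1; i <= k; i++) { c *= (double)(n - k + i) / i; }
    return c;
}

int main(void)
{
    const double n = 100, B = 1e-5;                              /* bits/message, BER */
    const double msgs = 1000.0 * 20 * 365.25 * 24 * 3600 * 1e7;  /* fleet-lifetime messages */

    for (int f = 5; f <= 6; f++) {
        double p = binom(100, f) * pow(B, f) * pow(1.0 - B, n - f);
        printf("f=%d: per-message probability %.3g, expected arrivals %.3g\n",
               f, p, p * msgs);
    }
    return 0;
}

/* Prints roughly 47.5 expected 5-bit-corrupted messages (matching the 47.47
   above) and about 0.0075 expected 6-bit-corrupted messages, i.e. well under
   a 1% chance of even one, across the fleet lifetime. */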


Did I miss something? (Did I get my math wrong :) ?) Does anyone know of a place where HD=6 is justified in some way other than by folklore? So much of the knowledge behind checksums and CRCs risks being lost as the grey beards retire. I'm trying to capture some of this type of thinking for posterity before it is my turn to retire...


In my experience 1e-5 BER is a typical conservative number for an embedded network. That means about 1% of network messages are corrupted. More than that and it's going to be difficult to get a working system. Less than that and there is economic pressure to use cheaper cabling with worse BER. But this is a rule-of-thumb approximation. You should do the calculation for your own network. For IT networks BER is dramatically lower, but they tend to operate in a much more benign environment and have budget for better network hardware.


Wolfram Alpha computation for those who'd like to check my work (you might need to click on "approximate solution" to get the actual number):

https://www.wolframalpha.com/input?i=n%3D100%3Bf%3D5%3BB%3D1e-5%3B+++X%3D%28B**f%29%3B+Y%3D%28%281-B%29**%28n-f%29%29%3B+Z%3Dcombination%28n%2Cf%29%3B++X*Y*Z*20*365.25*24*3600*1000*10000000




Excel source: https://archive.org/details/hd-worksheet

Why To Avoid Hash Algorithms If What You Really Need Is A Checksum

Sometimes we hear that someone plans to use a hash function for fault detection instead of a checksum. This is probably not the best idea, b...