Sometimes we hear that someone plans to use a hash function instead of a checksum for fault detection. This is probably not the best idea, because a hash provides weaker detection of random independent bit faults than a decent checksum.
A hash function is typically optimized for thorough bit mixing, not for guaranteeing that small numbers of bit faults are always detected. Measured by Pud (the probability of an undetected fault, given random independent bit faults at a particular bit error ratio, or BER), a hash function is therefore likely to do worse than a good checksum.
To explore this, we ran a Monte Carlo simulation of random independent bit faults on the 32-bit Murmur3 hash function and compared it with a variety of checksums.
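For readers who want to try this kind of experiment, here is a minimal sketch. It is not the code behind the results reported here. It assumes the public 32-bit x86 variant of MurmurHash3 with seed 0, and a codeword made of the data followed by the check value in little-endian order. The names `murmur3_32` and `monte_carlo_pud` are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def _mix_block(k: int) -> int:
    """Scramble one (up to) 32-bit block before it is folded into the state."""
    k = (k * 0xCC9E2D51) & MASK32
    k = _rotl32(k, 15)
    return (k * 0x1B873593) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit); the seed choice is an assumption."""
    h = seed & MASK32
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        h ^= _mix_block(k)
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[4 * n_blocks:]
    if tail:  # 1 to 3 leftover bytes
        h ^= _mix_block(int.from_bytes(tail, "little"))
    # Finalization: fold in the length, then avalanche the bits.
    h ^= len(data) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def monte_carlo_pud(check_fn, data_len: int, ber: float, trials: int, rng):
    """Flip each codeword bit independently with probability ber.

    Returns (undetected, corrupted, trials); Pud is estimated as
    undetected / trials.
    """
    undetected = corrupted = 0
    for _ in range(trials):
        data = bytes(rng.getrandbits(8) for _ in range(data_len))
        codeword = bytearray(data + check_fn(data).to_bytes(4, "little"))
        flips = 0
        for bit in range(len(codeword) * 8):
            if rng.random() < ber:
                codeword[bit // 8] ^= 1 << (bit % 8)
                flips += 1
        if flips == 0:
            continue  # fault-free trial: nothing to detect
        corrupted += 1
        rx_data = bytes(codeword[:-4])
        rx_check = int.from_bytes(codeword[-4:], "little")
        if check_fn(rx_data) == rx_check:
            undetected += 1
    return undetected, corrupted, trials


if __name__ == "__main__":
    print(monte_carlo_pud(murmur3_32, 64, 1e-3, 2000, random.Random(1)))
```

With a 32-bit check value, a multi-bit fault slips through only about once in 2**32 corrupted messages, so a short run like this one will almost certainly report zero undetected faults. A meaningful estimate needs vastly more trials.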
The results show that Murmur3 is an excellent approximation of a Simple HD=2 curve, which detects all 1-bit faults and has a 1/(2**32) probability of an undetected fault for every other number of bit faults. That is expected of a good hash function. Murmur3 turns out to have a Hamming Distance (HD) of 2, at least for the lengths we investigated.
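The Simple HD=2 curve described above can be computed directly. The sketch below assumes an idealized code that detects every fault of fewer than `hd` bits and misses any larger fault with probability 2**-32, with bit flips independent across the whole codeword. The function name `simple_hd_pud` and its parameters are hypothetical, and `n_bits` is taken to be the codeword length (data plus check value).

```python
import math


def simple_hd_pud(n_bits: int, ber: float, hd: int = 2, check_bits: int = 32) -> float:
    """Pud of an idealized 'Simple HD' curve (a sketch under stated assumptions).

    Faults of fewer than `hd` bits are always detected; faults of `hd` or more
    bits are missed with probability 2**-check_bits.
    """
    if not 0.0 < ber < 1.0:
        raise ValueError("ber must be strictly between 0 and 1")
    if hd < 1:
        raise ValueError("hd must be at least 1")
    log_p = math.log(ber)
    log_q = math.log1p(-ber)
    tail = 0.0  # P(at least hd bits flipped)
    for k in range(hd, n_bits + 1):
        # Binomial probability of exactly k flips, computed in log space
        # to avoid overflow in the binomial coefficient.
        log_term = (math.lgamma(n_bits + 1) - math.lgamma(k + 1)
                    - math.lgamma(n_bits - k + 1)
                    + k * log_p + (n_bits - k) * log_q)
        tail += math.exp(log_term)
    return tail / 2.0 ** check_bits


if __name__ == "__main__":
    for hd in (1, 2, 3, 4):
        print(hd, simple_hd_pud(1024 + 32, 1e-6, hd=hd))
```

At low BER the sum is dominated by its first term, so each extra unit of HD lowers the curve by roughly a factor proportional to the BER. This is why a guaranteed HD matters more than mixing quality for this fault model.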
In comparison, a 32-bit one's complement checksum does worse, but that is not a huge surprise.
All the other checksums did better than Murmur3. The Dual-sum and Koopman checksums provided HD=3, although with less mixing: their curves are below the Murmur3 curve but not all the way down at the Simple HD=3 curve.
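To make the HD=3 property concrete, the sketch below exhaustively flips every 1-bit and 2-bit combination in a short codeword protected by a Fletcher-style dual-sum checksum and counts the faults that go undetected. The modulus (65535), the 16-bit block size, and the byte order are assumptions for illustration and may differ from the dual-sum variants evaluated here. The names `dual_sum_32` and `count_undetected` are hypothetical.

```python
from itertools import combinations


def dual_sum_32(data: bytes) -> int:
    """Fletcher-style dual-sum over 16-bit little-endian blocks (assumed variant)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input to a whole block
    a = b = 0
    for i in range(0, len(data), 2):
        block = int.from_bytes(data[i:i + 2], "little")
        a = (a + block) % 65535  # running sum of blocks
        b = (b + a) % 65535      # running sum of the running sums
    return (b << 16) | a


def count_undetected(check_fn, data: bytes, n_flips: int) -> int:
    """Try every n_flips-bit fault in data + check value; count the misses."""
    codeword = data + check_fn(data).to_bytes(4, "little")
    n_bits = len(codeword) * 8
    missed = 0
    for bits in combinations(range(n_bits), n_flips):
        rx = bytearray(codeword)
        for bit in bits:
            rx[bit // 8] ^= 1 << (bit % 8)
        if check_fn(bytes(rx[:-4])) == int.from_bytes(rx[-4:], "little"):
            missed += 1
    return missed


if __name__ == "__main__":
    message = b"checksum"  # 8 data bytes, so a 96-bit codeword
    for n_flips in (1, 2):
        print(n_flips, count_undetected(dual_sum_32, message, n_flips))
```

For this short message both counts are zero, which is consistent with HD=3. The check covers only this one length, so it says nothing about how long a data word can get before a 2-bit fault slips through.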
DualX-32P and Koopman-32P both have curves above the Simple HD=4 curve. CRC curves were not plotted, but would be at least as good as the Koopman-32P curve, and potentially much better for short data word lengths where a higher HD can be provided.
In summary, Murmur3 does a better job of bit mixing than the checksum functions, but it is worse at detecting random independent bit flips because it is limited to HD=2.
We expect that other good hash functions will trace out either an approximate Simple HD=2 curve, or possibly a Simple HD=1 curve depending on their specifics. A dual-sum checksum, a Koopman checksum, or a CRC will all provide significantly better fault detection.