Of the algorithms covered here, sha512 is the most conservative
choice: its 512-bit output is the longest, which makes a collision
extremely unlikely.
I think this article is not going to make much sense without
explaining the rationale and the math behind the idea of
cryptographic hashes, also called message digests.
The basic goal is quite easy to state and understand. The idea
of a message digest is to create a fixed length "fingerprint"
from any input data of any length, be it 2 bytes or 2 Terabytes.
This is done in such a way that the output varies significantly
for slight changes in input data.
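As a rough illustration of both properties, here is a minimal Python
sketch using the standard hashlib module. The helper name fingerprint
and the sample inputs are made up for this example.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha512") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


# Fixed length: a 2 byte input and a 1 MB input give digests of equal size.
short = fingerprint(b"hi")
large = fingerprint(b"x" * 1_000_000)
print(len(short), len(large))  # 128 hex characters each, i.e. 512 bits

# Sensitivity: changing one character changes most of the digest.
a = fingerprint(b"hello world")
b = fingerprint(b"hello worle")
differing = sum(1 for x, y in zip(a, b) if x != y)
print(differing, "of", len(a), "hex characters differ")
```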
All that is fine and dandy but the most important aspect of
the checksum algorithm is its ability to resist collisions.
A collision is a pair of different input values for which the
checksum algorithm produces the same output. This can be quite
dangerous and defeats the very purpose of having a checksum in
the first place.
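Mathematically speaking, collisions cannot be ruled out
entirely: there are infinitely many possible inputs but only a
fixed number of outputs, so some inputs must share a digest. In
practice this works quite well as long as your output sample
space is so big that finding a collision is computationally
infeasible, which is the case with sha512 digests. MD4 is broken.
Don't use it. MD5 is broken for collision resistance too.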
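To put numbers on "quite big", the birthday bound gives the
approximate number of random inputs needed before a collision
becomes more likely than not. The sketch below assumes an ideal
hash function; the known attacks on MD4 and MD5 find collisions far
more cheaply than this bound suggests. The function name is made up
for this example.

```python
import math


def birthday_bound_log2(digest_bits: int) -> float:
    """log2 of the number of random inputs giving about a 50% collision chance.

    Uses the standard approximation n = sqrt(2 * ln(2) * 2**digest_bits),
    which only holds for an ideal hash function.
    """
    return 0.5 * (digest_bits + math.log2(2 * math.log(2)))


for name, bits in [("md5", 128), ("sha1", 160), ("sha256", 256), ("sha512", 512)]:
    print(f"{name}: about 2^{birthday_bound_log2(bits):.1f} inputs")
```

Every extra bit of output only adds half a bit to the work an
attacker must do, which is why longer digests are preferred.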
Cryptographic hashing is important for many reasons. The first
is that it is key to generating digital signatures. In
simplified terms, a signature is a message digest of the input
message transformed with the signer's private key, which anyone
holding the matching public key can check.
Then you have something called HMAC, or hash-based message
authentication code, where a secret key is used for generating
message digests. Plain message digests do not employ any
secret information. They are completely open. Anyone can generate
cryptographic hashes since the algorithm is well known, there
are no keys and given the input, the output is fixed.
This is alright when we want to detect accidental changes or
check the integrity of file transfers. But it does not protect
us from malicious tampering, since whoever alters the message can
simply recompute the hash. For that the hash has to be protected
with a key, for instance by signing it, or by appending it to the
message and encrypting both. That way we can detect tampering.
HMAC takes a different approach. In this method the secret key
is mixed into the hash computation itself, so only someone who
possesses the secret key can generate or check the hash.
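A minimal Python sketch of the idea, using the standard hmac and
hashlib modules. The key, the messages and the helper names make_tag
and verify_tag are invented for illustration, and SHA-256 is just one
possible choice of underlying hash.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over message using the shared secret key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"shared-secret-key"  # hypothetical key known to both sides
message = b"transfer 100 to alice"
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                    # True
print(verify_tag(key, b"transfer 900 to alice", tag))   # False: message altered
print(verify_tag(b"wrong-key", message, tag))           # False: key not known
```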
HMAC is widely used in TLS (formerly SSL), which secures the
web. So we have already seen several applications for message
digests or cryptographic hashes.
There is one important detail however. Public key cryptosystems,
in particular the most widely used one, RSA, rely on
cryptographic hashes in an interesting way: in practice it is the
hash of a message that gets signed, not the message itself.
RSA is a little complicated to explain in this article but
my idea is to illustrate that cryptographic hashes have a much
bigger role to play than simple integrity checking.
Cryptographic hash functions are also known as one way hash
functions, which is to say that the function is not practically
reversible. There is no feasible way to compute an input from an
output. You can only get an output from input, never the other
way round.
RSA is built on a related idea, a one way function with a
trapdoor: easy to compute, and hard to invert unless you hold the
private key. RSA relies on the prime number factorization problem.
So the idea here is that you can multiply two large prime numbers
trivially but you cannot easily recover them from their product.
You can of course, but not without a prohibitive computational
overhead.
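The asymmetry can be seen in a toy Python sketch with textbook-sized
numbers. The primes 61 and 53 and the exponent 17 are chosen only
for illustration; real RSA uses primes hundreds of digits long plus
padding, which this sketch omits entirely.

```python
# Toy RSA with tiny primes. Requires Python 3.8+ for pow(e, -1, phi).
p, q = 61, 53
n = p * q                    # public modulus: multiplying is trivial
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent, needs p and q to compute


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


def factor(modulus: int) -> tuple:
    """Recover the two primes by trial division.

    Instant for a 4 digit modulus, hopeless for a real RSA modulus.
    """
    candidate = 2
    while candidate * candidate <= modulus:
        if modulus % candidate == 0:
            return candidate, modulus // candidate
        candidate += 1
    raise ValueError("modulus is prime")


print(n, d)                      # 3233 2753
print(decrypt(encrypt(65)))      # 65
print(factor(n))                 # (53, 61)
```

Anyone who can factor n can recompute d, so the security of the
private key rests on factoring being impractical at real key sizes.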
Now that we have seen enough theory, let us get to the practical
side and figure out how it can help us in real life. After all math
has a great real life significance.
$ cksum /etc/passwd
3171604895 3646 /etc/passwd
$ cksum -a sha512 /etc/passwd
SHA512 (/etc/passwd) =
b4d6a742cada5305686832f1037b60f79b56fe6dfdf9904e6070295e7
4c2341535db26b731e27e04a73f0cb70bb589d31b8e9e18e207a8aae5
aa81d06ea29f5a (above line wrapped - pretty darned long)
$ cksum -a sha256 /etc/passwd
SHA256 (/etc/passwd) =
fd2626c043a288c0a25bdc9772af4b19e001c890e93170a4de40043a9516e94a
$ cksum -a rmd160 /etc/passwd
RMD160 (/etc/passwd) = 70455b60aad955b556aa4052af017ecf61bfe5a1
$ cksum -a md5 /etc/passwd
MD5 (/etc/passwd) = e7818836fe36fa1fde1ab6198ee2da77
$ cksum -a sha1 /etc/passwd
SHA1 (/etc/passwd) = 9c80fdd62b69909a705c64fe79b6294d778ffef6
The cksum(1) utility helps with the collision issue mentioned
above since it gives you access to several checksum algorithms
from one command/utility. So if you are paranoid, kindly compare
both the sha1 and sha256 sums of the same file at both sides
after transfer. A changed file that still matches under two
different algorithms at once is far less likely than a match
under one.
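Where cksum(1) is not at hand, the same comparison can be sketched
in Python with the standard hashlib module. The function names, the
chunk size and the temporary file names below are assumptions made
for this example.

```python
import hashlib
import os
import tempfile


def file_digests(path, algorithms=("sha1", "sha256"), chunk_size=65536):
    """Read the file once in chunks and return a dict of hex digests."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}


def same_file(path_a, path_b):
    """True only if every digest matches for both files."""
    return file_digests(path_a) == file_digests(path_b)


# Demonstration with throwaway files standing in for the two sides.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original")
    copy = os.path.join(tmp, "copy")
    damaged = os.path.join(tmp, "damaged")
    for path, data in [(original, b"abc"), (copy, b"abc"), (damaged, b"abd")]:
        with open(path, "wb") as f:
            f.write(data)
    digests = file_digests(original)
    copy_ok = same_file(original, copy)
    damaged_ok = same_file(original, damaged)

print(digests)
print(copy_ok, damaged_ok)  # True False
```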
openssl(1) also comes with built in access to several checksum
algorithms, and so do the sha1 and md5 commands under OpenBSD.
But cksum has the advantage of supporting many algorithms. Moreover
all these utilities come with base OpenBSD. There is never a need to
install any specific package. In other words you are guaranteed to
find them, on any OpenBSD box!
Note on Authorship:
This article was contributed by Girish Venkatachalam who
is also a co-author on Denny's OpenBSD Newbies Blog.