One of the fundamental pillars of security is cryptographic hashing, also referred to as a hash function or, more loosely, a checksum. You will often hear it described as “a one-way mathematical function”. But what does that mean, and why should you care?
Hashing is often compared to encryption, though it’s actually very different, and the use cases for the two are almost entirely mutually exclusive. I prefer to think of a hash as a fingerprint, and as we know, fingerprints have many uses in security.
Encryption is a way to “encode” data so that it is unreadable to anyone who doesn’t have the key to decrypt it, which makes it natively a two-way function. The sending party encrypts the data and the receiving party decrypts it. Since no one else has the key (at least theoretically), the information is safe whether it’s stored or in transit.
Hashing is more like one-way encryption. A word, file, password, database or even an application can be hashed, but the function is called one-way because having a hash doesn’t allow you to directly obtain the original data. (More on this later.)
Let’s look at an example of hashing to clarify this concept. Say I have a password; for my example I will use “Welcome2018”, a password I reserve for only my most secure accounts, so please don’t share it with anyone. I’ll hash this phrase with the MD5 hash function, one of many available hashing algorithms. MD5 is now considered compromised and should not be used in production, but it is excellent for demonstrating the concept of hashing. I would get the following output: 5c2620e4fb5a5228ec435e0a592b6b57
This looks nothing like the original because it went through a function that maps any input, whatever its length, to a fixed-size 128-bit hash, written here as 32 hexadecimal characters.
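To make this concrete, here is a minimal sketch in Python using the standard library’s hashlib module. The variable names are my own, and MD5 is used only because it is the algorithm in this example, not because it is a good choice today. Running it lets you check the digest quoted above for yourself.

```python
import hashlib

password = "Welcome2018"  # the example password from the text

# Hash functions operate on bytes, so encode the string first.
digest = hashlib.md5(password.encode("utf-8"))

hex_digest = digest.hexdigest()
print(hex_digest)               # the hash as hexadecimal text
print(len(hex_digest))          # 32 characters
print(digest.digest_size * 8)   # 128 bits
```

Whatever input you pass in, short or long, the output length stays the same.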
So why do we need hashes? Well, for security specifically there are three main uses, and all of them are based on the same concept: an effective way to quickly fingerprint something.
A simple use case for this might be the storage of passwords in a database. Encryption wouldn’t work well here, because we want to check whether a user’s password matches without any employee being able to learn the users’ passwords, and anything encrypted can be decrypted by whoever holds the key. Storing the password in a hashed format is a good way of obfuscating it and preventing it from being revealed easily: at login, the submitted password is hashed and compared to the stored hash.
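Here is a rough sketch of that store-and-compare flow in Python. Note that I have swapped in a salted, deliberately slow hash (PBKDF2 from the standard library) rather than a plain MD5, since that is the usual way to hash passwords; the function names and the iteration count are my own assumptions, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store instead of the plaintext password."""
    salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt and compare it to the stored digest."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest takes the same time whether or not the values match
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = hash_password("Welcome2018")
print(verify_password("Welcome2018", salt, stored))  # True
print(verify_password("welcome2018", salt, stored))  # False
```

The database only ever holds the salt and the digest, so nobody reading the table sees the password itself.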
Hashes can also be used to verify the integrity of a file. For example, you are sending a file to a friend and want to give them a way of verifying that the file has not been tampered with. One way to do this is to simply take a hash of the file. You would then send this hash to your friend via a different medium; for example, if you sent the file via email, you might share the hash via SMS. Your friend would then hash the file they received and check that it matches the hash you sent.
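The exchange described above could look something like this in Python. This is only a sketch: I am assuming SHA-256 as the hash, the helper names file_sha256 and matches_expected are made up for illustration, and a throwaway temporary file stands in for the file being sent.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with one received through another channel."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())


# Demo with a throwaway file standing in for the file you send.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "report.txt"
    sample.write_bytes(b"quarterly numbers")
    shared_hash = file_sha256(sample)           # sender shares this separately
    intact = matches_expected(sample, shared_hash)
    sample.write_bytes(b"quarterly numbers!")   # simulate tampering
    tampered_ok = matches_expected(sample, shared_hash)

print(intact, tampered_ok)  # True False
```

Reading in chunks keeps memory use flat, which matters for large files such as disc images.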
Another use case for hashing is verifying that a large file wasn’t corrupted during download. This is especially helpful to IT professionals. Back when I was a systems engineer I encountered one such case. A colleague of mine had burned a disc with an image of VMware. We installed dozens of servers from this disc without any issue. Suddenly, on one install, we began having phantom issues. After investigating with VMware support, it turned out the disc image had become corrupted at some point, and that was the source of our issues. Verifying its hash would have prevented all of these issues.
In security specifically, hashes form the basis of many antivirus applications as well as incident response tools. Hashes are used to determine whether a file really is what its name says it is.
Traditional antivirus is based almost entirely on hashes, or as I prefer to think of it, the fingerprint of the file. A file is either whitelisted or blacklisted based on the behavior it displays in a lab setting, which gives it either a pass or a fail. That verdict is turned into “signatures”, collections of information that include hashes. Those signatures are updated frequently because malware changes frequently. When you open a file on your computer, a hash of it is taken and compared to this list to give a pass or fail determination. Thankfully, most new antivirus applications use more information than this to make a determination, but the hash is still widely used as one factor in this decision in almost every AV.
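The lookup step can be sketched in a few lines of Python. Everything here is hypothetical: the blocklist holds a single made-up entry, SHA-256 is my assumption for the hash, and a real product works from a far larger, frequently updated signature database.

```python
import hashlib

# Hypothetical signature list: SHA-256 digests of known-bad files.
KNOWN_BAD_HASHES = {
    hashlib.sha256(b"pretend this is malware").hexdigest(),
}


def is_known_bad(file_bytes: bytes) -> bool:
    """Return True if the content's fingerprint appears in the blocklist."""
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_BAD_HASHES


print(is_known_bad(b"pretend this is malware"))   # True
print(is_known_bad(b"pretend this is malware."))  # False: one byte changed
```

The second call shows why a hash alone is not enough: changing a single byte produces a different fingerprint, so the altered file no longer matches the list.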
© 2020 Iospa Tech LLC. All Rights Reserved. Various trademarks held by their respective owners.