A checksum is a string of letters and numbers generated by running a cryptographic hash function against a file. You can use this output, or checksum, to verify that a file is genuine and error-free and has not been changed from its original source.

A checksum is a small-sized datum derived from a block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage.

-Wikipedia

A Basic Checksum Example

One of the most commonly used Linux commands for creating a checksum from a file is md5sum. The md5sum command uses the MD5 message-digest algorithm to produce a 128-bit hash value from the contents of a file.

Here is an example. Let's take the string "putorius" and generate a checksum from it.

$ echo -n putorius | md5sum
6d011e0c5c6198b635e3e08973ff1339  -

The long string of characters above is the checksum. It is followed by a dash, which in this case stands for standard input (STDIN). If you were running a checksum against a file, the name would be listed instead of a dash. Now, let's change the input string slightly by capitalizing the letter P.

$ echo -n Putorius | md5sum
676d4de5953fcfa91f31b1ffc762e410  -

As you can see, md5sum generated a completely different checksum, even with a simple change.
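The same sensitivity explains why the examples above pass -n to echo: without it, echo appends a newline, and that one extra byte is hashed too. The sketch below illustrates this using printf, which only writes a newline when asked to. The variable names are made up for this illustration.

```shell
# printf '%s' writes the string with no trailing newline (the same bytes as echo -n)
without_newline=$(printf '%s' putorius | md5sum)

# printf '%s\n' appends a newline, which is what a plain echo would do
with_newline=$(printf '%s\n' putorius | md5sum)

# The two lines printed here show different checksums
echo "no newline:   $without_newline"
echo "with newline: $with_newline"
```

If a checksum of a string does not match the value you expected, a stray trailing newline is a likely cause.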

Running a Checksum on a File

Running a checksum on a file is simple. Just invoke md5sum followed by the name of the file.

$ md5sum harrison-bergeron.txt 
5ec9ea693fafdad2379b838f392b631e  harrison-bergeron.txt

Here we generated a checksum of a text file containing all 185 lines of the short story Harrison Bergeron by Kurt Vonnegut. If we edit the file and change one character, the checksum will change. Let's change one random letter G in the file to a letter R and run the checksum again.

$ md5sum harrison-bergeron.txt 
7c5efdee1fbef2fcff87706f330f8597  harrison-bergeron.txt

Even though we only made a very small change in the file, the checksum is dramatically different. This allows you to easily notice that the file has changed.
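Rather than comparing the two strings by eye, you can save the first checksum to a file and let md5sum do the comparison with its -c (check) option. The following is a minimal sketch of that workflow. The file name sample.txt and its contents are made up for illustration and stand in for the story file above.

```shell
# Work in a scratch directory so no real files are touched
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf 'Good checksums catch small changes.\n' > "$sample"

# Record the checksum of the original file
md5sum "$sample" > "$workdir/sample.md5"

# Verify: the file is unchanged, so this reports OK and exits with status 0
md5sum -c "$workdir/sample.md5"

# Change the first letter G to a letter R
sed 's/G/R/' "$sample" > "$sample.tmp" && mv "$sample.tmp" "$sample"

# Verify again: --status suppresses output, so only the exit status is used
if md5sum -c --status "$workdir/sample.md5"; then
    result="unchanged"
else
    result="changed"
fi
echo "File is $result"

# Clean up the scratch directory
rm -rf "$workdir"
```

Because md5sum -c reports the outcome through its exit status, the same check is easy to use inside a script.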

Choosing a Hashing Algorithm

An in-depth analysis of hashing algorithms is beyond the scope of this tutorial. However, it is important to know that not all algorithms are created equal. For example, MD5 has well-documented weaknesses, such as practical collision attacks (more info in the resources and links section below). The GNU Core Utilities package that provides md5sum also includes utilities for other algorithms, some of which produce longer hash values. The following utilities all work the same way (or very similarly) and should be available on most Linux systems.

  • b2sum - Uses the BLAKE2b message digest
  • cksum - Computes a CRC checksum and counts the bytes in a file
  • md5sum - Uses the MD5 message digest
  • sha1sum, sha224sum, sha256sum, sha384sum, sha512sum - Use the SHA-1 and SHA-2 message digests
  • sum - Computes a checksum and counts the blocks in a file
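To get a feel for the different hash sizes, the sketch below runs a few of these utilities against the same input and reports the size of each digest. It relies on the fact that each hexadecimal character encodes 4 bits. The function name digest_bits is made up for this illustration.

```shell
# Print the digest size, in bits, produced by a given checksum command.
# Each hexadecimal character in the digest encodes 4 bits.
digest_bits() {
    digest=$(printf '%s' putorius | "$1" | cut -d ' ' -f 1)
    echo $(( ${#digest} * 4 ))
}

# Compare the digest sizes of several of the utilities listed above
for cmd in md5sum sha1sum sha256sum sha512sum; do
    echo "$cmd: $(digest_bits "$cmd") bits"
done
```

This reports 128 bits for md5sum, 160 for sha1sum, 256 for sha256sum, and 512 for sha512sum.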

SHA-256 (sha256sum) and SHA-512 (sha512sum) are generally considered secure and are recommended for most applications. For instance, CentOS and Kali Linux both publish SHA-256 checksums for checking the integrity of their ISO files.

Example using sha256sum:

$ sha256sum harrison-bergeron.txt 
5f110a42988f80e119de8aec4a0a1ec399e53c60b7027b63996c3e0458192f95  harrison-bergeron.txt

Example using sha512sum:

$ sha512sum harrison-bergeron.txt 
5a3bfb9f283f940c9d383381eccb265a06f0e18a0ca4fe0b699d850054259cca0abdad5cfb8d94c70dbd2e95def0298bb4042f5bc764e8d9b223f61d1e1fc2de  harrison-bergeron.txt

Checksum Use Cases

A common use for a checksum is verifying a downloaded file. For example, let's say you want to install the latest Kali Linux. When you get to the Kali downloads page, you will notice they provide a SHA-256 checksum for each file.

[Screenshot of the Kali Linux downloads page showing the SHA-256 checksum for each file]

Once you have downloaded the file, you can check its integrity by running sha256sum and comparing the result with the hash on the website.

$ sha256sum kali-linux-2019.2-amd64.iso
67574ee0039eaf4043a237e7c4b0eb432ca07ebf9c7b2dd0667e83bc3900b2cf  kali-linux-2019.2-amd64.iso

If the strings match, your file has downloaded successfully and has not been altered since the folks at Kali created the checksum.
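Comparing two 64-character strings by eye is error-prone, so it can help to let the shell do the comparison. Here is a minimal sketch of one way to do that. The function name verify_sha256 is made up for this illustration, and the expected hash is whatever value you copied from the download page.

```shell
# Compare a file's SHA-256 checksum against an expected value.
# Usage: verify_sha256 FILE EXPECTED_HASH
verify_sha256() {
    file=$1
    # Lowercase the expected hash in case the website lists it in uppercase
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # sha256sum prints "<hash>  <file>"; keep only the hash
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)

    if [ "$actual" = "$expected" ]; then
        echo "OK: checksum matches"
    else
        echo "MISMATCH: expected $expected, got $actual" >&2
        return 1
    fi
}

# Example call, using the file and hash from the Kali example:
# verify_sha256 kali-linux-2019.2-amd64.iso 67574ee0039eaf4043a237e7c4b0eb432ca07ebf9c7b2dd0667e83bc3900b2cf
```

The function returns a non-zero status on a mismatch, so a script can stop before using a file that failed the check.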

Conclusion

Checking the integrity of a file is an important step in keeping a system secure, especially when downloading files from the internet. In this article, we discussed how to generate a checksum and how to use it for a file integrity check. Now that you know how to use checksums, you should read up on hashing algorithms.

Resources and Links