One of the things I’ve always loved about Linux and the “FOSS” BSD derivatives is that they can put the very same tools used by “professionals” into the hands of rank amateurs. Instead of playing around with a crippled facsimile of a commercial product or simulation (or obtaining a copy by questionable means) you can gain experience with the “real deal” tools and perform actual professional level work. In areas where commercial tools may be extremely expensive or even unobtainable to the lay person, open source tools may be the only way to get any “hands on” experience at all outside of the corporate/professional environment.
Digital forensics is one area where access to professional-grade, commercial tools and utilities can be difficult to come by for someone just looking to get their feet wet. Commercial packages in the domain can be very expensive (with good reason) or simply not marketed outside of government and law enforcement customers. Fortunately, the community of developers offering freely accessible and open source forensic tools is quite remarkable and with a little work and some understanding many of the capabilities of the commercial packages can be replicated using freely available code.
To that end, one tool I’m quite fond of is hashdeep, which is descended from the md5deep family of tools. Hashdeep was written by Jesse Kornblum while working as a Special Agent with the United States Air Force Office of Special Investigations. While simple in concept, it puts a wide variety of forensic investigations within reach of anyone capable of downloading the code and making use of it.
What hashdeep does is compute and output hash values for input files, but it can compute and output several different algorithms simultaneously, and even more importantly, it can do so recursively through a file system. In other words, hashdeep can traverse a file system and compute several different hashes for each file it encounters, optionally recording those hashes to a file. Additionally, hashdeep can perform hash matching, examining a file system and reporting when a file matches a provided set of hashes. It can also perform audits, where the program reports if any of the files have moved, disappeared, changed, or if a new file is found. Because it uses more than one hash algorithm on each file, it can even detect collisions where one hash value matches and the other does not.
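To make the idea concrete, here is a minimal Python sketch of the core behaviour described above: walk a directory tree and compute two hashes for every file in a single read. This is not hashdeep's code; the function names hash_file and hash_tree are made up for illustration, and only MD5 and SHA-256 are assumed because those are hashdeep's defaults.

```python
import hashlib
import os


def hash_file(path, algorithms=("md5", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for one file, reading it only once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def hash_tree(root):
    """Walk root recursively; map each regular file's path to (size, digests)."""
    records = {}
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            # Skip symlinks and special files; only plain files are hashed here.
            if os.path.isfile(path) and not os.path.islink(path):
                records[path] = (os.path.getsize(path), hash_file(path))
    return records
```

Calling hash_tree("/media/foo") on a mounted drive would return one entry per file, each carrying the file size plus both digests, which is roughly the information hashdeep records.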
If you’d like an idea of the sort of research hashdeep can make possible, have a look at the forensic analysis of the Tor Browser Bundle on OS X, Windows, and Linux performed by Runa Sandvik for the Tor Project, where hashdeep was used to detect artifacts left behind after the browser bundle was deleted.
Getting started with hashdeep is pretty easy. On most Debian-based systems, installing it is as simple as:
sudo apt-get install md5deep
Once installed, you can run hashdeep -h for a list of the options available.
I always enjoy an example, so let’s assume you have a drive you want to create a hash record of, so that later you can perform an audit and detect changes at the file system level. First, mount your drive read-only:
mount -t exfat -o ro /dev/sda1 /media/foo
(The file system type and device location will most likely be different for you.) Now we’ll tell hashdeep to walk the file system we just mounted (-r), compute MD5 and SHA-256 hashes (the default) for each file it encounters, and write the results to a file in our home directory (-W /path/):
hashdeep -r /media/foo -W ~/foo_hashdeep_log
and then unmount the drive:
umount /media/foo
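If you want to poke at the saved record yourself, a small parser helps. The sketch below assumes the log uses the HASHDEEP-1.0 text layout as I understand it: header lines beginning with %%%%, comment lines beginning with #, then one comma-separated line per file with the file name last. Treat that layout as an assumption and check it against your own log; parse_hashdeep_log is a hypothetical helper, not part of hashdeep.

```python
def parse_hashdeep_log(lines):
    """Parse HASHDEEP-1.0 style text lines into a list of record dicts."""
    # Assumed default column order; replaced if the log carries its own header.
    columns = ["size", "md5", "sha256", "filename"]
    records = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("%%%% size,"):
            # Column header line: names the fields stored in each record.
            columns = line[len("%%%% "):].split(",")
        elif not line or line.startswith("%%%%") or line.startswith("#"):
            # Blank lines, the format banner, and invocation comments.
            continue
        else:
            # The file name comes last and may itself contain commas,
            # so split only as many times as there are other columns.
            values = line.split(",", len(columns) - 1)
            record = dict(zip(columns, values))
            record["size"] = int(record["size"])
            records.append(record)
    return records
```

Passing it an open file handle for ~/foo_hashdeep_log should give you one dictionary per recorded file, which is handy for quick checks such as counting files or finding the largest ones.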
Now you can perform whatever activity you are looking to monitor with the drive elsewhere. When you are ready to audit the drive for changes, remount the drive as above. This time we’ll call hashdeep with the options -a (audit), -r (recursive), -v (verbose), and -k /path/ (the file of known hashes to audit against):
hashdeep -arvk ~/foo_hashdeep_log /media/foo
Depending on whether the files have changed, you will get output similar to this:
hashdeep: Audit passed
Files matched: 15
Files partially matched: 0
Files moved: 0
New files found: 0
Known files not found: 0
“Files matched” counts files found where they were expected, with every hash matching. “Files partially matched” counts files where only some of the hashes matched. “Files moved” counts files whose contents haven’t changed but which aren’t located where expected. “New files found” and “Known files not found” are, I hope, self-explanatory.
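The categories are easier to reason about with a toy model. The Python sketch below classifies files by comparing two mappings of path to (md5, sha256). It is a simplified illustration of the logic, not hashdeep's actual algorithm; the audit function and its handling of duplicate files are my own assumptions.

```python
def audit(known, current):
    """Count audit categories from two {path: (md5, sha256)} mappings.

    known   -- records saved before the activity being monitored
    current -- records computed from the drive as it is now
    """
    counts = dict.fromkeys(
        ["matched", "partially matched", "moved", "new", "not found"], 0)
    accounted = set()  # known paths explained by some current file
    leftovers = {}     # current files that did not match at their own path

    # Pass 1: same path and every hash identical.
    for path, digests in current.items():
        if known.get(path) == digests:
            counts["matched"] += 1
            accounted.add(path)
        else:
            leftovers[path] = digests

    # Pass 2: try to explain each leftover using the remaining known records.
    for path, digests in leftovers.items():
        moved_from = [p for p, d in known.items()
                      if d == digests and p not in accounted]
        partial = [p for p, d in known.items()
                   if d != digests and any(a == b for a, b in zip(d, digests))]
        if moved_from:
            # Same content, different location.
            counts["moved"] += 1
            accounted.add(moved_from[0])
        elif partial:
            # One hash agrees and the other does not: a possible collision.
            counts["partially matched"] += 1
        else:
            counts["new"] += 1

    # Known records that nothing on the drive accounted for.
    counts["not found"] = len(set(known) - accounted)
    return counts
```

Note that in this model a file edited in place shows up twice: once as a new file (its hashes are unknown) and once as a known file not found (the old hashes no longer appear anywhere).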
If you play around with the test file system and re-run the audit you can see how hashdeep will report different types of changes and get an idea for what it can detect and what it can’t. Two quick notes regarding the example above: As configured here, hashdeep wouldn’t detect the creation or deletion of an empty directory, which could be meaningful to you. It also only examines the file as reported by the operating system, so it wouldn’t detect changes occurring at a lower level, like in slack space.
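If empty directories matter to you, one way to cover that gap is to record them separately and compare the lists before and after. This is a minimal sketch of my own, not a hashdeep feature; empty_directories is a hypothetical helper name.

```python
import os


def empty_directories(root):
    """Return sorted paths, relative to root, of directories that contain nothing."""
    empties = []
    for directory, subdirs, filenames in os.walk(root):
        # A directory is empty when it holds no files and no subdirectories.
        if not subdirs and not filenames:
            empties.append(os.path.relpath(directory, root))
    return sorted(empties)
```

Saving the result of empty_directories("/media/foo") alongside the hashdeep log, and comparing it with a fresh listing at audit time, would reveal empty directories that were created or removed in between.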
