Digital forensic examiners rely on their expertise to interpret the data their tools retrieve. Doing so requires the utmost trust in the tools themselves, and it assumes the tools are doing the job correctly. Not knowing how the tools do that job, because their underlying code is inaccessible, as is the case with proprietary digital forensics tools, creates a veil of abstraction between examiners’ minds and the truth. Each layer of abstraction is a possible source of error or distortion.
That isn't to say the conscientious examiner needs to cease use of any and all proprietary tools. However, it is important to validate what they find—to make sure results are repeatable (identical items tested by the same examiner, in the same lab, using the same equipment and methodology) and reproducible (identical items tested by different examiners, in different labs, using different equipment and methodology).
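One baseline repeatability check is a cryptographic hash of the evidence image: if two runs, or two examiners, produce the same digest for the same image, they examined the same data, and any disagreement lies in interpretation rather than acquisition. A minimal sketch in Python (the function name is illustrative, not drawn from any particular tool):

```python
import hashlib

def sha256_of_image(path, chunk_size=1 << 20):
    """Stream a disk image through SHA-256 so arbitrarily large
    files can be hashed without loading them into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Two independent runs against the same evidence image should
# produce the same digest; a mismatch means the underlying data,
# not just its interpretation, differs.
```

Recording these digests alongside test dates makes the repeatability of a result easy to demonstrate later.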
Performed at regular intervals, reproducible tests typically use one proprietary tool to validate another. For the purposes of the legal test known as the Daubert standard, this is usually enough. However, the more attorneys on both the civil and criminal sides learn about digital evidence, the more they may question how forensic tools actually obtain their data. If the science underlying the evidence cannot be explained, it cannot be accepted as science, and the credibility of both the digital evidence and the digital forensic examiner will be undermined.
In some cases, the engineers who design proprietary forensic tools have been brought to court to testify as to how the tools work. However, this should only be a last resort; the designers of every proprietary tool an examiner uses may be unavailable, or prohibitively costly to bring to trial. There’s an easier, faster, and cheaper way to validate findings that use these tools. That way is to use open source forensic tools.
What is open source software, and why use it for validation?
True open source software is freely redistributable, provides access to the source code, allows the end user to modify the source code at will, and doesn’t restrict the software’s end use. Of particular interest for validation purposes is access to source code.
Innately, open source forensic tools “show their work.” You can execute the tool, examine the options and output, and finally examine the code that produced the output to understand the logic behind the tool’s operation.
This means, of course, that it’s possible to use open source tools for digital forensic examinations—not just validation alone. However, where multiple examiners are collaborating or it’s standard in a lab to use a particular proprietary tool for all exams, it may be easier to start with that tool and rely on open source as a backup.
Another benefit to using open source software is access to a much broader community of examiners, developers, and enthusiasts who can answer questions and assist with research. Peer review is one of the five tests of the legal Daubert standard, which allows for novel scientific testing (digital or physical) to be admitted in court.
While some community interaction is possible in proprietary platforms’ user forums, those communities generally allow only for sharing tips and tricks. It is unusual for vendors to fix bugs or implement user requests within hours or days, but the open source community makes this possible; examiners who are knowledgeable in coding can even fix such issues themselves.
Open source forensics when you’re in law enforcement
Using open source tools to validate findings is arguably more important for law enforcement digital forensic examiners than for any other group. This is because so many tools available to police are “law enforcement only”: not only are the tools proprietary, they are not even accessible to civilian examiners.
A law enforcement examiner who uses open source tools to validate findings from law enforcement-only tools reduces the chance of his or her credibility being attacked in court. For example, it is important to show when a forensic tool produces false positives or negatives in keyword search results. Using an open source tool adds to this explanation by allowing the examiner to show why positive and negative results were true or false.
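One concrete source of false negatives is text encoding: a keyword search that checks only ASCII will miss the same word stored as UTF-16, which is how Windows commonly stores strings. A minimal Python sketch of an encoding-aware search (the function and its behavior are illustrative assumptions, not the method of any particular tool):

```python
def keyword_hits(data: bytes, keyword: str):
    """Return (encoding, byte offset) pairs where `keyword` appears,
    checking both ASCII and UTF-16LE representations. A tool that
    searches only one encoding can report false negatives for the
    other."""
    hits = []
    for label, needle in (("ascii", keyword.encode("ascii")),
                          ("utf-16le", keyword.encode("utf-16-le"))):
        start = 0
        while (pos := data.find(needle, start)) != -1:
            hits.append((label, pos))
            start = pos + 1
    return hits
```

Running both a proprietary tool and a transparent script like this over the same data lets the examiner explain exactly why a hit was, or was not, reported.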
Trying to explain such results in court could be disastrous for the prosecution; false results from a proprietary tool in a very complex or high-profile case will only add pressure. For example, conflicts in the way two pieces of forensic browser analysis software parsed a Firefox database in the Casey Anthony trial helped the defense mount an effective counter-argument to the prosecution.1
An open source validation reduces the likelihood of such an event, and may even help examiners provide feedback to proprietary tool vendors on what’s going wrong. To minimize problems at trial, regular (quarterly or semiannual) testing is (or should be) part of every lab’s standard operating procedure, even when examiners’ time is at a premium.
There are other benefits, too. Police examiners who work in the private sector, either on the side or as a second career, cannot use law enforcement-only tools for those jobs, and major proprietary tools may be too expensive to purchase on their own. Because expertise in a specific product can lose its value outside that product, it makes sense from both a cost-saving and a validation standpoint to develop open source tool expertise.
Documenting the use of open source tools
Documenting validation procedures can follow the basic scientific method: Hypothesis, Test, Results. Examiners should record test dates and times, along with projected dates for future tests. This documentation is important both internally (such as during audits), as well as when an examiner’s process or findings are challenged in court as erroneous.
However, documentation doesn’t just come into play with validation. It is also important when the usual forensic tools fail to process an image file properly. Should the examiner need to turn to raw disk images and manual file carving, it’s important to document why, which tools were used, and how, in order to make the process as repeatable and reproducible as possible.
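As a rough illustration of what manual carving involves, the sketch below scans a raw image for JPEG start-of-image (FF D8 FF) and end-of-image (FF D9) markers. This is a teaching sketch under simplifying assumptions, not a production carver; real carvers also handle fragmentation, embedded thumbnails, and structure validation:

```python
def carve_jpegs(image: bytes, max_size=10 * 1024 * 1024):
    """Carve contiguous JPEG candidates from a raw image by pairing
    each SOI marker (FF D8 FF) with the next EOI marker (FF D9).
    Candidates over max_size are discarded as implausible."""
    SOI, EOI = b"\xff\xd8\xff", b"\xff\xd9"
    carved, pos = [], 0
    while (start := image.find(SOI, pos)) != -1:
        end = image.find(EOI, start + len(SOI))
        if end == -1:
            break  # opening marker with no closing marker
        end += len(EOI)
        if end - start <= max_size:
            carved.append(image[start:end])
        pos = end
    return carved
```

Noting the marker definitions, offsets searched, and assumptions like these in the case file is exactly the kind of documentation that makes a manual process reproducible.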
Open source forensic tools may not be easy to work with, but can save a lot of grief down the road when used to validate results from proprietary tools. Even lacking their own programming expertise, forensic examiners can rely on the development community to fill in where they are lacking. Ultimately, being able to explain why you got the results you did can only bolster your case, strengthen your credibility, and even lead to better case law—at least where digital evidence is concerned.
How much programming expertise do you need to use open source?
“Examine the code that produced the output” would appear to mean that you need programming experience to understand what’s going on. What do you do when you’re not a programmer?
In a forensic context, open source software is distributed in source form. Either the examiner needs to generate executable code to run the software, or the examiner needs to use an interpreter to run scripting tools. (Examiners who are familiar with EnCase and EnScripts will already have experience with the latter scenario.)
In “Digital Forensics with Open Source Tools,” we describe both scenarios, taking examiners through a build of a Linux forensic examination system and then, throughout the book, showing examples of scripts that use Perl, Python, and Ruby scripting languages.
If you are not a programmer, you may find it beneficial to connect with a forensic examiner who is. If you cannot do this locally, in your own lab, or through a nearby professional association, consider taking courses in your language of choice. We recommend cross-platform, open source languages that are in active use for your topic of interest and have libraries supporting it. For open source digital forensics, the main languages are C/C++, Perl, and Python (and, to a lesser extent, Java). Freely available resources such as diveintopython.org, or OpenCourseWare lectures such as those from the Massachusetts Institute of Technology, can be particularly useful.
1. Buckles, Greg. “What Went Wrong in the Casey Anthony Browser Analysis.” eDiscovery Journal. July 26, 2011. http://ediscoveryjournal.com/2011/07/what-went-wrong-in-the-casey-anthon...
Cory Altheide is the primary author of “Digital Forensics with Open Source Tools” (2011). He has ten years of experience in information security, forensics, and incident investigations. A security engineer at Google, he has worked as an incident responder for Mandiant, IBM, and Google, and was a network forensics specialist with the National Nuclear Security Administration (NNSA). Cory has also performed a wide variety of cyber-crime investigations, ranging from corporate and state-sponsored espionage to distributed, financially motivated criminal organizations, and was a contributing author for “Handbook of Digital Forensics and Investigation” (2009). He holds the SANS GCIH and GCFA certifications.
Christa M. Miller is a public relations and communication strategist serving the law enforcement and digital forensics communities. Prior to that she worked as a freelance trade journalist specializing in issues related to high-tech crime, law enforcement technology, and management issues. She has never been a digital forensics practitioner, but understands just enough technical detail to get herself into trouble in otherwise intelligent conversations.