It Is Infeasible to Alter a Hash Value to Hide It from an Examiner

While it is possible to cause two files to have matching hash values, it is a complex process. The person creating the compromised files must have physical possession of the files to be altered, and those files must be altered before the hash algorithm is run so that a matching hash value is produced. In their vulnerability assessment, Stevens et al. showed that a known hash value cannot be targeted to produce a duplicate hash of a known file: “We cannot target a given hash value, and produce a (meaningful) input bit string hashing to that given value… colliding files have to be specially prepared by the attacker… Existing files with a known hash that have not been prepared in this way are not vulnerable.” This is important in the use of hash sets to identify known files, because the known file hash sets have already been created independently, for example by the National Institute of Standards and Technology (NIST) in the National Software Reference Library (NSRL) Hash Sets. Known file filtering is also used to identify known contraband files; the National Child Victim Identification Program (NCVIP) Hash Sets are created to identify victim images of child sexual exploitation. It is infeasible, if not impossible, to craft a contraband image whose hash value matches an entry in a known hash set so that the set filters it out of a case in an attempt to hide it from an examiner.
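
To make the filtering step concrete, the following is a minimal sketch of known file filtering, not the workflow of any particular forensic tool. The names `sha1_hex`, `filter_known`, `KNOWN_SHA1`, and `FILES` are hypothetical, and the tiny in-memory set merely stands in for a published hash set; it does not reproduce the NSRL or NCVIP file formats.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def filter_known(files: dict[str, bytes], known_sha1: set[str]):
    """Split file names into (known, unknown) by hash set membership."""
    known, unknown = [], []
    for name, data in files.items():
        if sha1_hex(data) in known_sha1:
            known.append(name)    # hash appears in the independent set
        else:
            unknown.append(name)  # stays in the case for review
    return known, unknown


# Hypothetical stand-in for an independently published hash set.
KNOWN_SHA1 = {sha1_hex(b"abc")}

# Hypothetical case files: one matches the set, one differs by a byte.
FILES = {"system.dll": b"abc", "holiday.jpg": b"abd"}

known, unknown = filter_known(FILES, KNOWN_SHA1)
print("filtered as known:", known)
print("left for review:", unknown)
```

Because the set is built before the case exists and by a party the attacker does not control, a file is filtered out only if its digest already appears in that set, which is the targeted match that the quoted research says cannot be produced.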

Using the files produced by Stevens et al. in their MD5 collision research, I applied both the MD5 and SHA-1 hash algorithms. The resulting MD5 hashes confirmed their successful collision. The resulting SHA-1 values, however, differed from one file to the next; the collision did not carry over to the second algorithm. The use of two hash algorithms would therefore identify files specifically created to collide. Examination of the data packets confirmed the unique data appended to the files. The appended data was easily identified at the end of the file structure when compared to the original file. If it were possible to reinsert a manipulated file into a forensic image, a cascading effect would occur, changing the hash values throughout its parent folders in the image and ultimately creating a mismatch with the acquisition hash.
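
The two-algorithm check described above can be sketched as follows. This is an illustrative sketch only, not the procedure or tooling used in the test; the helper names `digests`, `file_digests`, and `classify_pair` are hypothetical. It relies on Python's standard `hashlib`, and `hashlib.md5` may be unavailable on systems restricted to FIPS-approved algorithms.

```python
import hashlib


def digests(data: bytes) -> tuple[str, str]:
    """Return (md5_hex, sha1_hex) for an in-memory byte string."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


def file_digests(path: str, chunk_size: int = 1 << 20) -> tuple[str, str]:
    """Return (md5_hex, sha1_hex) for a file, read in chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def classify_pair(a: tuple[str, str], b: tuple[str, str]) -> str:
    """Compare two (md5, sha1) pairs and describe how they relate."""
    same_md5 = a[0] == b[0]
    same_sha1 = a[1] == b[1]
    if same_md5 and same_sha1:
        return "match on both algorithms"
    if same_md5:
        # The pattern reported above for the prepared colliding files.
        return "MD5 match, SHA-1 mismatch: possible prepared collision"
    if same_sha1:
        return "SHA-1 match, MD5 mismatch"
    return "different files"


# Two ordinary inputs that differ by one byte disagree on both digests.
print(classify_pair(digests(b"abc"), digests(b"abd")))
```

For a pair of files on disk, `classify_pair(file_digests(path_a), file_digests(path_b))` applies the same comparison; the paths are whatever the examiner supplies.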

From The Hash Algorithm Dilemma–Hash Value Collisions by Don L. Lewis