Content awareness is getting a lot of buzz these days, particularly with regard to eDiscovery tools, data leak prevention (DLP) tools, and other automated tools designed to prevent accidental or malicious disclosure of sensitive information, or to discover information for legal proceedings.
But what does content awareness really mean? There is no universally accepted definition, so it means different things to different people.
Essentially, content awareness is the ability to recognize what data actually contains, whether that data is at rest or in motion.
As a very simple example, the data could be a string of numbers that could be a phone number, an account number, a credit card number, a geographical coordinate, etc. The data could also be a string of alphanumeric characters that could be a name, address, date of birth, etc.
Depending on the content, the data could be completely innocuous and of little value or it could be very sensitive and of very high value to a competitor or criminal.
Thus, content awareness has become a very important attribute for automated tools designed to prevent data leakage, whether accidental or malicious, or to discover relevant information for an eDiscovery engagement.
Automated tools with content awareness capabilities typically implement it through sophisticated keyword search and pattern matching algorithms. Some tools have to be “trained,” which can mean a lengthy process of developing relevant keyword lists and number patterns.
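To make the keyword-and-pattern idea concrete, here is a minimal sketch in Python. It is not how any particular DLP or eDiscovery product works. The digit pattern, the keyword list, and the function names are all assumptions made up for illustration. It flags strings of 13 to 16 digits, filters them with the Luhn checksum that payment card numbers satisfy, and notes whether a keyword appears nearby.

```python
import re

# Assumed pattern: 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

# Hypothetical keyword list; a real tool would build such lists during "training".
KEYWORDS = ("card", "visa", "mastercard", "account")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return Luhn-valid card-like matches and whether a keyword is within 40 characters."""
    hits = []
    lowered = text.lower()
    for m in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if not luhn_valid(digits):
            continue        # right length, wrong checksum: probably not a card number
        window = lowered[max(0, m.start() - 40): m.end() + 40]
        hits.append({
            "match": m.group(),
            "keyword_nearby": any(k in window for k in KEYWORDS),
        })
    return hits


if __name__ == "__main__":
    sample = "Please charge my Visa card 4111 1111 1111 1111, or call 555-123-4567."
    print(find_card_numbers(sample))
```

In the sample text, the widely used test card number is flagged, while the phone number is ignored because it is too short to match the pattern. Note that a sketch like this only sees data that is sitting in plain view.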
So, you begin using an automated tool with state-of-the-art content awareness capability and then all is well, right? Not so fast.
Human nature being what it is, and this comes straight out of Psych 101, people (i.e., trusted insiders) who are doing something they know they shouldn’t be doing will attempt to conceal their actions! Imagine that: people doing bad things and not wanting to get caught.
Presume the trusted insiders have been advised of unacceptable activities, as they should have been by way of a clearly articulated Acceptable Use Policy. Presume also that they have been advised their activities will be monitored, as they should have been by way of a signed Consent to Monitoring Agreement that explicitly removes any expectation of privacy. Those with malicious intent should then be expected to go to great lengths to conceal their activities and avoid a visit, possibly an extended one, to the Crossbar Hotel (i.e., jail).
So the malicious insider, seeking to avoid being caught, pops open a browser and Googles something really clever like “information hiding” and, lo and behold, gets around 7,000,000 hits! Yup, that’s 7 M-I-L-L-I-O-N!
Many of the links returned by the search engine will direct the insider to web sites with digital steganography applications.
Digital steganography is an Internet-era version of an information hiding technique that dates back to ancient Greece, when covert communication was accomplished by scraping the wax from a wax tablet, scribing the message directly onto the wood, and then covering the tablet with wax again to get the message past Persian guards. But I digress.

