Sitting back and looking at his security controls matrix, George feels comfortable with the trustworthiness of the systems on which he expects sensitive information to reside. His database servers are located on segments locked down and monitored by unified threat management (UTM) devices. The NAS where he expects unstructured data (e.g., Word and Excel files) to be stored is encrypted. Data in motion is also protected, with nothing leaving the boundaries of his network in clear text. But a nagging feeling deep in his gut tells him something is missing. Then it hits him. What if users don’t put data where he expects? Does he already have PII or ePHI sitting in risky storage? The worst of it, George realizes, is that he has no tools to help him answer these questions.
George’s situation isn’t unique. Across the globe, security managers working for medium and large organizations are asking themselves these same questions. The most common barrier to answering them is the absence of an effective data discovery tool. Most of us have looked at data leakage prevention (DLP) solutions, but the cost is often high. Further, DLP solutions often provide little value beyond the security controls matrix. If you’ve done your job and achieved SOX or HIPAA compliance (an assertion verified by external auditors), you may find it hard to get approval for additional dollars for a security-only solution. But there may be another way. Why not demonstrate to executive management that the proposed solution will not only solve multiple security problems but also address an increasingly painful business challenge: e-discovery?
The DLP products I’ve seen were largely designed for just that: DLP. E-discovery is typically added as an afterthought in response to growing market demand. However, when I looked at solutions designed specifically for e-discovery, I made an interesting discovery: not only were they designed to discover and deal with data at rest, but in most cases they also cost much less.
One of my favorite e-discovery solutions is StoredIQ’s Intelligent eDiscovery module. The module runs on a network appliance and works without an agent installed on target systems. Based on the EDRM model, it performs the following tasks (from the StoredIQ product Web page):
- Scanning
Targeted scanning is available by custodian, path, share, server, modify date, and additional key metadata. Expansive scanning helps prove that any potentially relevant ESI [Electronically Stored Information] was not missed.
- Identification by content and metadata
StoredIQ Intelligent eDiscovery provides topology mapping of potentially relevant ESI by sources, key player names, date ranges, keywords and document types.
- Collection and preservation
Data objects are copied to central repository, with no alteration of system or object metadata. An audit trail of the copy process is developed that supports chain of custody and authenticity. Original, full-object path, SID and ACL information is properly maintained.
- Indexing and searching
StoredIQ Intelligent eDiscovery performs content-level culling by full-text indexing your preserved data collection. Data is culled based on input from legal counsel regarding potentially relevant document sources, key player names, date ranges, keywords, phrases, metadata, classifications, concept tags or document types.
- Review-ready output
Users can produce review ready output of native files with Concordance or standards-based XML load files. The product supports all rolling productions, allows subsequent collections to be compared to prior productions, and permits only the new documents to be produced.
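To make the scanning and identification steps in the list above more concrete, here is a minimal Python sketch of the general idea: walk a file share, narrow the targets by modify date and file type, and flag files whose content matches a sensitive-data pattern. This is not how StoredIQ or any other product is implemented. The function name `scan_for_ssn`, the SSN-style regular expression, and the default file suffixes are all assumptions chosen for illustration, and a real tool would also have to parse Word, Excel, and other binary formats.

```python
import re
import time
from pathlib import Path

# Assumed pattern: U.S. Social Security numbers written as 123-45-6789.
# Real discovery tools use far richer classifiers than a single regex.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_for_ssn(root, modified_since=0.0, suffixes=(".txt", ".csv")):
    """Return {path: match_count} for files under root that contain SSN-like strings.

    modified_since is a POSIX timestamp; files last modified before it are skipped.
    """
    hits = {}
    for path in Path(root).rglob("*"):
        # Targeted scanning: restrict by file type and modify date.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        if path.stat().st_mtime < modified_since:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            # Unreadable files are skipped here; a real tool would log them.
            continue
        # Identification by content: count pattern matches in the file body.
        count = len(SSN_PATTERN.findall(text))
        if count:
            hits[str(path)] = count
    return hits
```

Calling `scan_for_ssn("/mnt/share", modified_since=time.time() - 90 * 86400)` on a hypothetical mount point would list text files changed in roughly the last 90 days that appear to hold SSNs, which is the kind of answer George is missing.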
By itself, the eDiscovery module can locate files with sensitive information and let you know if their locations present high risk. With the addition of the vendor’s Information Governance module,
Policies can be defined to associate an appropriate action (such as retain or secure) and apply it to positively identified and classified objects. Policies are symmetrically scaled across the StoredIQ platform to improve performance and scalability. Deep policy auditing at the individual item level is also supported (Information Governance product Web page).
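The policy idea in that description can be sketched in a few lines of Python. This is only an illustration of associating an action with a classification and keeping an item-level audit record; the `Policy` class, the `plan_actions` function, and the classification labels are hypothetical and are not taken from the vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    classification: str  # label assigned to an object, e.g. "ssn" (assumed)
    action: str          # action to apply, e.g. "retain" or "secure"


def plan_actions(classified_objects, policies):
    """Build an item-level audit list mapping each classified object to an action.

    classified_objects is an iterable of (path, classification) pairs.
    Objects whose classification has no policy get the action "none".
    """
    action_by_class = {p.classification: p.action for p in policies}
    audit = []
    for path, classification in classified_objects:
        audit.append({
            "path": path,
            "classification": classification,
            "action": action_by_class.get(classification, "none"),
        })
    return audit
```

The sketch only plans actions; actually moving, encrypting, or retaining files is left out on purpose, since that depends entirely on the storage platform.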
StoredIQ isn’t the only solution that offers this dual functionality. McAfee, through its acquisition of Reconnex, offers a similar solution. The McAfee product, however, is more DLP focused. It can not only find files at rest but also identify sensitive data in motion. McAfee claims it will integrate the Reconnex functionality into its centralized management product, ePolicy Orchestrator, by the end of 2009 or early 2010.
Both of these solutions provide DLP and e-discovery functionality at some level, so it might make sense to speak with your legal team before you try to make a case for a data discovery tool. Consider their e-discovery challenges when building your requirements and business value analysis presentations. You should be able to spread the cost across multiple challenges, thereby enhancing the value of your solution. You might also be able to enlist the legal department as an ally. Altogether, you just might have enough to convince the signer of the checks that he or she is making a good business investment, not just incurring another security expense.