Data culling

eDiscovery requests tend to be a mixed blessing: they often provide invaluable evidence for your case (everything from alibi confirmation to a “smoking gun”), but this evidence is inevitably accompanied by an avalanche of extraneous information. Because review is the costliest and most time-consuming part of eDiscovery, it is crucial to reduce your data set before review in order to get the most from your resources.

When asked how he knew what to chisel away when carving a statue, Michelangelo replied, “I saw the angel in the marble and carved until I set him free.” The same principle can be applied to data culling. Understanding what is most important to your case will help you identify and eliminate unnecessary data, leaving you with only the most relevant information.

Three Types of Data Culling

Put simply, data culling is the process of searching and isolating data based on specific criteria, such as date ranges and keywords. It combines several techniques to remove as many non-relevant documents from the collection as possible before processing and review. The three most common methods of data culling are:

  • DeNISTing removes the “junk” data that can clog up your review. This includes program and system files that do not contain user-generated data and other file formats that generally hold no evidentiary value. The name comes from the National Institute of Standards and Technology (NIST), which publishes a reference list of known software files that is used to identify them. DeNISTing is almost always performed during culling because it is a simple way to reduce the number of non-relevant documents.
  • Dedupe (or deduplication) identifies and separates duplicate documents and emails. You can perform either a global or a custodial dedupe. Global deduping removes duplicates across all custodians, whereas custodial deduping removes duplicates only within each custodian’s own collection. The main advantage of custodial deduping is that it keeps each custodian’s collection intact, while global deduping maximizes the number of duplicate documents that are removed.
  • Search terms are used to find relevant documents. Identifying the right terms requires a thorough knowledge of the facts of the case. For example, choosing terms or date ranges that are too broad can produce a larger set of documents than is necessary. Likewise, a date range that is too narrow could produce insufficient results, forcing you to run another search. Using terms that are as unique as possible keeps the results focused on potentially responsive documents.
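
To make the three methods concrete, here is a minimal Python sketch of how they might be chained together. It is an illustration under stated assumptions, not a description of any particular eDiscovery platform: the Doc record, the function names, and the sample data are all hypothetical, documents are identified by a SHA-256 hash of their raw content (real tools often hash emails on selected metadata fields instead), and the "known file" list is a stand-in for a real reference hash set.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    """Hypothetical document record used only for this illustration."""
    custodian: str
    name: str
    content: bytes
    sent: date

    @property
    def digest(self) -> str:
        # Assumption: a content hash stands in for the document's identity.
        return hashlib.sha256(self.content).hexdigest()


def denist(docs, known_hashes):
    """Drop documents whose hash appears in a list of known system files."""
    return [d for d in docs if d.digest not in known_hashes]


def dedupe(docs, scope="global"):
    """Keep the first copy of each document.

    scope="global": one copy across all custodians.
    scope="custodial": one copy within each custodian's collection.
    """
    seen = set()
    kept = []
    for d in docs:
        key = d.digest if scope == "global" else (d.custodian, d.digest)
        if key not in seen:
            seen.add(key)
            kept.append(d)
    return kept


def search(docs, terms, start, end):
    """Keep documents inside the date range that contain any search term."""
    lowered = [t.lower() for t in terms]
    hits = []
    for d in docs:
        text = d.content.decode("utf-8", errors="ignore").lower()
        if start <= d.sent <= end and any(t in text for t in lowered):
            hits.append(d)
    return hits


# Made-up sample collection: one system file, one memo held by two
# custodians, and one unrelated note.
docs = [
    Doc("alice", "kernel32.dll", b"system file bytes", date(2020, 1, 5)),
    Doc("alice", "memo.txt", b"Merger pricing memo", date(2020, 3, 1)),
    Doc("bob", "memo.txt", b"Merger pricing memo", date(2020, 3, 1)),
    Doc("bob", "lunch.txt", b"Lunch on Friday?", date(2020, 3, 2)),
]
known_hashes = {docs[0].digest}  # stand-in for a real known-file hash list

culled = denist(docs, known_hashes)
global_set = dedupe(culled, scope="global")
custodial_set = dedupe(culled, scope="custodial")
review_set = search(global_set, ["merger"], date(2020, 2, 1), date(2020, 12, 31))
```

In this sample, the global dedupe keeps a single copy of the shared memo, while the custodial dedupe keeps one copy for each custodian who held it. That is the trade-off described above: a smaller review set versus intact custodian collections.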

Get the Most from eDiscovery with Precise

Precise offers a variety of eDiscovery solutions to meet the needs of any size case or law firm. From managed services to digital forensics, Precise has all the tools you need to optimize your eDiscovery. Call us today at 866-277-3247 to gain the Precise advantage in your next case.