In a perfect world, e-Discovery would be as simple as pointing your software at the data source, kicking back and waiting for all documents to be ingested and processed with 100% accuracy. However, in the real world, e-Discovery involves dealing with thousands of file types, some of which are very complex and cannot be automatically handled by even the most sophisticated e-Discovery platforms. Consequently, being able to perform defensible e-Discovery requires the close supervision of experienced e-Discovery experts and a well-thought-out exception handling policy.
Even though most e-Discovery projects involve image output (TIFF, JPG, PDF etc.), we find that the specifications of the output images are rarely discussed thoroughly. An important detail, which is usually omitted from e-Discovery processing specifications, is whether or not output images should be normalized.
Image normalization (in the e-Discovery sense) is the process of transforming images to make them consistent in terms of dimensions, resolution, color depth and orientation. For example, larger images can be resized to 8.5″x11″, landscape pages can be rotated to portrait, images with different resolutions can be converted to 300 DPI etc.
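To make the idea concrete, here is a minimal sketch of the decisions a normalization step might make for a single page image. The target values (8.5″x11″, portrait, 300 DPI) come from the examples above; the function name, the `PagePlan` structure and the rule of never enlarging smaller pages are our own assumptions for illustration, not a description of how any particular e-Discovery platform works.

```python
from dataclasses import dataclass

# Assumed target: 8.5" x 11" portrait at 300 DPI (taken from the examples above).
TARGET_WIDTH_IN = 8.5
TARGET_HEIGHT_IN = 11.0
TARGET_DPI = 300


@dataclass
class PagePlan:
    rotate_degrees: int  # 0 or 90
    width_px: int        # output width in pixels
    height_px: int       # output height in pixels
    dpi: int             # output resolution


def plan_normalization(width_px: int, height_px: int, dpi: int) -> PagePlan:
    """Decide how one page image would be rotated and resized (hypothetical helper)."""
    rotate = 0
    if width_px > height_px:
        # Landscape page: rotate to portrait by swapping the dimensions.
        width_px, height_px = height_px, width_px
        rotate = 90

    # Physical size of the page in inches at its original resolution.
    width_in = width_px / dpi
    height_in = height_px / dpi

    # Shrink oversized pages to fit the target; never enlarge smaller ones (assumption).
    scale = min(1.0, TARGET_WIDTH_IN / width_in, TARGET_HEIGHT_IN / height_in)

    return PagePlan(
        rotate_degrees=rotate,
        width_px=round(width_in * scale * TARGET_DPI),
        height_px=round(height_in * scale * TARGET_DPI),
        dpi=TARGET_DPI,
    )
```

For instance, a 17″x11″ landscape scan at 200 DPI (3400 x 2200 pixels) would be rotated 90 degrees and scaled down so that its long edge fits within 11″ at 300 DPI.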
Most mailboxes contain both active and deleted e-mail messages. By “deleted e-mail messages”, I am referring to messages that were permanently deleted, such as a message that was deleted using SHIFT+Delete in Outlook or a message that was deleted from the “Deleted Items” folder. In some e-mail platforms, deleted messages are not immediately purged and can easily be recovered. For example, Microsoft Outlook does not purge deleted e-mail messages from a Personal Storage Table (PST) file until the PST is compacted.
Modern e-Discovery software can extract hundreds of metadata fields from documents. Extracted metadata is typically stored in a back-end database and a subset of it is exported and included in the e-Discovery production or review database. We often receive questions regarding which metadata fields should be included in an e-Discovery review database or which metadata fields should be requested during an electronic document production.
The answers to these questions depend on the requirements of each case and should ultimately be determined by the legal team. That said, we have prepared the following field list as an example, with the hope that it will serve as a good starting point.
If you are involved in the production or review of electronic evidence, you might have seen e-mail addresses that look a bit different than usual. For example:
Have you ever wondered what these values are? The two scenarios we run into most frequently are as follows.
A vast amount of electronic evidence is transmitted every day via electronic file transfers among corporations, law firms and e-Discovery service providers. Most of these transfers involve compressing the evidence into a file archive (ZIP, RAR, 7z etc.) and transferring the resultant archive(s) over the internet. While this is usually a straightforward process, it is critical to make the right decisions and use the right tools to avoid trouble down the road.
De-duplication is used extensively in digital forensics and e-Discovery as a way of culling documents. While the process itself is simple, de-duplication can be performed in numerous ways that affect review time, cost and your understanding of the custodians. Here are some questions that frequently come up while we discuss de-duplication options with clients.
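As one illustration of how the chosen method changes the outcome, the sketch below compares two scopes: keeping a single copy across all custodians versus keeping one copy per custodian. It assumes documents are represented as hypothetical `(custodian, name, content)` tuples and that duplicates are identified by an MD5 hash of the full content; real platforms often hash selected e-mail fields instead, so treat this only as a simplified model.

```python
import hashlib


def deduplicate(documents, scope="global"):
    """Return the (custodian, name) pairs kept after de-duplication.

    documents: iterable of (custodian, name, content_bytes) tuples (hypothetical layout).
    scope: "global" keeps one copy across all custodians;
           "custodian" keeps one copy per custodian.
    """
    if scope not in ("global", "custodian"):
        raise ValueError("scope must be 'global' or 'custodian'")

    seen = set()
    kept = []
    for custodian, name, content in documents:
        digest = hashlib.md5(content).hexdigest()
        # The de-duplication key decides whether copies held by
        # different custodians count as duplicates of each other.
        key = digest if scope == "global" else (custodian, digest)
        if key in seen:
            continue
        seen.add(key)
        kept.append((custodian, name))
    return kept


sample = [
    ("alice", "report.docx", b"quarterly report"),
    ("alice", "report copy.docx", b"quarterly report"),
    ("bob", "report.docx", b"quarterly report"),
    ("bob", "notes.txt", b"meeting notes"),
]

print(deduplicate(sample, scope="global"))     # 2 documents kept
print(deduplicate(sample, scope="custodian"))  # 3 documents kept
```

With the global scope there is less to review, but the output no longer shows that Bob also held a copy of the report. The custodian-level scope preserves that information at the cost of a larger review set.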
Date/time information extracted from e-mails and electronic documents is a major aspect of electronic evidence. In order to interpret and display the extracted timestamps correctly, most digital forensics and e-Discovery software requires the end user to specify a time zone. The selected time zone can have numerous effects, such as changing the appearance of timestamps on printed e-mails or determining whether certain documents fall within the relevant time frame during culling. Especially in cases that involve multiple time zones, it is critical to determine how time zones should be handled in order to avoid potential problems down the road.
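The effect on date culling can be shown with a small sketch. It assumes timestamps are stored in UTC and uses fixed UTC offsets (no daylight saving handling) purely for illustration; the function names and the sample message are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

# Fixed offsets used for illustration only (assumption: no daylight saving rules).
UTC = timezone.utc
NEW_YORK_STANDARD = timezone(timedelta(hours=-5))
TOKYO = timezone(timedelta(hours=9))


def local_date(sent_utc: datetime, processing_tz: timezone) -> date:
    """Calendar date of a UTC timestamp as displayed in the processing time zone."""
    return sent_utc.astimezone(processing_tz).date()


def in_date_range(sent_utc: datetime, processing_tz: timezone,
                  start: date, end: date) -> bool:
    """True if the message falls within [start, end] in the processing time zone."""
    return start <= local_date(sent_utc, processing_tz) <= end


# Hypothetical message sent at 02:30 UTC on 1 January 2015.
sent = datetime(2015, 1, 1, 2, 30, tzinfo=UTC)
start, end = date(2015, 1, 1), date(2015, 1, 31)

print(local_date(sent, NEW_YORK_STANDARD))                 # 2014-12-31
print(local_date(sent, TOKYO))                             # 2015-01-01
print(in_date_range(sent, NEW_YORK_STANDARD, start, end))  # False
print(in_date_range(sent, TOKYO, start, end))              # True
```

The same message is culled out when processed with a New York offset and kept when processed with a Tokyo offset, which is why the time zone decision should be made before culling begins.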
Robocopy is a great tool for copying files, but it does not offer an option to hash the source and destination files. While this may not be necessary for casual personal use, being able to confirm that the output files are identical to the source files using cryptographic hashes is crucial when working with electronic evidence.
There are commercial off-the-shelf file copy tools which have this functionality built-in, but they usually lack the flexibility that Robocopy offers. If you are a Robocopy fan, and do not mind a little bit of command line work, follow along and we will show you how to validate Robocopy results using the freely available software package md5deep.
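For readers who would like to see the underlying idea before reaching for md5deep, here is a rough Python sketch of the same check: hash every file under the source and destination folders and compare the results. This is not the md5deep workflow itself, only an illustration of the technique using Python's standard `hashlib` module; the function names are our own.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the MD5 hash of a file, read in chunks to limit memory use."""
    md5 = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def hash_tree(root: Path) -> dict:
    """Map each file's path (relative to root) to its MD5 hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(source: Path, destination: Path) -> dict:
    """Report files that are missing, unexpected or different in the destination."""
    src = hash_tree(source)
    dst = hash_tree(destination)
    return {
        "missing": sorted(set(src) - set(dst)),
        "extra": sorted(set(dst) - set(src)),
        "mismatched": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Calling `compare_trees` on the source and destination folders after a copy should return three empty lists if every file arrived intact. Any entry under "missing" or "mismatched" means the copy should not be relied upon as is.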
We believe that discussing project specifications at the outset of a project and getting clear and complete instructions is the first step in completing an e-Discovery project successfully. One of the questions we regularly ask is whether or not embedded objects should be extracted. Over the years, we have found that most of our new clients require an explanation of what embedded objects are and the pros and cons of extracting them.
We typically recommend extracting all compound documents. However, we feel it is important to understand clearly what this really means and to make an informed decision based on case requirements. We have come up with a few points to consider that will hopefully help you determine which route to take.