In a perfect world, e-Discovery would be as simple as pointing your software at the data source, kicking back and waiting for all documents to be ingested and processed with 100% accuracy. In the real world, however, e-Discovery involves thousands of file types, some of which are too complex for even the most sophisticated e-Discovery platforms to handle automatically. Consequently, defensible e-Discovery requires close supervision by experienced e-Discovery experts and a well-thought-out exception handling policy.
What are Exceptions?
e-Discovery exceptions are documents that cannot be correctly processed by the e-Discovery platform. For example, the e-Discovery software may be unable to open, extract text or metadata from, or image a document. Exceptions can be encountered during various stages of the e-Discovery process such as collection, processing, review and production. For the purposes of this article, we will focus on exceptions in the processing stage of e-Discovery.
Types of Exceptions
A. Corrupt Files
Corrupt files are files with structural problems that prevent them from being opened or manipulated, even in their native application. File corruption can be caused by numerous factors, such as network transmission errors, errors in the medium where the files were stored (e.g. bad sectors on a hard drive) or unexpected termination of the software that was being used to edit the file (e.g. due to a power failure).
When handling corrupt file exceptions, the first course of action is usually to investigate whether a replacement copy can be obtained. If a replacement is not available, attempting to repair the file (e.g. recovering a corrupt mailbox) may be a viable option, depending on the nature of the case and how critical the corrupt file is. Alternatively, the corrupt file can be excluded from processing and delivered in native format. In any case, the exception should be logged and all steps taken should be thoroughly documented.
B. Unprocessable Files
Unprocessable files are files that do not support common e-Discovery actions such as text and metadata extraction or conversion to image format. For example, system files such as executables and dynamic link libraries are typically unprocessable.
C. Unsupported Files (Processable Files That Are Not Supported by the e-Discovery Platform)
No e-Discovery software supports all processable file types. An e-Discovery project may contain unsupported files that could be searched or converted to an image format using external software. Examples include advanced CAD/CAM files such as Unigraphics, less popular archive container formats such as ARJ and ACE, some database formats such as SQLite, project management files such as Primavera files, and BlackBerry back-up files.
Depending on the type and number of unsupported files, they may be processed manually, or the e-Discovery service provider can develop a custom solution to handle them in an automated manner. Some complex file types, such as technical drawings and databases, can be processed and produced in native format if the discovery agreement allows.
D. Audio & Video Files
Audio and video files cannot be directly processed by most e-Discovery software, but depending on the case, they can be a very good source of electronic evidence. For example, voicemail messages sent as e-mail attachments by the VoIP phone system in a corporation may contain information not found anywhere else in the organization.
In some cases, the legal team may choose to perform audio discovery on the audio and video files to make them searchable. Alternatively, the files can be delivered in native format so that the reviewers can listen to the audio recordings or watch the videos during review.
E. Encrypted Files
Encrypted files are files that are protected by a password, digital rights management (DRM) or another encryption scheme. Encrypted files can be single documents such as MS Office files or PDFs, or encrypted containers such as TrueCrypt volumes.
The legal team should be informed of any encrypted files before further action is taken. Depending on the case, attorneys may choose to exclude the encrypted files from processing due to privacy concerns. On the other hand, in some cases, encrypted files may be of particular interest.
Attorneys may occasionally be able to obtain the passwords for the encrypted files in the data set. If passwords are not available, they can often be discovered by strategically reviewing neighboring documents or by attempting to crack the passwords.
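When the legal team does supply candidate passwords, trying each of them against an encrypted file is straightforward to automate. The sketch below is a minimal, format-agnostic illustration rather than a description of any particular product. Both try_known_passwords and the opener callback are hypothetical names; the callback stands in for whatever format-specific routine actually attempts the decryption.

```python
from typing import Callable, Iterable, Optional


def try_known_passwords(
    candidates: Iterable[str],
    opener: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that opens the file, or None if none do.

    `opener` is a hypothetical, format-specific callback that returns True
    when the supplied password successfully decrypts the file.
    """
    seen = set()
    for password in candidates:
        if password in seen:  # skip duplicates in the supplied list
            continue
        seen.add(password)
        if opener(password):
            return password
    return None


if __name__ == "__main__":
    # Stand-in opener for demonstration only; a real one would call the
    # decryption routine for the file format in question.
    def demo_opener(password: str) -> bool:
        return password == "Summer2012!"

    known = ["password", "Summer2012!", "password"]
    print(try_known_passwords(known, demo_opener))
```

Returning the matching password, rather than just a success flag, lets the technician record which password opened which file in the exception log.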
Exception Handling: How Should Exceptions be Tracked, Handled and Reported?
A. e-Discovery Software
Well-designed e-Discovery software should provide the following mechanisms for exception tracking, handling and reporting:
- All encountered exceptions should be logged and conspicuously displayed to the technician. The log files should contain detailed information about each exception, such as the full file path, file name, hash value and a description of the exception.
- The e-Discovery technician should be able to manually process documents that cannot be automatically processed. It should be possible to batch process large numbers of unsupported files using the native application (e.g. Shell Print).
- The e-Discovery software should be able to complete the processing of exceptions after the fact. For example, if the password of an encrypted container is discovered after the processing job was completed, or a replacement for a corrupt mailbox is obtained, the e-Discovery software should be able to add the newly extracted documents to the data set where they belong, without having to re-process the documents that were already processed.
- The e-Discovery software should have built-in mechanisms that facilitate the tracking and review of exceptions during quality control. The technician should be able to add comments for each file. This information can then be exported as part of the exceptions report.
- The e-Discovery software should have built-in password cracking functionality and should allow the technician to input a list of known passwords to be used for opening and processing encrypted files.
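To make the logging requirement in the list above more concrete, here is a minimal sketch of an exception log entry carrying the fields mentioned there: full file path, file name, hash value and a description, plus room for the technician's comments. The names ExceptionRecord, hash_file and log_exception are hypothetical, and the use of SHA-256 is an assumption, since the hash algorithm is normally dictated by the platform or the protocol agreed for the case.

```python
import hashlib
import logging
import os
from dataclasses import asdict, dataclass

logger = logging.getLogger("processing.exceptions")


@dataclass
class ExceptionRecord:
    file_path: str       # full path to the file
    file_name: str
    hash_value: str      # SHA-256 here; the algorithm choice is an assumption
    description: str     # why the file is an exception
    comments: str = ""   # technician's notes, added during quality control


def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so that large files do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_exception(path: str, description: str) -> ExceptionRecord:
    """Build a record for the file, write it to the log and return it."""
    full_path = os.path.abspath(path)
    record = ExceptionRecord(
        file_path=full_path,
        file_name=os.path.basename(full_path),
        hash_value=hash_file(full_path),
        description=description,
    )
    logger.warning("Processing exception: %s", asdict(record))
    return record
```

Because the hash is computed from the raw bytes, it can be recorded even for files that cannot be opened in their native application, which makes it possible to match a later replacement or repaired copy against the original entry.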
B. The e-Discovery Deliverable
In our experience, most legal teams prefer to leave processing exceptions in the review database as placeholders with native hyperlinks. The placeholder usually contains a few fields of metadata for the exception, such as the file name, file path and hash value. This lets the legal team see the exceptions in context and look at them in more detail if they wish. If the exceptions are left in the database as placeholders, those records should be identified (e.g. with a tag in the database or by populating a field) so that they can easily be excluded from a subsequent production if required.
On the other hand, if the legal team would like to have the exceptions excluded from the review database, the exceptions should be exported separately and delivered in native format (e.g. in a folder called “EXCEPTIONS”) along with the exceptions report as part of the deliverable. This would ensure that someone looking at the deliverable in the future can easily locate and review the exceptions associated with that deliverable.
C. The Exceptions Report
The exceptions report is an important part of the documentation of the ESI lifecycle. At a minimum, an exceptions report should contain the following elements:
- Relevant information about the case and project so that, if the report is reviewed separately from the deliverable, the viewer can still identify which project/batch the report pertains to.
- Identifying information (e.g. BegDoc #, internal Doc ID) about each exception so that the entries on the exceptions report can be linked to the placeholders in the database or the native files in the “EXCEPTIONS” folder.
- The technician’s comments and a description of why the file is an exception.
- Crucial file metadata such as file name, file path, file size, file extension and hash value.
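The elements listed above map naturally onto a flat, one-row-per-exception report. The following sketch writes such a report as CSV using Python's standard csv module. The column names, the function write_exceptions_report and the sample values are all illustrative assumptions; an actual report would follow whatever layout the legal team and service provider have agreed on.

```python
import csv
import io

# Hypothetical column layout covering the minimum elements of the report.
REPORT_FIELDS = [
    "case_name", "batch",                      # identifies the project/batch
    "doc_id",                                  # links to placeholder or native
    "file_name", "file_path", "file_size",
    "file_extension", "hash_value",            # crucial file metadata
    "exception_description", "technician_comments",
]


def write_exceptions_report(rows, case_name, batch, stream):
    """Write one CSV line per exception, repeating the case and batch on
    every line so the report can be understood apart from the deliverable."""
    writer = csv.DictWriter(stream, fieldnames=REPORT_FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({"case_name": case_name, "batch": batch, **row})


if __name__ == "__main__":
    # Illustrative values only; none of these come from a real matter.
    sample_rows = [{
        "doc_id": "EXC000001",
        "file_name": "drawing.prt",
        "file_path": "Custodian01/Projects/drawing.prt",
        "file_size": 48213,
        "file_extension": "prt",
        "hash_value": "0" * 32,
        "exception_description": "Unsupported file type",
        "technician_comments": "Delivered in native format",
    }]
    output = io.StringIO()
    write_exceptions_report(sample_rows, "Sample Matter", "Batch 001", output)
    print(output.getvalue())
```

Repeating the case and batch on every row is a deliberate design choice: it keeps each line self-describing if rows from several reports are later merged or filtered.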
Processing exceptions are unavoidable in e-Discovery. We believe that accurately identifying exceptions, thoroughly documenting every action taken, and having a well-thought-out exception handling plan are essential components of an effective and defensible e-Discovery process. Legal teams should discuss exception handling policies not only with their in-house litigation support team and outside service providers, but also with opposing counsel during the “Meet and Confer”.