8 Common Misconceptions about Native File Productions

By Arman Gungor, March 28, 2014

Native file productions are gaining traction in e-Discovery, and rightfully so. However, what native format is, along with its benefits and drawbacks, is commonly misunderstood, and these misunderstandings occasionally rise to the level of e-Discovery disputes. Here are some of the misconceptions about native file productions that we encounter frequently:

Misconception 1: “Producing Documents in Electronic Format Results in a Native File Production”

Native format is the file structure of an electronic document as defined by the application that created it. For example, if a spreadsheet was created using Microsoft Excel, then the native format of the spreadsheet is its original Microsoft Excel format. If the producing party takes native documents and converts them to another electronic format, such as searchable PDFs, RTF files or MHT files, the resulting production is not in native format.
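One quick way to spot-check whether a production delivered converted files rather than natives is to look at the leading "magic" bytes of each file. The sketch below is an illustration only: the function names and labels are hypothetical, and the signature list covers just a handful of well-known formats. A signature identifies the container format, not the creating application, so a match is a hint and not proof of native format.

```python
# Well-known leading byte signatures for a few common formats.
# The labels are hypothetical names chosen for this sketch.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole-compound"),  # legacy .doc/.xls/.ppt, .msg
    (b"PK\x03\x04", "zip-container"),                       # .docx/.xlsx/.pptx or a plain .zip
    (b"{\\rtf", "rtf"),
]


def sniff_format(data: bytes) -> str:
    """Return a label for the first signature that matches, else 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))
```

If a spreadsheet that was requested in native format comes back as "pdf", the production has been converted somewhere along the way.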

Misconception 2: “The Native Format for Exchange E-mails Is MSG Files”

Microsoft Exchange Server stores e-mails in a proprietary database format. In Exchange 2000 and 2003, this comprises “.edb” (MAPI-based database) and “.stm” (streaming database) files; later versions dropped the “.stm” file and keep all content in the “.edb” database. These binary files are created and maintained by the Exchange Server software and are the native format for Exchange e-mails.

This does not mean that Exchange e-mails should never be produced as MSG (or PST) files. In many cases, producing Exchange Server data in its native form may not be feasible. For example, the Exchange database may contain data for custodians who are outside the scope of the case, or data from a time frame that is not relevant. Additionally, the recipient of the production may not have the tools to work directly with the Exchange Store Repository and may request a different format in lieu of a native production. In such scenarios, PST files (typically one PST per mailbox) or MSG files are usually considered acceptable near-native alternatives.

Misconception 3: “Near-Native Means Electronic Format That Looks Like the Native File”

The key attribute of the near-native format of a document is that it contains most of the meaningful metadata found in the original native file. For example, PST files are usually considered near-native format for Exchange e-mails since they can be exported directly from the Exchange environment and contain most of the meaningful metadata found in the Exchange database.

On the other hand, if Exchange e-mails were converted to searchable PDFs or MHT files, they would no longer be in native or near-native format. Even though the resulting PDF or MHT files would be searchable electronic files that look like printouts of the e-mails, they would be missing the hundreds of metadata fields that existed in the original Exchange environment.
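The metadata loss can be illustrated on a small scale. The sketch below is a simplification under stated assumptions: it uses a made-up plain internet message as a stand-in for an Exchange item (Exchange actually stores MAPI properties, which are far more numerous), and it assumes a printout-style rendering keeps only the From, To, Subject and Date fields. The names RAW_MESSAGE, PRINTED_FIELDS and fields_lost_in_printout are hypothetical.

```python
from email import message_from_string

# A made-up message used purely for illustration.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Subject: Q3 forecast
Date: Fri, 28 Mar 2014 09:15:00 -0400
Message-ID: <abc123@example.com>
In-Reply-To: <xyz789@example.com>
X-Priority: 1
Received: from mail.example.com by mx.example.com; Fri, 28 Mar 2014 09:15:02 -0400

See attached.
"""

# Assumption: the only fields a printout-style rendering displays.
PRINTED_FIELDS = {"From", "To", "Subject", "Date"}


def fields_lost_in_printout(raw: str) -> list:
    """List header names present in the message but absent from the printout."""
    message = message_from_string(raw)
    return sorted({name for name in message.keys() if name not in PRINTED_FIELDS})


print(fields_lost_in_printout(RAW_MESSAGE))
```

Even in this tiny example, the threading and routing fields do not survive the conversion, and those are often the fields that matter in a dispute over who received what and when.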

Misconception 4: “Native File Productions Eliminate e-Discovery Processing Expenses”

In most cases, native file productions require e-Discovery processing on the producing side, the receiving side, or both. For example, pre-production review of a custodian’s mailbox would typically require the producing party to extract all the e-mails from the mailbox, extract e-mail attachments, extract embedded objects and perform optical character recognition (OCR) on documents without extractable text, so that the documents become searchable and can be loaded into a legal review platform. Similarly, if the recipient receives a PST file containing responsive e-mails from the custodian’s local computer, they would have to perform similar processing steps in order to review the documents.

That said, it is true that native-only e-Discovery processing is generally less cumbersome than TIFF or PDF conversion, and is typically more cost-effective.

Misconception 5: “Native Review Should Take Less Time than TIFF Review”

Not necessarily true. Making sure that the review team has fully reviewed the material being produced in native format can sometimes be a challenge. For example, privileged content found inside “very hidden” worksheets not visible to the end users, or numerous relevant documents found inside exotic containers not recognized by the e-Discovery processing software would require very thorough native review.
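As an illustration of how such content can be surfaced before review, the sketch below lists the hidden and “very hidden” worksheets in an Excel workbook by reading the workbook part directly. It rests on several assumptions: the file is a ZIP-based workbook (.xlsx or .xlsm, not legacy .xls), the workbook part lives at the conventional xl/workbook.xml path, and the file uses the transitional Office Open XML namespace. The function name hidden_sheets is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional SpreadsheetML namespace (an assumption; strict files differ).
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def hidden_sheets(xlsx) -> dict:
    """Map sheet name -> 'hidden' or 'veryHidden' for every non-visible sheet.

    `xlsx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    result = {}
    for sheet in root.iter(MAIN_NS + "sheet"):
        # A missing state attribute means the sheet is visible.
        state = sheet.get("state", "visible")
        if state != "visible":
            result[sheet.get("name")] = state
    return result
```

A non-empty result does not mean the sheets are privileged or relevant; it only flags workbooks where a reviewer looking at the default view would not have seen everything.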

TIFF productions typically have the advantage that the review team can review the extracted text, metadata and rendered images, which together paint a very clear picture of exactly what is about to go out the door. However, deficiencies in the processing/review workflow in a TIFF production can also have devastating effects. For example, the exotic file containers mentioned above would typically be treated as processing exceptions by the e-Discovery software. If not thoroughly examined, these files, and the numerous relevant files they contain, could either be produced to the opposition in native format without being reviewed, or excluded from the production, depending on the discovery agreement.

Misconception 6: “Native Files Cannot Be Redacted”

Native files can be redacted, but redacting in native format can be quite a challenge. For example, redacting a native Excel spreadsheet with multiple worksheets, formulas, macros and pivot tables can get very complex, as each change can trigger a number of unforeseen additional changes in other parts of the spreadsheet. Redacting the native file also, by definition, changes the file's data and metadata. Therefore, the redacted version of the file has to be tracked alongside the clean version of the native file. A commonly used compromise is to produce redacted documents in TIFF or PDF format while the rest of the documents are produced in native format.

Misconception 7: “Legal Review Tools Always Display Electronic Files in Their True Native Format”

Many review tools utilize viewer components that can display numerous electronic file types without TIFF conversion (e.g. Oracle’s Outside In Viewer). However, when the native file is displayed using such a viewer, the end user is no longer reviewing the document in its native format. The viewer component presents a rendering of the native file, which is the result of a conversion process. That said, many legal review platforms give the user the option to either download or display (usually skinned inside the application window) the native files in their original format when desired.

Misconception 8: “Native Files Cannot Be Tracked Because They Do Not Have Bates Endorsements”

It is true that native files cannot be directly Bates stamped. However, this is usually not a problem, since the designations and control numbers of native files can be tracked by other means (e.g. by incorporating them into the file names, or by assigning a reference number and keeping track of it in a database). Even if the legal team wishes to print the native files, the printed material can be tracked by inserting reference information in the header or footer areas of the printed pages, similar to a Bates endorsement.
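The file-name approach can be sketched as follows. This is a minimal illustration, not a description of any particular tool: the function names, the "ABC" prefix, the seven-digit padding and the log columns are all hypothetical choices. Recording a hash next to each control number is a common way to show later that a produced native file has not changed; SHA-256 is used here as one reasonable option.

```python
import csv
import hashlib
import shutil
from pathlib import Path

LOG_FIELDS = ["control_number", "original_name", "sha256"]


def stage_native_production(sources, out_dir, prefix="ABC", start=1, width=7):
    """Copy each native file to out_dir under a control-number file name.

    Returns one log row per file: control number, original name, SHA-256.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for offset, source in enumerate(sources):
        source = Path(source)
        control = f"{prefix}{start + offset:0{width}d}"
        digest = hashlib.sha256(source.read_bytes()).hexdigest()
        # Keep the original extension so the file still opens in its application.
        shutil.copy2(source, out_dir / f"{control}{source.suffix}")
        rows.append(
            {"control_number": control, "original_name": source.name, "sha256": digest}
        )
    return rows


def write_log(rows, log_path):
    """Write the tracking rows to a CSV file that can accompany the production."""
    with open(log_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

The originals are copied rather than renamed in place, so the source files stay untouched and the log ties each control number back to the original file name.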

Conclusion

There are numerous arguments for and against native file productions. In my opinion, the biggest benefit of a native file production is that electronic evidence can be produced without degradation, and this benefit outweighs the drawbacks. Nevertheless, it is important to fully understand the pros and cons of native file productions, and how they would fit into your workflow on a case-by-case basis, as the chosen production format can have significant effects on what information is shared, how quickly e-Discovery can be completed and what the e-Discovery budget will be.

Arman Gungor

About Arman Gungor

Arman Gungor is a certified computer forensic examiner (CCE) and an adept e-Discovery expert with over 21 years of computer and technology experience. Arman has been appointed by courts as a neutral computer forensics expert as well as a neutral e-Discovery consultant. His electrical engineering background gives him a deep understanding of how computer systems are designed and how they work.