Proksimiti Near De-Duplication Engine
Identifying near-duplicate documents has become a key strategy for streamlining large-scale document review. Legal teams frequently encounter numerous similar documents that cannot be eliminated through exact de-duplication because they are not identical. For instance, multiple revisions of the same document, documents converted to PDF, and printed and scanned copies of electronic documents can all exhibit high levels of similarity. When reviewed independently, these documents not only consume valuable resources but also introduce unwanted inconsistencies into the review.
Meridian Discovery has developed a proprietary near de-duplication technology called PROKSIMITI that identifies and groups similar documents quickly and cost-effectively, saving our clients time and money. PROKSIMITI analyzes every document and creates a fingerprint for each one. These fingerprints are then compared, and documents whose mutual similarity exceeds a user-defined similarity threshold are grouped together into document clusters. Each cluster contains a representative document that can be used as the starting point during review.
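The fingerprint, compare, and cluster steps described above can be illustrated with a minimal sketch. PROKSIMITI's actual algorithm is proprietary and is not described here, so everything in this example is an assumption made for illustration: the fingerprint is a set of word shingles, similarity is the Jaccard index, and clustering is a simple greedy pass in which the first document of each cluster serves as its representative. The names `fingerprint`, `jaccard`, and `cluster` are hypothetical.

```python
import re


def fingerprint(text, k=3):
    """Hypothetical fingerprint: the set of k-word shingles in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short documents get a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets, in the range 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.5, k=3):
    """Greedy clustering (an assumption, not the product's method).

    Each document joins the first cluster whose representative it
    resembles by more than `threshold`; otherwise it starts a new
    cluster and becomes that cluster's representative.
    Returns a dict mapping representative id -> list of member ids.
    """
    clusters = []  # entries are (representative id, its fingerprint, member ids)
    for doc_id, text in docs.items():
        fp = fingerprint(text, k)
        for rep_id, rep_fp, members in clusters:
            if jaccard(fp, rep_fp) > threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((doc_id, fp, [doc_id]))
    return {rep_id: members for rep_id, _, members in clusters}
```

In this sketch, raising `threshold` produces smaller, tighter clusters, and lowering it groups more loosely related documents together. A production system would also need a more scalable comparison strategy than checking each document against every representative.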

BENEFITS
Time and Cost Savings
Reviewing similar documents in groups allows legal teams to focus only on the differences between the documents within a cluster. This eliminates the need to review the same information repeatedly and reduces review time dramatically. In certain instances, large numbers of similar documents can be handled in bulk and flagged together.
Consistent Document Treatment
When near-duplicate documents are reviewed independently by different reviewers, legal teams run the risk of inadvertently flagging different versions of the same document inconsistently. For example, nine copies of a privileged document may be flagged correctly while a tenth copy, reviewed by a different reviewer, is flagged for production. Near de-duplication allows an entire cluster of near-duplicate documents to be assigned to the same reviewer and reviewed consistently.
Reduced Risk
Working with large data sets makes it more difficult to locate and focus on documents that are most relevant to your case. Logically organizing documents into similarity groups helps you manage your case more efficiently and reduces your risk of overlooking critical information.
TECHNOLOGY HIGHLIGHTS
Performance and Scalability
In addition to being offered as a service, PROKSIMITI can be deployed as a self-contained desktop application or as a client-server solution with a database server back end, which provides enterprise levels of scalability and performance. Our enterprise solution uses distributed processing to help organizations process large amounts of data quickly and cost-effectively, and it offers flexible network licensing models.
Language and Format Independence
Our near de-duplication technology accurately identifies near-duplicate documents regardless of text or paragraph formatting (font variations or white space), document type (for example, a Microsoft Excel document vs. a Lotus 1-2-3 document) or language. Additionally, we give end users control over how their data is processed through a set of customizable parameters while maintaining ease of use.
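One common way to make text comparison insensitive to formatting is to normalize the extracted text before fingerprinting it. The sketch below illustrates that general idea only; it is not PROKSIMITI's implementation. It assumes the text has already been extracted from the source file (which is what discards fonts and file-format differences), and the function name `normalize_text` is hypothetical.

```python
import unicodedata


def normalize_text(text):
    """Reduce extracted text to a canonical form before comparison.

    Illustrative assumption, not the product's method:
    - NFKC folds compatibility characters (e.g. full-width letters,
      non-breaking spaces) into their standard equivalents.
    - casefold() removes case differences in a Unicode-aware way.
    - split()/join collapses any run of white space to one space.
    """
    text = unicodedata.normalize("NFKC", text)
    text = text.casefold()
    return " ".join(text.split())
```

With this kind of normalization, "Annual   Report" followed by a line break and "2009" compares equal to "annual report 2009", so white-space and capitalization differences alone do not prevent two documents from matching.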
Integration Options
Meridian Discovery offers several integration paths for organizations that want to integrate PROKSIMITI into their existing applications or workflows. Our technology can be used as a service, as a GUI application or as a software development kit.
Please follow the link to download our Near De-Duplication Brochure.
