<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Meridian Discovery &#124; e-Discovery, Computer Forensics, Hosting</title>
	<atom:link href="http://www.meridiandiscovery.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.meridiandiscovery.com</link>
	<description>e-Discovery, Computer Forensics, Hosting</description>
	<lastBuildDate>Fri, 18 May 2012 22:50:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Why We See Strange Exchange E-mail Addresses in e-Discovery</title>
		<link>http://www.meridiandiscovery.com/articles/why-we-see-strange-exchange-e-mail-addresses-in-e-discovery/</link>
		<comments>http://www.meridiandiscovery.com/articles/why-we-see-strange-exchange-e-mail-addresses-in-e-discovery/#comments</comments>
		<pubDate>Fri, 18 May 2012 21:41:47 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[e-Discovery]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=2068</guid>
		<description><![CDATA[If you are involved in the production or review of electronic evidence, you might have seen e-mail addresses that look a bit different than usual. For example: /O=EXAMPLE/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=USERNAME Have you ever wondered what these values are? The two scenarios we run into most frequently are as follows: 1) LegacyExchangeDN &#038; X.500 Addresses ...]]></description>
			<content:encoded><![CDATA[<p>If you are involved in the production or review of electronic evidence, you might have seen e-mail addresses that look a bit different than usual. For example:<br />
<font style="font-size:12px;"><br />
/O=EXAMPLE/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=USERNAME<br />
</font></p>
<p>Have you ever wondered what these values are? The two scenarios we run into most frequently are as follows:</p>
<h3>1) LegacyExchangeDN &#038; X.500 Addresses</h3>
<p>In a Microsoft Exchange organization, internal e-mails are routed using X.500 addresses instead of SMTP addresses. The X.500 address of each mailbox is stored in the legacyExchangeDN attribute in Active Directory, which is set when a mailbox is created and includes the name of the Exchange Administrative Group to which the mailbox belongs. LegacyExchangeDN values typically look as follows:</p>
<p><font style="font-size:12px;"></p>
<code class="code">/O=EXAMPLE/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=USER</code>
<p></font>/O: Organization Name<br />
/OU: Organizational Unit<br />
/CN: Common Name</p>
<p>Starting with Exchange 2007, user-defined Exchange Administrative Groups were replaced by a single administrative group called &#8220;Exchange Administrative Group (FYDIBOHF23SPDLT)&#8221;. The value &#8220;FYDIBOHF23SPDLT&#8221; is actually an encoded version of the string &#8220;EXCHANGE12ROCKS&#8221;, with each letter and digit replaced by the one that follows it (E->F, X->Y, 1->2 etc.).</p>
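<p>The encoding is easy to verify. The following is a minimal Python sketch that reverses the one-step shift; the function name <code>shift_back</code> is our own, and the sketch does not handle wrap-around (e.g. A or 0) because the string in question does not need it.</p>

```python
def shift_back(text: str) -> str:
    """Replace each ASCII letter or digit with the character that precedes it."""
    decoded = []
    for ch in text:
        if ch.isascii() and ch.isalnum():
            # No wrap-around handling: "A" or "0" would fall outside the
            # alphabet, which is fine for the string examined here.
            decoded.append(chr(ord(ch) - 1))
        else:
            decoded.append(ch)
    return "".join(decoded)


print(shift_back("FYDIBOHF23SPDLT"))  # EXCHANGE12ROCKS
```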
<p>When an e-mail that was sent within the Exchange organization is taken outside (e.g. for e-Discovery processing or digital forensic analysis), the SMTP e-mail address for the user (e.g. abc@cde.com) can no longer be resolved and the only available address is the legacyExchangeDN value. In this scenario, the e-Discovery processing output may look similar to the example image in Figure 1.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/05/Email_w_X500_sm.png"><img width="220" height="150" alt="E-mail with X.500 Addresses" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Email_w_X500_sm-220x150.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 1 &#8211; E-mail with X.500 Addresses</p>
<h3>2) IMCEA Encapsulation</h3>
<p>Another common scenario is IMCEA encapsulated addresses. The sender and recipient addresses for each message are looked up in the Global Address List (GAL) before the message is sent. If the SMTP addresses cannot be resolved (e.g. they are hidden from the GAL), the look-up fails and Exchange is forced to encapsulate the only address available (the X.500 directory name) using Internet Mail Connector Encapsulated Addressing (IMCEA). Addresses encapsulated in this manner would look as follows:<br />
<font style="font-size:12px;"></p>
<code class="code">IMCEAEX-_O=EXAMPLE_OU=EXCHANGE+20ADMINISTRATIVE+20GROUP_CN=RECIPIENTS_CN=CUASUENA@domain.com</code>
<p></font><br />
The string &#8220;EX&#8221; at the end of &#8220;IMCEAEX&#8221; indicates that the encapsulated non-SMTP address was an Exchange address. The encapsulation process replaces each forward slash &#8220;/&#8221; with an underscore &#8220;_&#8221; and each symbol with a plus sign &#8220;+&#8221; followed by its two-digit hexadecimal ASCII code (e.g. +20 for the space character).</p>
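<p>The substitution rules above can be reversed to recover the original X.500 address. The following Python sketch is one way to do this under the assumptions stated in the comments; <code>decode_imceaex</code> is a hypothetical helper of our own, not part of any Exchange tool.</p>

```python
import re


def decode_imceaex(address: str) -> str:
    """Recover the X.500 address from an IMCEAEX-encapsulated address."""
    prefix = "IMCEAEX-"
    if not address.upper().startswith(prefix):
        raise ValueError("not an IMCEAEX-encapsulated address")
    # Drop the prefix and the @domain that was added for encapsulation.
    local_part = address[len(prefix):].rsplit("@", 1)[0]
    # Underscores stand for forward slashes. Assumption: a literal
    # underscore in the original address arrives encoded as +5F.
    with_slashes = local_part.replace("_", "/")
    # Each +XX is a two-digit hexadecimal ASCII code (e.g. +20 is a space).
    return re.sub(
        r"\+([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        with_slashes,
    )


sample = ("IMCEAEX-_O=EXAMPLE_OU=EXCHANGE+20ADMINISTRATIVE+20GROUP"
          "_CN=RECIPIENTS_CN=CUASUENA@domain.com")
print(decode_imceaex(sample))
# /O=EXAMPLE/OU=EXCHANGE ADMINISTRATIVE GROUP/CN=RECIPIENTS/CN=CUASUENA
```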
<p>In this scenario, the e-Discovery processing output may look similar to the example image in Figure 2. Please note that &#8220;domain.com&#8221; represents the SMTP domain that is used to encapsulate the non-SMTP address, which may not be the same domain where the original address belonged. </p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/05/IMCEA_Email_sm.png"><img width="292" height="190" alt="E-mail with IMCEA Encapsulated Addresses" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/IMCEA_Email_sm-292x190.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 2 &#8211; E-mail with IMCEA Encapsulated Addresses</p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/articles/why-we-see-strange-exchange-e-mail-addresses-in-e-discovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transferring Electronic Evidence in File Containers</title>
		<link>http://www.meridiandiscovery.com/how-to/transferring-electronic-evidence-in-file-containers/</link>
		<comments>http://www.meridiandiscovery.com/how-to/transferring-electronic-evidence-in-file-containers/#comments</comments>
		<pubDate>Mon, 07 May 2012 21:18:31 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[How-to]]></category>
		<category><![CDATA[e-Discovery]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1923</guid>
		<description><![CDATA[A vast amount of electronic evidence is being transmitted every day via electronic file transfers among corporations, law firms and e-Discovery service providers. Most of these transfers involve compressing the evidence into a file archive (ZIP, RAR, 7z etc.) and transferring the resultant archive(s) over the internet. While this is usually a straightforward process, it is ...]]></description>
			<content:encoded><![CDATA[<p>A vast amount of electronic evidence is being transmitted every day via electronic file transfers among corporations, law firms and e-Discovery service providers. Most of these transfers involve compressing the evidence into a file archive (ZIP, RAR, 7z etc.) and transferring the resultant archive(s) over the internet. While this is usually a straightforward process, it is critical to make the right decisions and use the right tools to avoid trouble down the road.</p>
<h3>Preservation of Metadata</h3>
<p>One of the most common issues associated with compressing electronic evidence has to do with the preservation of metadata. Unless proper care is taken, compressing electronic files can result in loss of valuable file system metadata.</p>
<p>In our experience, the most common file compression tools used in e-Discovery are WinZip, WinRAR and 7-Zip. Surprisingly, some of these applications do not capture and restore file system timestamps by default. The following table summarizes which metadata timestamps can be preserved with each tool:</p>
<div class="table_style">
<table>
<thead>
<tr>
<th></th>
<th><b>WinRAR 4.11</b></th>
<th><b>WinZip 15.5</b></th>
<th><b>7-Zip 9.20</b></th>
</tr>
</thead>
<tr>
<td>Creation Date</td>
<td>Default: No<br/>Yes (Optional)</td>
<td>Always</td>
<td>Never</td>
</tr>
<tr>
<td>Last Modification Date</td>
<td>Always</td>
<td>Always</td>
<td>Always</td>
</tr>
<tr>
<td>Last Accessed Date</td>
<td>Default: No<br/>Yes (Optional)*</td>
<td>Always*</td>
<td>Never</td>
</tr>
<tfoot>
<tr>
<td colspan="4"><em style="font-size:11px;">* When the option is available, the stored last accessed date is the date/time when the files were accessed while creating the file archive. When not available, the last accessed date is set to the date/time the files were extracted.</em></td>
</tr>
</tfoot>
</table>
</div>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Table 1 &#8211; Date Metadata Preserved by File Archive Software</p>
<p>Based on the table above, only WinZip and WinRAR support preserving file creation dates. WinRAR captures file creation timestamps only after the option is selected while WinZip captures them by default. Consequently, if files were compressed using 7-Zip, or using WinRAR without the correct date options, their file system creation timestamps would be stripped off. This means that even if the electronic evidence was collected properly and file system metadata was preserved, the compression process can prevent this information from being transmitted to the recipient.</p>
<p>The relevant options during compression and extraction in WinRAR are as follows:</p>
<div class="one_half"><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/05/WinRAR_Compression.png"><img width="220" height="150" alt="WinRAR Date Settings (Compression)" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/WinRAR_Compression-220x150.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 1 &#8211; WinRAR Date Settings (Compression)</p>
</div>
<div class="one_half last"><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/05/WinRAR_Extraction.png"><img width="220" height="150" alt="WinRAR Date Settings (Extraction)" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/WinRAR_Extraction-220x150.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 2 &#8211; WinRAR Date Settings (Extraction)</p>
</div>
<div class="clearboth"></div>
<h3>Encryption</h3>
<p>Security is always a valid concern when evidence is transmitted electronically. When compressing files, we recommend using encrypted archives as an additional security measure. The encryption password should be strong, and different from the credentials required to access the file transfer system. WinZip and 7-Zip support AES-256 encryption while WinRAR uses a maximum AES key size of 128 bits. WinRAR and 7-Zip support encrypting file names while WinZip does not.</p>
<h3>Long File Paths</h3>
<p>Another common issue is compressing or decompressing files with very long paths. In most cases, you can work around this issue by mapping the source (if compressing) or destination (if decompressing) folder path as a network drive and accessing the files through that drive letter. For example, let&#8217;s assume that the files we would like to compress are in the following folder:</p>
<p>\\server\share\Case Documents\Client Name\Sources\Case Name\Date\Data Set 1</p>
<p>We can map this folder to a drive letter such as Z: using the following command:</p>
<code class="code">net use Z: "\\server\share\Case Documents\Client Name\Sources\Case Name\Date\Data Set 1"</code>
<p>We can now compress the contents of the Z: drive or extract files to it. In this example, this shaves 73 characters off each file path.</p>
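<p>The arithmetic behind the savings can be checked with a few lines of Python. This is only an illustration of how the path shortens; the helper <code>remap</code> and the file name used in the example are hypothetical.</p>

```python
UNC_ROOT = r"\\server\share\Case Documents\Client Name\Sources\Case Name\Date\Data Set 1"
DRIVE = "Z:"


def remap(path: str, unc_root: str = UNC_ROOT, drive: str = DRIVE) -> str:
    """Rewrite a path under the UNC root so that it starts at the mapped drive."""
    if not path.startswith(unc_root):
        raise ValueError("path is not under the mapped folder")
    return drive + path[len(unc_root):]


# Characters removed from every path below the mapped folder.
saved = len(UNC_ROOT) - len(DRIVE)
print(saved)  # 73

# Hypothetical file name, used only to show the rewritten path.
print(remap(UNC_ROOT + r"\Mailbox\message.msg"))  # Z:\Mailbox\message.msg
```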
<h3>Conclusion</h3>
<ul class="list2 list_color_blue">
<li>It is important to choose the right file compression tool and familiarize yourself with all available options. We typically recommend WinRAR as it provides a combination of good performance, compression and security features</li>
<li>Encrypting file archives before sending them over the internet can be a valuable additional security measure</li>
<li>In most cases, file paths that are too long can be accessed by mapping the parent folder as a network drive</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/how-to/transferring-electronic-evidence-in-file-containers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Frequently Asked Questions About De-Duplication</title>
		<link>http://www.meridiandiscovery.com/articles/frequently-asked-questions-about-de-duplication/</link>
		<comments>http://www.meridiandiscovery.com/articles/frequently-asked-questions-about-de-duplication/#comments</comments>
		<pubDate>Tue, 24 Apr 2012 15:27:06 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[e-Discovery]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1793</guid>
		<description><![CDATA[De-duplication is used extensively in digital forensics and e-Discovery as a way of culling documents. While the process itself is simple, de-duplication can be performed in numerous ways which affect review time, cost and your understanding of the custodians. Here are some questions that frequently come up while we discuss de-duplication options with clients: Q1: ...]]></description>
			<content:encoded><![CDATA[<p>De-duplication is used extensively in digital forensics and e-Discovery as a way of culling documents. While the process itself is simple, de-duplication can be performed in numerous ways which affect review time, cost and your understanding of the custodians. Here are some questions that frequently come up while we discuss de-duplication options with clients:<br/><br/></p>
<h4>Q1: What do &#8220;horizontal de-duplication&#8221; and &#8220;vertical de-duplication&#8221; mean?</h4>
<p>These terms describe the scope of de-duplication. Horizontal de-duplication is performed globally, across custodians, while vertical de-duplication limits the scope to each custodian&#8217;s own documents.</p>
<p>When documents are de-duplicated horizontally, all but one copy of a document is removed from the document universe regardless of which custodians had the same document. On the other hand, when de-duplication is performed vertically, multiple copies of the same document may be left in the document universe as long as each copy originated from a different custodian. This would allow the legal team to gain a greater understanding of the custodians at the expense of having more documents in the review database.</p>
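<p>The difference between the two scopes comes down to the key used to decide whether a document has been seen before. The following Python sketch illustrates the idea with made-up custodians and hash values; it is not how any particular e-Discovery platform implements de-duplication.</p>

```python
def deduplicate(documents, scope):
    """Keep the first copy of each document within the chosen scope.

    documents: iterable of (custodian, doc_hash, file_name) tuples.
    scope: "horizontal" (across custodians) or "vertical" (per custodian).
    """
    if scope not in ("horizontal", "vertical"):
        raise ValueError("scope must be 'horizontal' or 'vertical'")
    seen = set()
    kept = []
    for custodian, doc_hash, file_name in documents:
        # Horizontal: the hash alone identifies a duplicate.
        # Vertical: a duplicate must also come from the same custodian.
        key = doc_hash if scope == "horizontal" else (custodian, doc_hash)
        if key in seen:
            continue
        seen.add(key)
        kept.append((custodian, doc_hash, file_name))
    return kept


# Hypothetical document universe.
docs = [
    ("Alice", "h1", "report.doc"),
    ("Bob", "h1", "report (copy).doc"),
    ("Alice", "h1", "report_final.doc"),
    ("Bob", "h2", "budget.xls"),
]
print(len(deduplicate(docs, "horizontal")))  # 2
print(len(deduplicate(docs, "vertical")))    # 3
```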
<h4>Q2: I chose to have my documents de-duplicated prior to processing. Now, how do I track down duplicate copies of a certain document?</h4>
<p>When your documents are de-duplicated, duplicates are flagged in the service provider&#8217;s back-end database instead of being removed permanently. Every time de-duplication is performed, your service provider should include a de-duplication report that lists de-duplicated copies of each document and at a minimum their hashes, sizes, file names and folder paths.</p>
<h4>Q3: How are attachment families handled during de-duplication?</h4>
<p>De-duplication should normally be performed at the attachment family level rather than document level. In other words, an e-mail message and all of its attachments would have to be identical to another e-mail family in order for them to be considered duplicates. This would ensure that an e-mail attachment would not be de-duplicated against a loose electronic document and removed from its family.</p>
<h4>Q4: What is a cryptographic hash?</h4>
<p>A cryptographic hash is a fixed-size signature for an arbitrary block of data that represents its contents. Hash algorithms are designed such that even the slightest change to the original data changes the signature dramatically. These signatures can then be used to compare documents and identify duplicates. Message Digest 5 (MD5) and SHA-1 are two popular cryptographic hash functions used in e-Discovery.</p>
<h4>Q5: Can two documents with different file names or dates be considered duplicates?</h4>
<p>Yes. De-duplication is usually performed by comparing cryptographic hashes (e.g. MD5, SHA1 etc.) of documents to each other. The calculated hash values are based on the binary contents of documents and do not take into account external metadata that is stored in the file system. Therefore, two files with the same contents but different file names would produce the same hash value.</p>
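<p>This is easy to demonstrate. The short Python sketch below writes the same contents to two differently named temporary files and compares their MD5 hashes; the file names are placeholders.</p>

```python
import hashlib
import os
import tempfile


def file_md5(path):
    """Return the MD5 hex digest of a file's binary contents."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    first = os.path.join(folder, "contract.txt")
    second = os.path.join(folder, "contract (copy).txt")
    for path in (first, second):
        with open(path, "wb") as handle:
            handle.write(b"identical contents")
    hash_first, hash_second = file_md5(first), file_md5(second)

# The file name is not part of the hashed data, so the digests match.
print(hash_first == hash_second)  # True
```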
<p>Most e-Discovery service providers would allow you to use a custom hash that includes your choice of metadata fields in addition to document contents for de-duplication. For example, you could choose to include the file name field in your custom hash if you would like to make sure that documents can be considered duplicates only when their file names are also identical.</p>
<p><b>Note:</b> Some document types (e.g. Microsoft Office documents, Adobe PDF files etc.) contain internal metadata including dates. Since this information is stored inside the document, it would affect the calculated hash value.</p>
<h4>Q6: Are e-mails hashed the same way as loose electronic documents?</h4>
<p>An e-mail is essentially a set of fields stored in a container. This container can hold an individual message (e.g. an MSG file) or it can be a database-like structure containing multiple messages (e.g. PST, NSF etc.). Consequently, most e-Discovery software computes cryptographic hashes for e-mails based on the values found in a predetermined set of fields. The following is a list of fields typically used for e-mail de-duplication: Author, Recipient(s), CC, BCC, Date Sent, Subject, Attachment Count, Attachment Names, Message Body</p>
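<p>A field-based e-mail hash can be sketched in Python as follows. The field names, the separator and the way lists are normalized are our own assumptions for illustration; actual e-Discovery software makes its own choices here.</p>

```python
import hashlib

# Assumed field names, mirroring the list of fields above.
DEDUP_FIELDS = (
    "author", "recipients", "cc", "bcc", "date_sent",
    "subject", "attachment_count", "attachment_names", "body",
)


def email_hash(message: dict) -> str:
    """Hash only the fields selected for de-duplication."""
    parts = []
    for field in DEDUP_FIELDS:
        value = message.get(field, "")
        if isinstance(value, (list, tuple)):
            # Sort multi-valued fields so that ordering does not matter.
            value = ";".join(sorted(value))
        parts.append(f"{field}={value}")
    canonical = "\x1f".join(parts)  # unit separator between fields
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Hypothetical message; "is_read" is not a de-duplication field.
copy_a = {"author": "a@example.com", "recipients": ["b@example.com"],
          "date_sent": "2012-04-16T20:08:00Z", "subject": "Status",
          "attachment_count": 0, "body": "See attached.", "is_read": True}
copy_b = dict(copy_a, is_read=False)
print(email_hash(copy_a) == email_hash(copy_b))  # True
```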
<p>E-mail messages contain many more fields than those typically used for de-duplication, and some of these fields can vary among multiple copies of an e-mail message. For example, four copies of a message may be marked as read while a fifth copy is unread. Depending on the nature of the case, the contents of these additional fields may or may not be relevant. However, it is always important to clearly define what should be considered a duplicate at the outset of each project.</p>
<h4>Q7: What are the chances of two different documents having the same MD5 hash?</h4>
<p>Extremely small. It would take a set on the order of 2^64 (18,446,744,073,709,551,616) documents before the chance of any two different documents sharing the same MD5 hash approached 50% (birthday paradox).</p>
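<p>The birthday estimate can be checked numerically. The Python sketch below uses the standard approximation p = 1 - exp(-n(n-1) / (2 * 2^bits)) and assumes MD5 output behaves like a uniformly random 128-bit value. Under this approximation the probability at exactly 2^64 documents comes out near 39%, and it crosses 50% at roughly 1.18 x 2^64 documents, so it is the order of magnitude that matters.</p>

```python
import math


def collision_probability(num_documents: int, hash_bits: int = 128) -> float:
    """Approximate chance that at least two documents share a hash value."""
    exponent = -num_documents * (num_documents - 1) / (2 * 2 ** hash_bits)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


print(round(collision_probability(2 ** 64), 4))  # 0.3935
print(collision_probability(10 ** 9))            # about 1.5e-21 for a billion documents
```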
<h4>Q8: I heard that MD5 is now considered cryptographically broken. Is it good enough to identify duplicate documents?</h4>
<p>The fact that MD5 is now cryptographically broken means that an attacker can create a pair of non-identical files that produce the same MD5 hash (a collision attack). Keep in mind that this is different from a preimage attack, where an attacker produces a file that matches a specific, known hash value. There are no known practical preimage attacks against MD5 as of this writing.</p>
<p>Briefly, MD5 is currently considered suitable for identifying duplicate documents. However, our recommendation is to hash each document using more than one algorithm (e.g. MD5 and SHA-1) to alleviate security concerns. We anticipate that SHA-256 will be the new hash standard for e-Discovery in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/articles/frequently-asked-questions-about-de-duplication/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Time Zones in e-Discovery</title>
		<link>http://www.meridiandiscovery.com/articles/time-zones-in-e-discovery/</link>
		<comments>http://www.meridiandiscovery.com/articles/time-zones-in-e-discovery/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 18:35:10 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[Computer Forensics]]></category>
		<category><![CDATA[e-Discovery]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1573</guid>
		<description><![CDATA[Date/time information extracted from e-mails and electronic documents is a major aspect of electronic evidence. In order to interpret and display the extracted timestamps correctly, most digital forensics and e-Discovery software require the end user to specify a time zone. The selected time zone can have numerous effects such as the appearance of timestamps on ...]]></description>
			<content:encoded><![CDATA[<p>Date/time information extracted from e-mails and electronic documents is a major aspect of electronic evidence. In order to interpret and display the extracted timestamps correctly, most digital forensics and e-Discovery software require the end user to specify a time zone. The selected time zone can have numerous effects such as the appearance of timestamps on printed e-mails or whether or not certain documents fall within the relevant time frame during culling. Especially in cases that involve multiple time zones, it is critical to determine how time zones should be handled in order to avoid potential problems down the road.</p>
<h3>What are Time Zones?</h3>
<p>Because the earth is a sphere that rotates on its axis, different parts of our planet experience different parts of the day at the same moment. Time zones are used to ensure that the same clock time corresponds to roughly the same part of the day throughout the world. Traditionally, all time zones were represented as an offset from Greenwich Mean Time (GMT). As technology advanced and more accurate timekeeping devices became available, Coordinated Universal Time (UTC) emerged as the new international time standard. UTC is based on atomic time and is occasionally adjusted with leap seconds to account for irregularities in the earth&#8217;s rotation.</p>
<h3>How Date/Time Values are Stored</h3>
<p>Different computer systems and software use different methods to store timestamps. The following are a few examples that we encounter frequently.</p>
<h4>1. Microsoft Outlook E-mails</h4>
<p>Microsoft Outlook stores e-mail dates in UTC and displays them in the end user&#8217;s local time zone. The example e-mail in Figure 1 was sent on 04/16/2012 at 1:08 PM Pacific Daylight Time (PDT). You can see that Outlook displays the sent date in the local time zone (PDT) while it stores it internally as the PR_CLIENT_SUBMIT_TIME value in UTC. On the other hand, the message header shows the same value in Eastern Daylight Time (EDT) since the server through which the message was sent was in EDT.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/04/OutlookMessage_w_Arrows.png"><img width="220" height="150" alt="Dates in an Outlook E-mail" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/OutlookMessage_w_Arrows-220x150.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 1 &#8211; Dates in an Outlook E-mail</p>
<h4>2. Microsoft Office Documents</h4>
<p>Most Microsoft Office documents also store internal document metadata in UTC. For example, the Microsoft Word document in Figure 2 was created on 4/16/2012 at 1:22 PM (PDT). Looking at the docProps\core.xml file reveals that the internal metadata was stored as &#8220;2012-04-16T20:22:00Z&#8221;, which is the same as 04/16/2012 8:22 PM (UTC).</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/04/WordDocument1.png"><img width="220" height="150" alt="Dates in a Word Document" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/WordDocument1-220x150.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 2 &#8211; Dates in a Word Document</p>
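<p>The conversion between the stored UTC value and the displayed local time can be reproduced in a few lines of Python. To keep the sketch self-contained we assume a fixed UTC-7 offset for PDT rather than looking the zone up in a time zone database.</p>

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for Pacific Daylight Time.
PDT = timezone(timedelta(hours=-7), "PDT")

# Value as stored in docProps\core.xml; the trailing Z means UTC.
stored = datetime.strptime("2012-04-16T20:22:00Z", "%Y-%m-%dT%H:%M:%SZ")
stored = stored.replace(tzinfo=timezone.utc)

local = stored.astimezone(PDT)
print(local.isoformat())  # 2012-04-16T13:22:00-07:00
```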
<h4>3. File Systems</h4>
<p>File systems (e.g. FAT32, NTFS etc.) also store valuable date/time information about files. However, the way they store this information varies. For example, NTFS stores timestamps in UTC, as the number of 100-nanosecond intervals since 1/1/1601 (UTC). On the other hand, FAT file systems store date/time values in local time without a time offset. This means that one would have to know the time zone in which a file was created and used in order to determine the actual point in time that its timestamp represents. For example, imagine the following scenario:</p>
<ul class="list1 list_color_blue">
<li><u>File 1:</u> Created in a FAT32 file system on 06/01/2004 at 9:32:15 PM (PDT)</li>
<li><u>File 2:</u> Created in an NTFS file system on 06/01/2004 at 9:32:15 PM (PDT)</li>
</ul>
<p>If both files were viewed today on a computer in EDT, File 1&#8217;s creation timestamp would appear as 06/01/2004 9:32:15 PM while File 2&#8217;s timestamp would be 06/02/2004 12:32:15 AM. The difference arises because FAT32 stores only the date and time, without a time offset. Even if the file is later viewed in a different time zone such as EDT, its timestamp is shown as stored and is not adjusted to the time zone of the viewer.</p>
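<p>The scenario above can be modeled in Python. The sketch below converts between datetimes and NTFS-style 100-nanosecond counts, and represents the FAT32 timestamp as a naive local time; the fixed offsets for PDT and EDT and the helper names are our own assumptions.</p>

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
PDT = timezone(timedelta(hours=-7), "PDT")  # assumed fixed offsets
EDT = timezone(timedelta(hours=-4), "EDT")


def utc_to_filetime(moment: datetime) -> int:
    """Count of 100-nanosecond intervals since 1/1/1601 (UTC)."""
    delta = moment - FILETIME_EPOCH
    return (delta.days * 86400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def filetime_to_utc(filetime: int) -> datetime:
    """Inverse of utc_to_filetime, truncated to microsecond precision."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


created = datetime(2004, 6, 1, 21, 32, 15, tzinfo=PDT)

ntfs_value = utc_to_filetime(created)     # absolute point in time
fat_value = created.replace(tzinfo=None)  # local clock reading, no offset

ntfs_in_edt = filetime_to_utc(ntfs_value).astimezone(EDT)
fat_in_edt = fat_value                    # nothing to adjust against

print(ntfs_in_edt.strftime("%m/%d/%Y %H:%M:%S"))  # 06/02/2004 00:32:15
print(fat_in_edt.strftime("%m/%d/%Y %H:%M:%S"))   # 06/01/2004 21:32:15
```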
<h3>Potential Issues</h3>
<h4>1. Processing Output</h4>
<p>Applications that store timestamps in UTC typically adjust the displayed value depending on the end user&#8217;s selected time zone. This means that the same document can produce different output during e-Discovery processing depending on the chosen processing time zone value. For example, the processing output for the test e-mail we mentioned earlier would be as follows:</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/04/Email_Output.png"><img width="220" height="150" alt="E-mail Output in Different Time Zones" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Email_Output-220x150.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 3 &#8211; E-mail Output in Different Time Zones</p>
<p>As seen in Figure 3, the e-mail message was printed with a sent time of 4:08 PM when processed using the Eastern Daylight Time (EDT) zone. While this may not always be an issue, it is important to ensure that legal teams understand that the printed date/time values may or may not be the actual local date/time the e-mail was sent by its author or received by its recipient. In some cases, it may be possible to process e-mails in a way that the actual time offset is displayed in the output (see Figure 4).</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/04/Email_w_Offset.png"><img width="220" height="150" alt="E-mail Printed with Time Offset" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Email_w_Offset-220x150.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 4 &#8211; E-mail Printed with Time Offset</p>
<p>Additionally, making sure that the review database always contains fielded information regarding the selected processing time zone and the UTC timestamps for each document would help alleviate this issue.</p>
<h4>2. De-Duplication</h4>
<p>Most e-Discovery platforms create cryptographic hash values for e-mails based on a combination of several metadata fields such as sent date, author, e-mail subject etc. While calculating these hash values, e-Discovery software should use UTC timestamps rather than local time. Otherwise, e-mail hashes would be a function of local time and would change depending on the selected time zone. This means that two copies of the same e-mail processed in different time zones would not be treated as duplicates.</p>
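<p>The effect is straightforward to reproduce. In the Python sketch below, two copies of the same e-mail carry the same sent time expressed in different zones; the hash only matches when the timestamp is normalized to UTC first. The field selection, separator and fixed offsets are assumptions made for illustration.</p>

```python
import hashlib
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")  # assumed fixed offsets
EDT = timezone(timedelta(hours=-4), "EDT")


def hash_with_date(author: str, subject: str, sent: datetime, use_utc: bool) -> str:
    """Hash a few e-mail fields, optionally normalizing the sent date to UTC."""
    moment = sent.astimezone(timezone.utc) if use_utc else sent
    text = "|".join([author, subject, moment.strftime("%Y-%m-%d %H:%M:%S")])
    return hashlib.md5(text.encode("utf-8")).hexdigest()


sent_utc = datetime(2012, 4, 16, 20, 8, tzinfo=timezone.utc)
copy_pdt = sent_utc.astimezone(PDT)  # processed in Pacific time
copy_edt = sent_utc.astimezone(EDT)  # processed in Eastern time

same = hash_with_date("a@example.com", "Status", copy_pdt, True) == \
    hash_with_date("a@example.com", "Status", copy_edt, True)
different = hash_with_date("a@example.com", "Status", copy_pdt, False) != \
    hash_with_date("a@example.com", "Status", copy_edt, False)
print(same, different)  # True True
```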
<h4>3. Date Restrictions</h4>
<p>When documents are culled using date restrictions, the selected time zone may determine whether or not a document falls within the relevant date range. For example, imagine an e-mail that was dated 04/30/2009 9:05 PM (PDT). If a date restriction was performed using Eastern Time, this e-mail would not fall into the date range 1/1/2009 &#8211; 4/30/2009 since the e-mail&#8217;s local date would be interpreted as 5/1/2009 12:05 AM (EDT).</p>
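<p>The example above can be expressed as a small Python check. The function <code>in_date_range</code> is a hypothetical helper, and fixed offsets stand in for PDT and EDT.</p>

```python
from datetime import date, datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")  # assumed fixed offsets
EDT = timezone(timedelta(hours=-4), "EDT")


def in_date_range(moment: datetime, start: date, end: date, zone: timezone) -> bool:
    """Apply an inclusive date restriction using the given processing time zone."""
    local_date = moment.astimezone(zone).date()
    return start <= local_date <= end


email_sent = datetime(2009, 4, 30, 21, 5, tzinfo=PDT)
start, end = date(2009, 1, 1), date(2009, 4, 30)

print(in_date_range(email_sent, start, end, PDT))  # True
print(in_date_range(email_sent, start, end, EDT))  # False
```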
<h4>4. Daylight Saving Time</h4>
<p>Daylight Saving Time (DST) is the practice of advancing clocks by one hour in spring and moving them back in autumn so that evenings have more daylight and mornings have less. The start and end dates of DST have changed several times in the past, making it a challenge to determine the actual local timestamps of older documents. For example, in order to determine the correct local timestamp of a file dated 3/12/1991 12:45:31 PM (UTC) that originated from Sweden, one would have to know whether or not DST was in effect at that time in Sweden. The <a href="http://www.iana.org/time-zones" title="tz database" target="_blank" rel="nofollow">tz database</a> is a valuable resource which contains the history of local time for many representative regions in the world.</p>
<h3>Conclusion</h3>
<p>We believe that taking the time to ensure that time zones are handled correctly in a project should be at the top of a project manager&#8217;s priority list. Briefly, the following key points should be considered:</p>
<ul class="list2 list_color_blue">
<li>Legal teams should decide how time zones should be handled at the outset of each project and communicate their preference to the e-Discovery service provider as part of e-Discovery processing specifications. It is common practice to normalize all times to UTC and capture each custodian&#8217;s time offset in cases involving multiple time zones</li>
<li>e-Discovery software should normalize document timestamps to UTC while performing de-duplication</li>
<li>The time zone used during e-Discovery processing as well as the UTC timestamps of each file should be included as additional fields in the review database</li>
<li>Effects of time zones and Daylight Saving Time should be considered while constructing date restrictions</li>
<li>Legal teams should be aware of the fact that depending on the chosen time zone, timestamps printed as part of processed documents (e.g. e-mail dates) may or may not reflect the actual local time when the e-mail was sent or received</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/articles/time-zones-in-e-discovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Validating Copy Results Using md5deep</title>
		<link>http://www.meridiandiscovery.com/how-to/validating-copy-results-using-md5deep/</link>
		<comments>http://www.meridiandiscovery.com/how-to/validating-copy-results-using-md5deep/#comments</comments>
		<pubDate>Wed, 28 Mar 2012 22:42:45 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[How-to]]></category>
		<category><![CDATA[e-Discovery]]></category>
		<category><![CDATA[md5deep]]></category>
		<category><![CDATA[robocopy]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1454</guid>
		<description><![CDATA[In our previous post Robocopy in e-Discovery, we wrote about copying electronic evidence using Robocopy and preserving file system metadata. Robocopy is a great tool for copying files, but it does not offer an option to hash the source and destination files. While this may not be necessary for casual personal use, being able to ...]]></description>
			<content:encoded><![CDATA[<p>In our previous post <a href="http://www.meridiandiscovery.com/software/robocopy-in-ediscovery/" title="Robocopy in e-Discovery" target="_blank">Robocopy in e-Discovery</a>, we wrote about copying electronic evidence using Robocopy and preserving file system metadata. Robocopy is a great tool for copying files, but it does not offer an option to hash the source and destination files. While this may not be necessary for casual personal use, being able to confirm that the output files are identical to the source files using cryptographic hashes is crucial when working with electronic evidence.</p>
<p>There are commercial off-the-shelf file copy tools which have this functionality built-in, but they usually lack the flexibility that Robocopy offers. If you are a Robocopy fan, and do not mind a little bit of command line work, follow along and we will show you how to validate Robocopy results using the freely available software package md5deep.</p>
<h3>What is md5deep?</h3>
<p>md5deep is a suite of command line applications in the public domain. md5deep itself calculates MD5 hashes of files, while its companion programs (sha1deep, sha256deep, tigerdeep and whirlpooldeep) calculate SHA-1, SHA-256, Tiger and Whirlpool hashes. It can walk through directories recursively and calculate the hash of each file it encounters, or work from a text-based file listing. We chose to use md5deep for this post because it is fast, robust and free.</p>
<h3>Required Tools</h3>
<ul class="list2 list_color_blue">
<li>md5deep &#8211; Available for free at <a href="http://md5deep.sourceforge.net" target="_blank" rel="nofollow">http://md5deep.sourceforge.net</a></li>
<li>Your text editor of choice (e.g. UltraEdit, TextPad etc.)</li>
</ul>
<p>After you download md5deep, either copy md5deep.exe to your Windows\System32 folder or add its folder to your PATH environment variable so that it can be run from any location.</p>
<p>This post assumes that:</p>
<ul class="list2 list_color_blue">
<li>You have copied a set of files using Robocopy as outlined in <a href="http://www.meridiandiscovery.com/software/robocopy-in-ediscovery/" title="Robocopy in e-Discovery" target="_blank">this post</a></li>
<li>The copy operation completed successfully without any errors</li>
<li>Your source and destination folder paths were as follows:<br/><br />
<b>Source:</b> D:\MySourceFiles\<br />
<b>Destination:</b> E:\MyDestination\
</li>
</ul>
<h3>Step 1: Calculate the Hashes of The Source Files</h3>
<p>We will use md5deep to calculate the hashes of all files in our source folder (&#8220;D:\MySourceFiles\&#8221;). md5deep outputs the calculated MD5 hash values to the console. In order to save the output, we will redirect it to a text file using the &#8220;&gt;&#8221; symbol. The steps are as follows:</p>
<ul class="list1 list_color_blue">
<li>As always, make sure that your source is write-protected before accessing it</li>
<li>Open a command prompt at your source folder &#8220;D:\MySourceFiles&#8221;</li>
<li>Issue the following command:<br/><br/>
<code class="code">md5deep -rel * &gt; "C:\Temp\InputHashes.md5"</code>
<p><i>The -rel switch combines three options: -r enables recursive mode, -e displays a progress indicator and -l uses relative file paths.</i></p>
</li>
</ul>
<p>This will create a list of MD5 hashes for each file contained in your source folder. The list should look as in the example below:<br />
<a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/InputHashes.png"><img width="220" height="150" alt="Input Files MD5 Hash List" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/InputHashes-220x150.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 1 &#8211; Input Files MD5 Hash List</p>
<p>Note that the resultant file &#8220;InputHashes.md5&#8221; is a UTF-8 encoded text file and non-English characters in the file names are preserved.</p>
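<p>For readers who want to see what this step computes, the following Python sketch builds a similar list of MD5 hashes keyed by relative path. It is an illustration under our own assumptions (the function name <i>hash_tree</i> is hypothetical) and not a replacement for md5deep:</p>

```python
import hashlib
import os

def hash_tree(root: str) -> dict[str, str]:
    """Return {relative path: MD5 hex digest} for every file under root."""
    hashes = {}
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            digest = hashlib.md5()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so that large files do not have to fit in memory
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            # Relative paths keep the source and destination listings comparable
            hashes[os.path.relpath(full_path, root)] = digest.hexdigest()
    return hashes
```

<p>Calling <i>hash_tree(r"D:\MySourceFiles")</i> would return one entry per file, similar in content to the lines of &#8220;InputHashes.md5&#8221;.</p>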
<h3>Step 2: Calculate the Hashes of The Output Files</h3>
<p>Similarly, we can calculate the MD5 hashes of the output files using md5deep as follows:</p>
<ul class="list1 list_color_blue">
<li>Open a command prompt at your destination folder &#8220;E:\MyDestination&#8221;</li>
<li>Issue the following command:<br/><br/>
<code class="code">md5deep -rel * &gt; "C:\Temp\OutputHashes.md5"</code>
</li>
</ul>
<h3>Step 3: Compare The Hash Lists</h3>
<p>At this point, you should have two hash lists: &#8220;InputHashes.md5&#8221;, which contains a list of MD5 hashes for the source files, and &#8220;OutputHashes.md5&#8221;, which contains a list of MD5 hashes for the output files. Since we chose the relative file path option while using md5deep, both hash lists should contain the same folder paths. Consequently, if all files were copied correctly, both hash lists should be identical.</p>
<p>We can easily check whether or not this is the case by hashing the hash lists and comparing them. We will use the following commands:</p>
<code class="code">md5deep "C:\Temp\InputHashes.md5" &gt; "C:\Temp\Comparison.txt"
md5deep "C:\Temp\OutputHashes.md5" &gt;&gt; "C:\Temp\Comparison.txt"</code>
<p>Contents of &#8220;Comparison.txt&#8221; should be as follows:<br/><br />
<a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/Comparison.png"><img width="220" height="150" alt="Comparison.txt Contents" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Comparison-220x150.png" /></a> </p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 2 &#8211; Comparison.txt Contents</p>
<p>If both hash values are identical, we can conclude that each output file has the same MD5 hash value as the corresponding source file. If you would like to confirm that the process is working correctly, you can edit &#8220;OutputHashes.md5&#8221; and change one of the MD5 hashes. When you re-run Step 3, the MD5 hash of the output hash list should be different from that of the input hash list.</p>
<h3>What If The Hashes Do Not Match?</h3>
<p>In the event that the two hash lists turn out to be different, you may want to determine which files have different hash values. A quick and easy way to accomplish this is to use a file comparison tool such as <a href="http://www.winmerge.org" title="WinMerge" target="_blank" rel="nofollow">WinMerge</a> (open source) or UltraCompare (commercial). These tools allow two files to be opened side by side (&#8220;InputHashes.md5&#8221; and &#8220;OutputHashes.md5&#8221; in this case) and highlight the differences.</p>
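<p>If you prefer a script to a visual comparison tool, the same check can be sketched in Python. The sketch assumes that each line of a hash list consists of a hash, whitespace and a relative path; the function names and the sample hash values are made up for illustration. Keying the comparison on the file path also makes it independent of the order in which the files were listed.</p>

```python
def parse_hash_list(text: str) -> dict[str, str]:
    """Parse lines of the form "hash, whitespace, path" into {path: hash}."""
    entries = {}
    for line in text.splitlines():
        if line.strip():
            digest, path = line.split(None, 1)
            entries[path.strip()] = digest.lower()
    return entries

def compare_hash_lists(source: dict[str, str], destination: dict[str, str]) -> dict[str, list[str]]:
    """Report paths that are missing, unexpected or hashed differently in the destination."""
    common = set(source).intersection(destination)
    return {
        "missing": sorted(set(source).difference(destination)),
        "extra": sorted(set(destination).difference(source)),
        "mismatched": sorted(path for path in common if source[path] != destination[path]),
    }

# Made-up hash lists for illustration only
source_list = r"""
11111111111111111111111111111111  File1.txt
22222222222222222222222222222222  File2.txt
33333333333333333333333333333333  Folder1\File3.txt
"""
destination_list = r"""
11111111111111111111111111111111  File1.txt
d41d8cd98f00b204e9800998ecf8427e  Folder1\File3.txt
"""

report = compare_hash_lists(parse_hash_list(source_list), parse_hash_list(destination_list))
print(report)
```

<p>With real data, the two strings would be read from &#8220;InputHashes.md5&#8221; and &#8220;OutputHashes.md5&#8221; using UTF-8 encoding.</p>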
<h3>Couldn&#8217;t We Have Used The Negative Matching Mode In md5deep?</h3>
<p>md5deep has an option that enables negative matching mode. In this mode, the program takes a list of known hashes and identifies files that are outside of that list. For example, opening a command prompt at your destination folder &#8220;E:\MyDestination&#8221; and issuing the following command would create a list of all files in your destination folder that have hashes outside of the input hash list:</p>
<code class="code">md5deep -rx "C:\Temp\InputHashes.md5" * &gt; "C:\Temp\Mismatches.txt"</code>
<p>This looks like a very efficient way to determine which files were not copied correctly. However, if we had used this method, we would have missed two scenarios:</p>
<ol>
<li>Files completely missing from the destination</li>
<li>Files that do not match their source file by MD5 hash, but match another file in the source data set</li>
</ol>
<p>The second scenario may sound far-fetched, but consider the following example: The input folder contains a number of 0-byte files. If a file, which was not originally a 0-byte file, does not get copied correctly and becomes a 0-byte file in the destination, it would not be identified using the negative matching method because its hash matches that of other files in the source data set.</p>
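<p>The 0-byte example can be reproduced with a few lines of Python. The file names and the first hash value below are hypothetical; the second hash is the MD5 value of an empty file. This is a sketch of the logic only and not of how md5deep implements negative matching:</p>

```python
EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"  # MD5 hash of a 0-byte file

# Hypothetical source and destination listings as {relative path: MD5 hash}
source = {"Empty.txt": EMPTY_MD5, "Report.doc": "0123456789abcdef0123456789abcdef"}
destination = {"Empty.txt": EMPTY_MD5, "Report.doc": EMPTY_MD5}  # Report.doc became a 0-byte file

# Negative matching: flag destination files whose hash is not in the known list
known_hashes = set(source.values())
flagged_by_negative_matching = sorted(
    path for path, digest in destination.items() if digest not in known_hashes
)

# Path-by-path comparison: flag destination files whose hash differs from their own source file
flagged_by_path_comparison = sorted(
    path for path, digest in destination.items() if source.get(path) != digest
)

print(flagged_by_negative_matching)  # []
print(flagged_by_path_comparison)    # ['Report.doc']
```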
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/how-to/validating-copy-results-using-md5deep/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embedded Objects in e-Discovery</title>
		<link>http://www.meridiandiscovery.com/articles/embedded-objects-in-e-discovery/</link>
		<comments>http://www.meridiandiscovery.com/articles/embedded-objects-in-e-discovery/#comments</comments>
		<pubDate>Wed, 21 Mar 2012 18:37:22 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[e-Discovery]]></category>
		<category><![CDATA[embedded objects]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1141</guid>
		<description><![CDATA[We believe that discussing project specifications at the onset of a project and getting clear and complete instructions is the first step in completing an e-Discovery project successfully. One of the questions we regularly ask is whether or not embedded objects should be extracted. Over the years, we have found that most of our new ...]]></description>
			<content:encoded><![CDATA[<p>We believe that discussing project specifications at the onset of a project and getting clear and complete instructions is the first step in completing an e-Discovery project successfully. One of the questions we regularly ask is whether or not embedded objects should be extracted. Over the years, we have found that most of our new clients require an explanation of what embedded objects are and the pros and cons of extracting them.</p>
<p>We typically recommend extracting embedded objects from all compound documents. However, we feel it is important that what this really means is clearly understood and that an informed decision is made based on case requirements. We have come up with a few points to consider that will hopefully help you determine which route to take.</p>
<h3>What are Embedded Objects?</h3>
<p>Many file types, including Microsoft Office and Adobe Acrobat files, act as containers and allow other documents to be linked to them or embedded in them. For example, one can embed a file into a Microsoft Word document by simply dragging it into an open Word document.</p>
<p>Depending on the file type and method used, the embedded document may or may not be directly visible in its parent document. For example, the contents of a single-page Visio drawing inserted into an Excel spreadsheet can be visible when the spreadsheet is viewed, while a ZIP file or an MSG file inserted into a Word document would typically be displayed as an icon and its contents would not be directly visible.</p>
<p>Extracting embedded objects means that the e-Discovery software identifies each linked or embedded document and extracts it (and its children recursively) as separate records during processing. Additionally, a parent/child relationship is established between the container document and the files embedded in it.
</p>
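<p>The recursive extraction and the resulting parent/child relationships can be pictured with a short Python sketch. The <i>Document</i> class, the function name and the file names inside the archive are assumptions made for illustration; actual e-Discovery software works on real file formats and assigns identifiers in its own way.</p>

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class Document:
    name: str
    embedded: list["Document"] = field(default_factory=list)

def extract_records(doc, parent_id=None, ids=None):
    """Flatten a compound document into (record id, parent id, name) rows, depth first."""
    ids = ids if ids is not None else count(1)
    record_id = next(ids)
    rows = [(record_id, parent_id, doc.name)]
    for child in doc.embedded:
        # Each child becomes its own record and points back to its container
        rows.extend(extract_records(child, record_id, ids))
    return rows

# Hypothetical Word document containing an archive (with two children) and an e-mail
word_doc = Document("Report.doc", [
    Document("Embedded File1.zip", [Document("Child1.xls"), Document("Child2.pdf")]),
    Document("Embedded File2.msg"),
])

for row in extract_records(word_doc):
    print(row)
```

<p>In this sketch a single Word document produces five database records: the parent, its two embedded objects and the two files inside the archive.</p>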
<h3>Advantages of Extracting Embedded Objects</h3>
<ul class="list1 list_color_blue">
<li>
<h4>Completeness:</h4>
<p> In some cases embedded documents would not be processed at all unless embedded objects are extracted. This could result in critical content being missing from your review database and production. Take a look at the following three examples:</p>
<h4>Example 1:  Documents Displayed As Icons</h4>
<p> Contents of documents that are displayed as icons in their parent document would not be processed during e-Discovery unless embedded objects are extracted. For example, the embedded archive (Embedded File1.zip) and the embedded e-mail message (Embedded File2.msg) in Figure 1 below would not be extracted and processed unless embedded objects are extracted. Please note that the embedded archive contains several child documents, which also need to be extracted recursively.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/WordScreenshot_wContents.png"><img width="292" height="190" alt="Word Document with Embedded Objects" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/WordScreenshot_wContents-292x190.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 1 &#8211; Word Document with Embedded Objects</p>
<h4>Example 2: Charts in Microsoft Office Documents</h4>
<p> When a chart is created in or inserted into an Office document, the underlying data is typically stored as an Excel spreadsheet. While the parent document only displays the resultant chart, the underlying spreadsheet can contain much more data than what is represented in the chart.</p>
<p>The example PowerPoint presentation in Figure 2 contains an embedded chart. When the underlying Excel workbook is opened, it becomes apparent that the workbook contains additional worksheets.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/PPT_w_Embedded_Chart.png"><img width="292" height="190" alt="Powerpoint with Embedded Chart" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/PPT_w_Embedded_Chart-292x190.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 2 &#8211; PowerPoint Document with Embedded Chart</p>
<h4>Example 3: PDF Portfolios</h4>
<p> PDF portfolios contain multiple documents combined into a single PDF file. Documents contained in a PDF portfolio can be in various formats such as spreadsheets, e-mails, PowerPoint presentations etc. For example, one can select certain e-mail messages in Microsoft Outlook and convert them &#8211; including their attachments &#8211; to a PDF portfolio using the Adobe Acrobat PDFMaker Outlook Addin.</p>
<p>Figure 3 shows a PDF portfolio created directly from Microsoft Outlook. One of the original e-mails contains an attachment (&#8220;Analysis.xls&#8221;), which was included in the portfolio as an embedded Excel spreadsheet.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/Portfolio.png"><img width="292" height="190" alt="Adobe PDF Portfolio" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Portfolio-292x190.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 3 &#8211; Adobe PDF Portfolio</p>
<p>Extracting embedded objects would ensure that each item in the PDF portfolio is extracted as a separate record and processed.</p>
</li>
<li>
<h4>Fully Searchable Database:</h4>
<p> Even though some documents may be visible in their parent document, they may not be fully searchable. Extracting embedded objects would ensure that each object is tested separately and OCR&#8217;ed if it is found to be missing extractable text.</p>
<h4>Example 1: Images Embedded Into Searchable Documents</h4>
<p> We often run into images embedded in otherwise searchable documents (for example, an excerpt from a scanned page embedded into the first page of an e-mail). This poses an interesting challenge. Most modern e-Discovery tools detect non-searchable documents and run them through optical character recognition (OCR) on-the-fly to make them searchable. However, this analysis is usually performed at the document level or the page level. If a page contains some searchable content, it passes as searchable and is not run through OCR.</p>
<p>In the scenario below (see Figure 4), we have a scanned image inserted into an otherwise searchable page. Most e-Discovery software would treat this page as a searchable page and would not attempt to OCR it. As a result, the text that could have been extracted from the embedded image via OCR would be lost. On the other end of the spectrum, if the scanned image is detected and the entire page is OCR&#8217;ed, the text obtained for the searchable part of the page would be less accurate than the text that could have been extracted from it directly. Extracting embedded objects is a good option to make such documents searchable. When extracted, the embedded image would become a separate document itself. It would easily be detected as a document without extracted text and OCR&#8217;ed.</p>
<p><a href="http://www.meridiandiscovery.com/base/wp-content/uploads/2012/03/Email.png"><img width="292" height="190" alt="Email with Embedded Image" src="http://www.meridiandiscovery.com/base/wp-content/themes/meridian/cache/images/Email-292x190.png" /></a></p>
<p style="text-align:center; font-size:10px; margin-top:-10px;">Figure 4 &#8211; E-mail with Embedded Inline Image</p>
</li>
<li>
<h4>Native Productions:</h4>
<p> Producing complex documents in native form is a common trend. One of the greatest risks of a native production is producing more than intended. For example, a spreadsheet or another complex file type can contain embedded documents that can be overlooked during review unless they are extracted as separate database records. Producing such a document in native format would mean giving your opponent data that you haven&#8217;t seen during review.</p>
<p>Extracting embedded objects helps make sure documents can be fully reviewed and minimizes the possibility of missing hidden content during review.</p>
</li>
</ul>
<h3>Disadvantages of Extracting Embedded Objects</h3>
<ul class="list1 list_color_blue">
<li>
<h4>More Documents to Review:</h4>
<p> When embedded objects are extracted, separate database records are created for each child document. This results in a more crowded database, and potentially more documents to be reviewed. Imagine a Word document with 20 embedded objects: When embedded objects are extracted, the number of database records for this document increases from 1 to 21. If the child documents in turn have objects embedded in them, the number of database records after extraction can be even larger.</p></li>
<li>
<h4>Duplicative Content:</h4>
<p> Some embedded objects may be entirely visible in their parent document, making the extracted version redundant. Keep in mind that, even though the embedded object may be visible, it may not be searchable (see &#8220;Images Embedded Into Searchable Documents&#8221; above). So, this is not always a disadvantage.</p></li>
<li>
<h4>Incomplete Objects:</h4>
<p> Some embedded objects may not be complete, self-sufficient documents. Consequently, some of the extracted documents may not be directly viewable using their native application. Most modern e-Discovery software can work around this issue and still extract text &#038; metadata from these objects and convert them to TIFF.</p></li>
<li>
<h4>De-Duplication Issues:</h4>
<p>Until recently, some off-the-shelf e-Discovery processing tools had problems when embedded objects were extracted and de-duplication was performed at the attachment family level. The issue was that some embedded objects had minute differences in content (and therefore in MD5 hash) each time they were extracted. Consequently, two identical compound documents were not de-duplicated against each other because one of their children had a different hash. To our knowledge, most software vendors have resolved this issue by now. That said, it is still worth keeping in mind when working with a new tool or service provider.</p>
</li>
</ul>
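<p>The de-duplication issue described above can be illustrated with a small Python sketch. Here we assume, purely for illustration, that a family-level hash is calculated by hashing the concatenated MD5 hashes of the parent and its children; actual tools may combine member hashes differently. The byte strings stand in for real file contents.</p>

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of the given bytes as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

def family_hash(member_hashes: list[str]) -> str:
    """Hash the ordered member hashes (parent first, then children) into one family-level value."""
    return md5_hex("".join(member_hashes).encode("ascii"))

# Two copies of the same compound document: the parent bytes are identical,
# but the embedded object came out slightly different in each extraction
parent = md5_hex(b"compound document bytes")
child_first_extraction = md5_hex(b"embedded chart, extraction 1")
child_second_extraction = md5_hex(b"embedded chart, extraction 2")

copy_a = family_hash([parent, child_first_extraction])
copy_b = family_hash([parent, child_second_extraction])

print(copy_a == copy_b)  # False: the two families are not treated as duplicates
```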
<h3>Conclusion</h3>
<p>We are proponents of extracting embedded objects from all compound documents. It is true that you may end up with more database records and it may take longer to review, tag and endorse the documents, but at least you can be confident that every document in the data set has been made searchable and reviewed. Bear in mind that it is easier to exclude database records from a production than it is to insert new records between existing documents. If you have embedded objects extracted up front, you would have the option to easily exclude them from your production if needed. On the other hand, if you don&#8217;t, and you later find out that some embedded objects should have been separate records, it would be more tedious to extract those objects after the fact and insert them where they belong.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/articles/embedded-objects-in-e-discovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Robocopy in e-Discovery</title>
		<link>http://www.meridiandiscovery.com/software/robocopy-in-ediscovery/</link>
		<comments>http://www.meridiandiscovery.com/software/robocopy-in-ediscovery/#comments</comments>
		<pubDate>Mon, 05 Mar 2012 21:06:00 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Computer Forensics]]></category>
		<category><![CDATA[e-Discovery]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1046</guid>
		<description><![CDATA[Most legal professionals regularly handle electronic evidence in one form or another. Even if you are not an e-Discovery or computer forensics expert, there are steps you can take to make sure you are not spoiling electronic evidence. Most of us are aware of the fact that opening a file usually changes file metadata as ...]]></description>
			<content:encoded><![CDATA[<p>Most legal professionals regularly handle electronic evidence in one form or another. Even if you are not an e-Discovery or computer forensics expert, there are steps you can take to make sure you are not spoiling electronic evidence.</p>
<p>Most of us are aware of the fact that opening a file usually changes file metadata as well as, in some cases, file contents. However, did you know that the mere act of copying a file from one folder to another using Windows Explorer causes the following changes?</p>
<ul class="list1 list_color_blue">
<li>The file system last accessed date of the source file is updated to the present date/time</li>
<li>The copy (destination file) receives the present date/time as its file system creation and last accessed dates</li>
</ul>
<p>File system date/time values are valuable information that can be captured during e-Discovery processing or a forensic examination and can be used to shed light on, among other things, when a document was created, accessed and modified.  You can help preserve this information by utilizing Robocopy to copy files instead of Windows Explorer.</p>
<p>Robust File Copy (Robocopy) is a free command-line replication tool from Microsoft. It has been a part of Windows distributions since Windows Vista, and is available as a separate download for earlier versions of Windows such as Windows 2003 and Windows XP. Even though it can take a large number of parameters and can look intimidating to use at first, it only takes a few minutes to get the hang of it and figure out which parameters suit your purposes the best.<br />
<br/></p>
<h3>Versions of Robocopy</h3>
<div class="table_style">
<table>
<thead>
<tr>
<th><b>Version</b></th>
<th><b>Year</b></th>
<th><b>Source</b></th>
</tr>
</thead>
<tr>
<td>1.7</td>
<td>1997</td>
<td>Windows NT Resource Kit</td>
</tr>
<tr>
<td>1.71</td>
<td>1997</td>
<td>Windows NT Resource Kit</td>
</tr>
<tr>
<td>1.95</td>
<td>1999</td>
<td>Windows 2000 Resource Kit</td>
</tr>
<tr>
<td>1.96</td>
<td>1999</td>
<td>Windows 2000 Resource Kit</td>
</tr>
<tr>
<td>XP010 </td>
<td>2003</td>
<td>Windows 2003 Resource Kit</td>
</tr>
<tr>
<td>XP026 </td>
<td>2005</td>
<td>Distributed with Robocopy GUI v.3.1.2</td>
</tr>
<tr>
<td>XP027 </td>
<td>2008</td>
<td>Bundled with Windows Vista, Server 2008 and later</td>
</tr>
<tr>
<td>6.1</td>
<td>2009</td>
<td>Bundled with Windows 7</td>
</tr>
</table>
</div>
<p>Using Robocopy version XP026 or higher is recommended as some of the options that we will refer to here were not available before that version. Robocopy XP026 can be downloaded as part of <a href="http://download.microsoft.com/download/f/d/0/fd05def7-68a1-4f71-8546-25c359cc0842/UtilitySpotlight2006_11.exe" title="Robocopy GUI 3.1.2" target="_blank">Robocopy GUI v.3.1.2</a>.<br />
<br/></p>
<h3>Robocopy Usage</h3>
<h4>Syntax</h4>
<p>The basic Robocopy syntax is as follows:</p>
<code class="code">robocopy &lt;Source&gt; &lt;Destination&gt; [&lt;File&gt;[ ...]] [&lt;Options&gt;]</code>
<table>
<tr>
<td><b>Source:</b></td>
<td style="padding-left:10px;">The source directory path</td>
</tr>
<tr>
<td><b>Destination:</b></td>
<td style="padding-left:10px;">Destination directory path</td>
</tr>
<tr>
<td><b>File:</b></td>
<td style="padding-left:10px;">Files or file types to be copied (e.g. &#8220;*.txt&#8221; to copy files with the &#8220;.txt&#8221; extension. Defaults to &#8220;*.*&#8221; if not specified.)</td>
</tr>
<tr>
<td><b>Options:</b></td>
<td style="padding-left:10px;">Options to be used during the copy operation.</td>
</tr>
</table>
<h4>Notable Options</h4>
<table style="border: 1px solid #dddddd;">
<tr style="border: 1px solid #dddddd;">
<td><b>/copy:DAT</b></td>
<td style="padding-left:10px;">This option tells Robocopy to copy the file <u>D</u>ata, <u>A</u>ttributes and <u>T</u>ime stamps. Depending on the scenario, Robocopy can also copy NTF<u>S</u> access control lists, <u>O</u>wner information and A<u>u</u>diting information (/copy:DATSOU)</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/dcopy:T</b></td>
<td style="padding-left:10px;">This option is used to copy directory time stamps</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/e</b></td>
<td style="padding-left:10px;">This option copies subfolders, including empty ones</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/r:3</b></td>
<td style="padding-left:10px;">This is for Robocopy to retry 3 times in the event of a failed copy</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/w:2</b></td>
<td style="padding-left:10px;">This is to wait for 2 seconds between each retry attempt</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/xj</b></td>
<td style="padding-left:10px;">Excludes junction points (see section below)</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/ndl</b></td>
<td style="padding-left:10px;">Prevents directory names from being logged</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/np</b></td>
<td style="padding-left:10px;">Prevents the progress information from being displayed or logged</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/tee</b></td>
<td style="padding-left:10px;">Writes the status output to the console window in addition to the log file</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/ts</b></td>
<td style="padding-left:10px;">Includes source file time stamps in the log</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/unilog+:[log file path]</b></td>
<td style="padding-left:10px;">Keeps a written log of the copy operation in Unicode (appends the output to the existing log file). This should be preferred to the &#8220;/log+&#8221; switch if there is a possibility that folder/file names can contain non-ANSI characters</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/maxage:yyyymmdd</b></td>
<td style="padding-left:10px;">Excludes files older than date by last modification date</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/minage:yyyymmdd</b></td>
<td style="padding-left:10px;">Excludes files newer than date by last modification date</td>
</tr>
<tr style="border: 1px solid #dddddd;">
<td><b>/mt:N</b></td>
<td style="padding-left:10px;">Performs multi-threaded copy operation using N (1-128) threads. Applies only to Windows Server 2008 R2 and Windows 7</td>
</tr>
</table>
<p>The &#8220;/ndl&#8221; and &#8220;/np&#8221; switches are used here to control how the log file is formatted so that a file listing with full directory paths can be obtained. This listing can be fed into other scripts or software for further processing, such as generating MD5 hash values.</p>
<p><i>For a full list of options and more detailed information, visit <a href="http://technet.microsoft.com/en-us/library/cc733145%28v=ws.10%29.aspx" title="Robocopy" target="_blank" rel="nofollow">http://technet.microsoft.com/en-us/library/cc733145%28v=ws.10%29.aspx</a></i></p>
<h4>Sample log file</h4>
<pre><code class="code" style="font-size:11px;">-----------------------------------------------------------------------------
 ROBOCOPY     ::     Robust File Copy for Windows     ::     Version XP026
-----------------------------------------------------------------------------

Started : Mon Mar 05 14:39:25 2012

 Source : D:\MySourceFiles\
   Dest : E:\MyDestination\

  Files : *.*

Options : *.* /TS /NDL /TEE /S /E /COPY:DAT /DCOPY:T /NP /R:3 /W:2 

----------------------------------------------------------------------------

   New File  		      19 2012/03/05 20:44:40	D:\MySourceFiles\File1.txt
   New File  		      21 2012/03/05 20:44:43	D:\MySourceFiles\File2_日本語.txt
   New File  		       7 2012/03/05 20:44:34	D:\MySourceFiles\Folder1\File3.txt

----------------------------------------------------------------------------

              Total    Copied   Skipped  Mismatch    FAILED    Extras
   Dirs :         2         1         1         0         0         0
  Files :         3         3         0         0         0         0
  Bytes :        47        47         0         0         0         0
  Times :   0:00:00   0:00:00                       0:00:00   0:00:00

  Ended : Mon Mar 05 14:39:25 2012</code></pre>
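<p>As an example of feeding this listing into a script, the following Python sketch pulls the file paths, sizes and time stamps out of the &#8220;New File&#8221; lines. It assumes the log layout shown in the sample above, with sizes printed as plain byte counts; the function name is hypothetical and other Robocopy versions or options may format lines differently.</p>

```python
import re

# "New File", the size in bytes, the time stamp and the full path, separated by whitespace
NEW_FILE_LINE = re.compile(
    r"^\s*New File\s+(\d+)\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+(.+?)\s*$"
)

def copied_files(log_text: str) -> list[tuple[str, int, str]]:
    """Return (path, size in bytes, time stamp) for each "New File" line in a Robocopy log."""
    results = []
    for line in log_text.splitlines():
        match = NEW_FILE_LINE.match(line)
        if match:
            size, stamp, path = match.groups()
            results.append((path, int(size), stamp))
    return results

# Two lines taken from the sample log above, plus a summary line that should be ignored
sample = (
    "   New File  \t\t      19 2012/03/05 20:44:40\tD:\\MySourceFiles\\File1.txt\n"
    "   New File  \t\t       7 2012/03/05 20:44:34\tD:\\MySourceFiles\\Folder1\\File3.txt\n"
    "  Files :         3         3         0         0         0         0\n"
)

for entry in copied_files(sample):
    print(entry)
```

<p>When reading a log file written with the &#8220;/unilog&#8221; switch, the file would most likely need to be opened as UTF-16 (for example with <i>open(path, encoding="utf-16")</i>); we state this as an assumption to verify against your own log files.</p>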
<h4>Examples</h4>
<p>The following command would copy all files/folders from the file path &#8220;D:\MySourceFiles&#8221; to the file path &#8220;E:\MyDestination&#8221; and create a copy log at &#8220;E:\CopyLogs\MyCopyLog.log&#8221;:</p>
<code class="code">robocopy "D:\MySourceFiles" "E:\MyDestination" /copy:DAT /dcopy:T /e /r:3 /w:2 /ndl /np /tee /ts /unilog+:"E:\CopyLogs\MyCopyLog.log"</code>
<p>The following command would copy only the files with &#8220;.txt&#8221;, &#8220;.jpg&#8221; and &#8220;.tif&#8221; extensions:</p>
<code class="code">robocopy "D:\MySourceFiles" "E:\MyDestination" *.txt *.jpg *.tif /copy:DAT /dcopy:T /e /r:3 /w:2 /ndl /np /tee /ts /unilog+:"E:\CopyLogs\MyCopyLog.log"</code>
<p>The following command would copy only the files with &#8220;.txt&#8221;, &#8220;.jpg&#8221; and &#8220;.tif&#8221; extensions that have last modification dates within the 02/01/2010 &#8211; 04/30/2010 date range (not inclusive):</p>
<code class="code">robocopy "D:\MySourceFiles" "E:\MyDestination" *.txt *.jpg *.tif /copy:DAT /dcopy:T /MAXAGE:20100201 /MINAGE:20100430 /e /r:3 /w:2 /ndl /np /tee /ts /unilog+:"E:\CopyLogs\MyCopyLog.log"</code>
<h4>Windows Vista, Windows 7 and NTFS Junction Points</h4>
<p>NTFS Junction points are a feature of the New Technology File System (NTFS) and allow symbolic links to a directory to be created. These symbolic links act as an alias of that directory. </p>
<p>Starting with Windows Vista, Microsoft changed the way certain critical folders were stored on the hard drive. For backward compatibility, the old folder names were also retained as junction points. For example, the &#8220;C:\Documents and Settings&#8221; location does not actually exist in a Windows Vista system, but points to the actual &#8220;C:\Users&#8221; folder.</p>
<p>In certain instances, a junction point can redirect to a parent folder, causing Robocopy to fall into an infinite loop. To prevent this from happening, you can use the &#8220;/XJ&#8221; switch to prevent Robocopy from parsing NTFS junction points.</p>
<p>The following command would copy all files/folders from the root of the C: drive of a Windows 7 system to the file path &#8220;E:\MyDestination&#8221; and create a copy log at &#8220;E:\CopyLogs\MyCopyLog.log&#8221;. Note that the source is written as C:\ because &#8220;C:&#8221; without a trailing backslash refers to the current directory on that drive rather than its root:</p>
<code class="code">robocopy C:\ "E:\MyDestination" /copy:DAT /dcopy:T /e /r:3 /w:2 /ndl /np /tee /ts /xj /unilog+:"E:\CopyLogs\MyCopyLog.log"</code>
<p style="font-size:10px;"><i>Robocopy is a trademark of Microsoft. Windows is a registered trademark of Microsoft. Other products or services may be trademarks or registered trademarks of their respective companies.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/software/robocopy-in-ediscovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Concordance Field Naming Requirements</title>
		<link>http://www.meridiandiscovery.com/software/concordance-field-naming-requirements/</link>
		<comments>http://www.meridiandiscovery.com/software/concordance-field-naming-requirements/#comments</comments>
		<pubDate>Mon, 27 Feb 2012 19:16:20 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Concordance]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=1020</guid>
		<description><![CDATA[Subsequent to releasing our Concordance&#174; load file (.DAT) to Concordance Database (.DCB) CPL last week (original post), we have received a few questions regarding Concordance field naming conventions. Concordance field types and field naming requirements are as follows: Concordance Field Naming Requirements Field names have to: As of this writing, a Concordance database can contain ...]]></description>
			<content:encoded><![CDATA[<p>Subsequent to releasing our Concordance<sup style="font-size:8px;">&reg;</sup> load file (.DAT) to Concordance Database (.DCB) CPL last week (<a href="http://www.meridiandiscovery.com/software/concordance-cpl-to-create-database-dcb-from-load-file/" title="Concordance CPL to Create Database (DCB) from Load File" target="_blank">original post</a>), we have received a few questions regarding Concordance field naming conventions. Concordance field types and field naming requirements are as follows:<br />
<br/></p>
<h3>Concordance Field Naming Requirements</h3>
<p>Field names have to:</p>
<ul class="list2 list_color_blue">
<li>Be no more than 12 characters long</li>
<li>Start with a letter</li>
<li>Continue with letters, numbers or the underscore (&#8220;_&#8221;) character in the middle</li>
</ul>
<p>As of this writing, a Concordance database can contain no more than 250 fields.<br />
<br/></p>
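<p>The rules above can be checked before a database is created. The following Python sketch is one possible way to do so; it is not the check performed by Concordance or by our CPL. It assumes that &#8220;letters&#8221; means the ASCII letters A to Z in either case and that an underscore is also acceptable as the last character, neither of which is spelled out above. The names <code>FIELD_NAME</code>, <code>invalid_field_names</code> and <code>check_structure</code> are hypothetical.</p>

```python
import re

# First character a letter, then up to 11 letters, digits or underscores
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,11}")
MAX_FIELDS = 250


def invalid_field_names(names):
    """Return the names that break the naming rules listed above."""
    return [n for n in names if not FIELD_NAME.fullmatch(n)]


def check_structure(names):
    """Return a list of readable problems; an empty list means none found."""
    problems = ["bad field name: " + n for n in invalid_field_names(names)]
    if len(names) > MAX_FIELDS:
        problems.append("more than 250 fields")
    return problems


print(check_structure(["DOCID", "DATE SENT", "1STPAGE", "ATTACHMENT_RANGE"]))
```

<p>In this example, &#8220;DATE SENT&#8221; contains a space, &#8220;1STPAGE&#8221; starts with a digit and &#8220;ATTACHMENT_RANGE&#8221; is longer than 12 characters, so only &#8220;DOCID&#8221; passes.</p>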
<h3>Concordance Field Types</h3>
<div class="table_style">
<table>
<thead>
<tr>
<th scope="col"><b>Field Type</b></th>
<th scope="col"><b>Capacity</b></th>
<th scope="col"><b>Comments</b></th>
</tr>
</thead>
<tr>
<td>Paragraph</td>
<td>12 million characters</td>
<td>Indexed by default, allows rich text</td>
</tr>
<tr>
<td>Text</td>
<td>Up to 60 characters</td>
<td>Keyed, but not indexed by default</td>
</tr>
<tr>
<td>Numeric</td>
<td>Up to 20 digits</td>
<td>Keyed by default</td>
</tr>
<tr>
<td>Date</td>
<td>8 bytes</td>
<td>Keyed by default</td>
</tr>
</table>
</div>
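<p>When deciding which field type to use for incoming data, the capacities in the table can be turned into a simple lookup. This Python sketch is only an illustration: the names <code>CAPACITY</code> and <code>fits</code> are hypothetical, the Date type is left out because its size is given in bytes rather than characters, and counting only the digits of a numeric value is an assumption.</p>

```python
# Limits taken from the table above (characters, or digits for Numeric)
CAPACITY = {"Paragraph": 12_000_000, "Text": 60, "Numeric": 20}


def fits(field_type, value):
    """Return True if `value` is within the listed capacity of `field_type`."""
    limit = CAPACITY[field_type]
    if field_type == "Numeric":
        # Assumption: only digits count toward the limit, not sign or point
        return limit >= sum(ch.isdigit() for ch in value)
    return limit >= len(value)


subject = "Re: Quarterly figures and the revised production schedule for custodian review"
print(fits("Text", subject))       # longer than 60 characters
print(fits("Paragraph", subject))
```

<p>A value that does not fit in a Text field, such as the long subject line here, would need a Paragraph field instead.</p>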
<p style="font-size:10px;"><i>Concordance is a registered trademark of LexisNexis. Other products or services may be trademarks or registered trademarks of their respective companies.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/software/concordance-field-naming-requirements/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Concordance CPL to Create Database (DCB) from Load File</title>
		<link>http://www.meridiandiscovery.com/software/concordance-cpl-to-create-database-dcb-from-load-file/</link>
		<comments>http://www.meridiandiscovery.com/software/concordance-cpl-to-create-database-dcb-from-load-file/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 22:22:39 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[Concordance]]></category>
		<category><![CDATA[CPL]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=962</guid>
		<description><![CDATA[LexisNexis Concordance&#174; is currently one of the most popular discovery management applications, and many service providers and legal departments deal with Concordance Load Files on a regular basis. In some cases, the Concordance load file is received without an accompanying database structure. Unless there is an existing database that the load file will be imported ...]]></description>
			<content:encoded><![CDATA[<p>LexisNexis Concordance<sup style="font-size:8px;">&reg;</sup> is currently one of the most popular discovery management applications, and many service providers and legal departments deal with Concordance Load Files on a regular basis. In some cases, the Concordance load file is received without an accompanying database structure. Unless there is an existing database that the load file will be imported into, the person performing the import has to create a Concordance Database (DCB).</p>
<p>The above scenario is usually not an issue as long as the load file contains only a few fields. However, manually creating an e-Discovery database to accommodate a 100+ field Concordance Load File can be a tedious task. To make things a bit easier, we have created a Concordance program called &#8220;Create_DCB&#8221; using the Concordance Programming Language (CPL). The program reads the header row of a Concordance load file (DAT), extracts the field names and creates a Concordance Database (DCB) with matching fields. It requires that your load file start with a header row that contains the field names.</p>
<p><u>Instructions:</u></p>
<ul class="list2 list_color_blue">
<li>Download the program using one of the hyperlinks below and save it to your computer.</li>
<li>Launch Concordance and start the program using the &#8220;File/Begin program&#8230;&#8221; menu item. You do not need to have a database open.</li>
<li>Choose the Concordance Load File that you would like to work with. Make sure it has a header row. The load file can be ANSI, UTF-8 or UTF-16 encoded depending on the Concordance version.</li>
<li>If your load file is not using the standard Concordance delimiters, you will be prompted to specify the ASCII codes of the delimiters.</li>
<li>The program will check each field name to make sure it complies with Concordance field name requirements and will prompt you if it runs into anything problematic.</li>
<li>Specify the destination path where the new database should be saved.</li>
<li>Finally, you will be taken to your new database in edit mode where you can make changes to the database structure. For example, you can change the types of certain fields or specify an Image field.</li>
</ul>
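<p>For readers who want to inspect a load file outside of Concordance, the header-reading step described above can be sketched in Python. This is not the CPL source of Create_DCB. It assumes the commonly used default delimiters, ASCII 20 between fields and ASCII 254 around each value, which are not stated in this article; the function name <code>read_field_names</code> is hypothetical.</p>

```python
FIELD_DELIMITER = chr(20)   # assumed default delimiter between fields
TEXT_QUALIFIER = chr(254)   # assumed default qualifier around each value


def read_field_names(header_line, delimiter=FIELD_DELIMITER, qualifier=TEXT_QUALIFIER):
    """Split a load file header row into a list of field names."""
    # Drop a Unicode byte order mark and the line ending, if present
    line = header_line.lstrip("\ufeff").rstrip("\r\n")
    return [part.strip(qualifier) for part in line.split(delimiter)]


# Build a sample header row instead of reading a real file
q, d = TEXT_QUALIFIER, FIELD_DELIMITER
header = "\ufeff" + d.join(q + name + q for name in ["DOCID", "CUSTODIAN", "DATE_SENT"]) + "\r\n"
print(read_field_names(header))
```

<p>If your load file uses other delimiters, pass them through the <code>delimiter</code> and <code>qualifier</code> arguments, much like the program prompts you for their ASCII codes. When reading a real file, the encoding you pass to <code>open()</code> has to match how the load file was saved.</p>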
<p>This program is available for download free of charge. Feel free to give it a try and let us know your thoughts.</p>
<div class="toggle">
<h4 class="toggle_title">View Change Log</h4>
<div class="toggle_content">
<h4>v1.27 (Concordance v9 Only)</h4>
<p><b>Released</b>: 03/22/2012</p>
<ul>
<li>Created database is now closed in Concordance v9 before the program exits. This was done to work around a Concordance v9 issue that prevents the end user from accessing certain Concordance menus when a database is opened programmatically via CPL. </li>
</ul>
</div>
</div>
<h4>For Concordance 8:</h4>
<p><i><span class="icon_text icon_download"><a href="http://www.meridiandiscovery.com/downloads/Create_DCB_v1.26.cpl+for+Concordance+v8">Create_DCB_v1.26.cpl for Concordance v8</a></span></i></p>
<h4>For Concordance 9:</h4>
<p><i><span class="icon_text icon_download"><a href="http://www.meridiandiscovery.com/downloads/Create_DCB_v1.27.cpl+for+Concordance+v9">Create_DCB_v1.27.cpl for Concordance v9</a></span></i></p>
<h4>For Concordance 10:</h4>
<p><i><span class="icon_text icon_download"><a href="http://www.meridiandiscovery.com/downloads/Create_DCB_v1.26.cpl+for+Concordance+v10">Create_DCB_v1.26.cpl for Concordance v10</a></span></i><br />
<br/></p>
<p style="font-size:10px;"><i>Concordance is a registered trademark of LexisNexis. Other products or services may be trademarks or registered trademarks of their respective companies.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/software/concordance-cpl-to-create-database-dcb-from-load-file/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Meridian Discovery Launches New Website</title>
		<link>http://www.meridiandiscovery.com/announcements/meridian-discovery-launches-new-website/</link>
		<comments>http://www.meridiandiscovery.com/announcements/meridian-discovery-launches-new-website/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 00:54:09 +0000</pubDate>
		<dc:creator>Arman Gungor</dc:creator>
				<category><![CDATA[Announcements]]></category>

		<guid isPermaLink="false">http://www.meridiandiscovery.com/?p=904</guid>
		<description><![CDATA[We decided to refresh our website a few months ago and the new version is finally live. We hope that you will find our new design not only aesthetically appealing, but also easy to navigate. Please feel free to contact us with any comments.]]></description>
			<content:encoded><![CDATA[<p>We decided to refresh our website a few months ago and the new version is finally live. We hope that you will find our new design not only aesthetically appealing, but also easy to navigate. Please feel free to contact us with any comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.meridiandiscovery.com/announcements/meridian-discovery-launches-new-website/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

