Word Last 10 Authors Metadata in Computer Forensics

November 6, 2014

Microsoft Office documents typically contain a great deal of metadata, some of which can be instrumental in computer forensics. While e-Discovery and computer forensics software can extract and display most of it, I found that one crucial piece of information is usually not extracted: the Microsoft Word last 10 authors metadata, also known as the Word save history.

What is Word Last 10 Authors?

Certain versions of Microsoft Word, such as Word 8.0 (Word 97) through Word 10.0 (Word 2002), store the names of the last 10 authors who saved the document along with the file location of each save. This information is not displayed to the end user through the Microsoft Word user interface, and according to the Microsoft Support website, this is an automatic feature that cannot be disabled (see WD97: How to Minimize Metadata in Microsoft Word Documents [KB 223790]). The following is an example of what may be found in the Word last 10 authors metadata (labels and numbers added for clarity; test data used for demonstration purposes):

1 – Author: johnd Path: D:\Documents and Settings\mdd.LAB\Desktop\Sample_v2.doc
2 – Author: johnd Path: D:\DOCUME~1\mdd.LAB\LOCALS~1\Temp\AutoRecovery save of Sample_v2.asd
3 – Author: johnd Path: D:\Documents and Settings\mdd.LAB\Desktop\Sample_v2.doc
4 – Author: johnd Path: D:\Documents and Settings\mdd.LAB\Desktop\Sample_v2.doc
5 – Author: jdoe Path: C:\WINDOWS\DESKTOP\Sample_v2.doc
6 – Author: jdoe Path: C:\WINDOWS\DESKTOP\Sample_v3.doc
7 – Author: jdoe Path: C:\WINDOWS\DESKTOP\Sample_v3.doc
8 – Author: jdoe Path: C:\WINDOWS\DESKTOP\Sample_v3.doc
9 – Author: jdoe Path: C:\WINDOWS\DESKTOP\Sample_v3.doc
10 – Author: jwhite Path: C:\WINDOWS\DESKTOP\Sample_v3.doc

As you can imagine, sending out a document with such a revision log can sometimes be problematic (see Richard M. Smith’s posts on the Blair Document and Microsoft’s 1999 Annual Report; the original links appear to be dead at this point). On the other hand, such information can be a gold mine for a computer forensics expert.

Extracting Word Last 10 Authors Metadata

Word documents containing Word last 10 authors metadata are Object Linking and Embedding (OLE) compound files as specified by the Microsoft Compound File Binary File Format (CFB). Two Microsoft specifications, [MS-CFB] and [MS-DOC], outline the Compound File Binary File Format and the Word Binary File Format, respectively.

Briefly, extracting the Word last 10 authors metadata requires locating the File Information Block (FIB) and reading the fcSttbSavedBy, lcbSttbSavedBy and fWhichTblStm values. The fcSttbSavedBy and lcbSttbSavedBy values specify the offset in the Table Stream where the SttbSavedBy structure — containing the save history of the file — is located and the size of the SttbSavedBy structure, while the fWhichTblStm bit indicates the Table Stream the FIB is referring to. Depending on the value of the fWhichTblStm bit, the 0Table or 1Table Stream is read and the SttbSavedBy structure is extracted using the fcSttbSavedBy and lcbSttbSavedBy values.
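The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the script offered for download below. The function names locate_saved_by and read_saved_by_bytes are hypothetical, and the FIB layout used here (flags word at offset 0x0A, fWhichTblStm mask 0x0200, and the fcSttbSavedBy/lcbSttbSavedBy pair at index 71 of the FibRgFcLcb97 block) comes from a reading of the [MS-DOC] specification and should be verified against it before being relied upon.

```python
import struct

FLAGS_OFFSET = 0x0A          # assumed offset of the FibBase flags word
F_WHICH_TBL_STM = 0x0200     # assumed mask of the fWhichTblStm bit
SAVED_BY_INDEX = 71          # assumed index of the fc/lcb pair for SttbSavedBy


def locate_saved_by(fib):
    """Return (table stream name, fcSttbSavedBy, lcbSttbSavedBy) from FIB bytes."""
    (flags,) = struct.unpack_from("<H", fib, FLAGS_OFFSET)
    table_name = "1Table" if flags & F_WHICH_TBL_STM else "0Table"

    # Walk the variable-length FIB sections instead of hard-coding one offset.
    (csw,) = struct.unpack_from("<H", fib, 32)        # count of 16-bit values
    pos = 34 + 2 * csw
    (cslw,) = struct.unpack_from("<H", fib, pos)      # count of 32-bit values
    pos += 2 + 4 * cslw
    (cb_rg_fc_lcb,) = struct.unpack_from("<H", fib, pos)  # count of fc/lcb pairs
    if cb_rg_fc_lcb <= SAVED_BY_INDEX:
        raise ValueError("FIB is too short to contain fcSttbSavedBy")
    pos += 2 + 8 * SAVED_BY_INDEX

    fc, lcb = struct.unpack_from("<II", fib, pos)
    return table_name, fc, lcb


def read_saved_by_bytes(path):
    """Read the raw SttbSavedBy bytes from a .doc file using olefile."""
    import olefile  # imported here so locate_saved_by works without olefile

    ole = olefile.OleFileIO(path)
    try:
        fib = ole.openstream("WordDocument").read()
        table_name, fc, lcb = locate_saved_by(fib)
        table = ole.openstream(table_name).read()
        return table[fc:fc + lcb]
    finally:
        ole.close()
```

Reading csw, cslw and cbRgFcLcb from the file, rather than jumping to a fixed offset, is a design choice: with the usual Word 97 section sizes (14, 22) the pair lands at byte 722 of the FIB, but the walk fails loudly if a file carries a shorter FIB than expected.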

The SttbSavedBy structure is a string table (STTB structure) which contains string pairs indicating the name of the author who saved the document and the path and name of the saved file. Parsing the SttbSavedBy structure reveals the save history of the document, also known as Word last 10 authors metadata.
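As a rough sketch of that parsing step, the function below walks an extended STTB and pairs up the strings. It assumes the layout described in [MS-DOC] for this structure: a 0xFFFF marker, a 16-bit string count, a 16-bit extra-data size, and then length-prefixed UTF-16LE strings. The name parse_sttb_saved_by is hypothetical, and the input is expected to be the raw SttbSavedBy bytes already cut out of the Table Stream.

```python
import struct
from typing import List, Tuple


def parse_sttb_saved_by(data: bytes) -> List[Tuple[str, str]]:
    """Return (author, path) pairs from raw SttbSavedBy bytes."""
    f_extend, c_data, cb_extra = struct.unpack_from("<HHH", data, 0)
    if f_extend != 0xFFFF:
        # Assumption: SttbSavedBy always uses the extended (UTF-16) form.
        raise ValueError("unexpected STTB header; not an extended STTB")

    pos = 6
    strings = []
    for _ in range(c_data):
        (cch,) = struct.unpack_from("<H", data, pos)  # length in characters
        pos += 2
        strings.append(data[pos:pos + 2 * cch].decode("utf-16-le"))
        pos += 2 * cch + cb_extra                     # skip any extra data

    # Strings alternate: author name, then the path of the saved file.
    return list(zip(strings[0::2], strings[1::2]))
```

Each returned tuple corresponds to one entry of the save history, for example an author such as johnd together with the path the file was saved to.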

In order to do the parsing, I wrote a Python script which utilizes the olefile Python package to read the Table Stream. I also added an optional ‘-m’ switch which outputs olefile’s metadata dump. Feel free to give it a try and let me know your thoughts.

Download Link

Word Last 10 Authors Metadata Parser

If you prefer Perl, check out Harlan Carvey’s related post on MetaData and eDiscovery.

Conclusion

Some of the file types we regularly deal with can be very complex, and may contain hidden metadata. It is important for computer forensics experts to understand the underlying structure of the electronic evidence they are working with so that they can validate the results of the tools they are using as well as go beyond what the tools can accomplish.

Arman Gungor

About Arman Gungor

Arman Gungor is a certified computer forensic examiner (CCE) and an adept e-Discovery expert with over 21 years of computer and technology experience. Arman has been appointed by courts as a neutral computer forensics expert as well as a neutral e-Discovery consultant. His electrical engineering background gives him a deep understanding of how computer systems are designed and how they work.