File names are stored as strings in almost every operating system and database management system. While this works well in most cases, it causes files whose names contain numerals to be sorted counterintuitively: because strings are compared character by character, "10" sorts before "2". For example, the contents of a folder containing 7 files with numeric suffixes would ordinarily look as follows:
Figure 1 – Files Sorted Alphabetically
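The behavior is easy to reproduce. The following is a minimal Python sketch of plain string sorting; the seven file names are hypothetical stand-ins chosen for illustration, not the names shown in the figure, and Python compares strings by code point, which may differ in detail from a given file system or database collation.

```python
# Hypothetical file names with numeric suffixes (not taken from the figure).
files = [
    "Exhibit 1.pdf", "Exhibit 2.pdf", "Exhibit 3.pdf", "Exhibit 10.pdf",
    "Exhibit 11.pdf", "Exhibit 20.pdf", "Exhibit 100.pdf",
]

# sorted() compares strings character by character, so "10" lands before "2".
alphabetical = sorted(files)

for name in alphabetical:
    print(name)
```

In this sketch "Exhibit 10.pdf" and "Exhibit 100.pdf" are listed directly after "Exhibit 1.pdf", ahead of "Exhibit 2.pdf".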
In scenarios where the order of the files is crucial (e.g. in the legal industry), end users typically pad the file names with zeros so that they are ordered correctly when sorted alphabetically. For example:
Figure 2 – Files with Zero Padding
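Zero padding can also be applied programmatically. The sketch below is one possible approach, not a prescribed tool: the helper name pad_numbers, the default width of 3 and the sample file names are all assumptions made for illustration. It pads only the part of the name before the extension, so an extension such as ".mp3" is left alone.

```python
import os
import re


def pad_numbers(name: str, width: int = 3) -> str:
    """Left-pad every run of digits in the file name's stem to `width` digits."""
    stem, ext = os.path.splitext(name)
    # zfill() adds leading zeros; runs already at least `width` long are unchanged.
    padded_stem = re.sub(r"\d+", lambda m: m.group().zfill(width), stem)
    return padded_stem + ext


# Hypothetical file names, used only to illustrate the effect.
files = ["Exhibit 1.pdf", "Exhibit 2.pdf", "Exhibit 10.pdf", "Exhibit 100.pdf"]

# After padding, a plain alphabetical sort matches the numeric order.
padded = sorted(pad_numbers(f) for f in files)
```

The width has to be at least as large as the longest number in the set; otherwise the longer numbers will still sort out of place.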
What is Windows Numerical Sort?
The Shell team at Microsoft eventually addressed this by implementing a new way of comparing Unicode strings that contain numerals (see StrCmpLogicalW). The change took effect after Windows 2000, so operating systems such as Windows Server 2003, Windows XP, Windows Vista and Windows 7 sort numerals in folder and file names according to their numeric value. For example, our sample folder would look as follows in Windows XP using Windows numerical sort:
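The general idea of a numerical sort can be approximated in a few lines of Python. This is only a sketch of the technique (split each name into text and digit runs, then compare the digit runs as numbers); it is not Microsoft's implementation and makes no attempt to reproduce the exact rules of StrCmpLogicalW. The function name natural_key and the file names are hypothetical.

```python
import re


def natural_key(name: str) -> list:
    """Build a sort key in which digit runs compare by numeric value."""
    # Splitting on a captured group keeps the digit runs; they always land at
    # odd positions, so text and numbers are never compared with each other.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


# Hypothetical file names, listed here in plain alphabetical order.
files = [
    "Exhibit 1.pdf", "Exhibit 10.pdf", "Exhibit 100.pdf", "Exhibit 11.pdf",
    "Exhibit 2.pdf", "Exhibit 20.pdf", "Exhibit 3.pdf",
]

# Sorting with the key orders the names by the numeric value of the suffix.
numerical = sorted(files, key=natural_key)
```

With this key the names come out as 1, 2, 3, 10, 11, 20, 100, which is the kind of ordering Windows numerical sort is meant to produce.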
Issues Associated with Windows Numerical Sort
While this seems logical and may be helpful to most people, we believe that it brings new issues, especially in the legal industry.
1. Compatibility with e-Discovery and Computer Forensics Software:
Imagine a lawyer organizing exhibits to be processed to TIFF, endorsed and produced. Looking at the files in Windows Explorer, he would naturally assume that the files will be processed in the order in which he sees them on his computer. However, computer forensics and e-discovery tools typically do not implement Microsoft’s sort algorithm, and treat file and folder names as plain strings while sorting. Consequently, the files would be processed and numbered in a different order than the attorney had anticipated. Had Windows sorted the files without any special handling, the attorney or litigation support team would have noticed the incorrect sort order and compensated for it by padding the file names correctly or applying a custom sort order.
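One way to catch this kind of mismatch before a production set is processed is to compare the two orderings directly. The sketch below is an illustration under stated assumptions: natural_key is a rough approximation of a numerical sort rather than StrCmpLogicalW itself, order_mismatches is a hypothetical helper, and the exhibit names are invented.

```python
import re


def natural_key(name: str) -> list:
    """Approximate numerical-sort key: digit runs compare by numeric value."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


def order_mismatches(names: list) -> list:
    """Return (position, string_sorted_name, numerically_sorted_name) tuples
    for every position at which the two orderings disagree."""
    by_string = sorted(names, key=str.casefold)
    by_number = sorted(names, key=natural_key)
    return [
        (i, s, n)
        for i, (s, n) in enumerate(zip(by_string, by_number))
        if s != n
    ]


# Hypothetical exhibit names: unpadded names disagree, padded names do not.
unpadded = ["Exhibit 1.pdf", "Exhibit 2.pdf", "Exhibit 10.pdf"]
padded = ["Exhibit 001.pdf", "Exhibit 002.pdf", "Exhibit 010.pdf"]

print(order_mismatches(unpadded))
print(order_mismatches(padded))
```

An empty result suggests that a tool sorting names as strings should process the files in the order a numerical sort displays them; a non-empty result points to the names that need padding or a custom sort order.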
2. Consistency within the Operating System:
Even though Windows Explorer takes advantage of the StrCmpLogicalW API and sorts files and folders with names containing numerals in a logical manner, other areas of the operating system (such as the command line interface) still use the traditional sort method, causing inconsistencies in the way files are displayed in different parts of the same operating system. Please see Figure 4 below for a comparison of how Windows Explorer and the Command Line Interface (CLI) display the same set of files.
3. Consistency among Operating Systems:
Microsoft’s proprietary sort algorithm does not match how files are displayed in other operating systems such as Linux and Mac OS. Furthermore, Microsoft has changed the StrCmpLogicalW API in different versions of its operating systems such as Windows XP, Windows Vista and Windows 7. Consequently, the way files are displayed in Windows Explorer varies slightly among Microsoft’s own operating systems.
How to Disable Windows Numerical Sort
Luckily, starting with Windows XP SP1, Microsoft has made available a registry setting that suppresses the use of the StrCmpLogicalW API, turning off Windows numerical sort and reverting Windows Explorer to treating file names as strings. The registry key is as follows:
The NoStrCmpLogical (DWORD) value should be set to 1 to prevent Windows XP and later versions from using Windows numerical sort. The Microsoft Support Website provides additional details about this issue. Please note that the change takes effect only after a restart or after logging off and back on. Remember to back up your registry before making any changes.