Java - Sort Strings like Windows Explorer Java - Sort Strings like Windows Explorer windows windows

Java - Sort Strings like Windows Explorer


This is my second try to answer this. I used http://www.interact-sw.co.uk/iangblog/2007/12/13/natural-sorting as a start. Unfortunatly I think I found there problems as well. But I think in my code these problems are correctly adressed.

Info: Windows Explorer uses the API function StrCmpLogicalW() function to do its sorting. There it is called natural sort order.

So here is my unterstanding of the WindowsExplorerSort - Algorithm:

  • Filenames are compared part wise. As for now I identified the following parts: numbers, '.', spaces and the rest.
  • Each number within the filename is considered for a possible number compare.
  • Numbers are compared as numbers but if they are equal, the longer base string comes first. This happens with leading zeros.
    • filename00.txt, filename0.txt
  • If one compares a number part with a non number part, it will be compared as text.
  • Text will be compared case insensitive.

This list is based partly on try and error. I increased the number of test filenames, to adress more of the in comments mentioned pitfalls and the result was checked against a Windows Explorer.

So here is the output of this:

filenamefilename 00filename 0filename 01filename.jpgfilename.txtfilename00.jpgfilename00a.jpgfilename00a.txtfilename0filename0.jpgfilename0a.txtfilename0b.jpgfilename0b1.jpgfilename0b02.jpgfilename0c.jpgfilename01.0hjh45-test.txtfilename01.0hjh46filename01.1hjh45.txtfilename01.hjh45.txtFilename01.jpgFilename1.jpgfilename2.hjh45.txtfilename2.jpgfilename03.jpgfilename3.jpg

The new comparator WindowsExplorerComparator splits the filename in the already mentioned parts and does a part wise comparing of two filenames. To be correct, the new comparator uses Strings as its input so one has to create an adaptor Comparator like

new Comparator<File>() {    private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();    @Override    public int compare(File o1, File o2) {;        return NATURAL_SORT.compare(o1.getName(), o2.getName());    }}

So here is the new Comparators source code and its test:

import java.io.File;import java.util.ArrayList;import java.util.Arrays;import java.util.Collections;import java.util.Comparator;import java.util.Iterator;import java.util.List;import java.util.regex.Matcher;import java.util.regex.Pattern;public class WindowsSorter {    public static void main(String args[]) {        //huge test data set ;)        List<File> filenames = Arrays.asList(new File[]{new File("Filename01.jpg"),            new File("filename"), new File("filename0"), new File("filename 0"),            new File("Filename1.jpg"), new File("filename.jpg"), new File("filename2.jpg"),             new File("filename03.jpg"), new File("filename3.jpg"), new File("filename00.jpg"),            new File("filename0.jpg"), new File("filename0b.jpg"), new File("filename0b1.jpg"),            new File("filename0b02.jpg"), new File("filename0c.jpg"), new File("filename00a.jpg"),            new File("filename.txt"), new File("filename00a.txt"), new File("filename0a.txt"),            new File("filename01.0hjh45-test.txt"), new File("filename01.0hjh46"),            new File("filename2.hjh45.txt"), new File("filename01.1hjh45.txt"),            new File("filename01.hjh45.txt"), new File("filename 01"),            new File("filename 00")});        //adaptor for comparing files        Collections.sort(filenames, new Comparator<File>() {            private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();            @Override            public int compare(File o1, File o2) {;                return NATURAL_SORT.compare(o1.getName(), o2.getName());            }        });        for (File f : filenames) {            System.out.println(f);        }    }    public static class WindowsExplorerComparator implements Comparator<String> {        private static final Pattern splitPattern = Pattern.compile("\\d+|\\.|\\s");        @Override        public int compare(String str1, String str2) {            Iterator<String> i1 = splitStringPreserveDelimiter(str1).iterator();            Iterator<String> i2 = splitStringPreserveDelimiter(str2).iterator();            while (true) {                //Til here all is equal.                if (!i1.hasNext() && !i2.hasNext()) {                    return 0;                }                //first has no more parts -> comes first                if (!i1.hasNext() && i2.hasNext()) {                    return -1;                }                //first has more parts than i2 -> comes after                if (i1.hasNext() && !i2.hasNext()) {                    return 1;                }                String data1 = i1.next();                String data2 = i2.next();                int result;                try {                    //If both datas are numbers, then compare numbers                    result = Long.compare(Long.valueOf(data1), Long.valueOf(data2));                    //If numbers are equal than longer comes first                    if (result == 0) {                        result = -Integer.compare(data1.length(), data2.length());                    }                } catch (NumberFormatException ex) {                    //compare text case insensitive                    result = data1.compareToIgnoreCase(data2);                }                if (result != 0) {                    return result;                }            }        }        private List<String> splitStringPreserveDelimiter(String str) {            Matcher matcher = splitPattern.matcher(str);            List<String> list = new ArrayList<String>();            int pos = 0;            while (matcher.find()) {                list.add(str.substring(pos, matcher.start()));                list.add(matcher.group());                pos = matcher.end();            }            list.add(str.substring(pos));            return list;        }    }}


If what you're sorting is or can be represented as a Collection of Files, you might want to take a look at the Apache Commons IO library NameFileComparator class. This provides several pre-built comparators that you can probably leverage to accomplish what you're looking for. For example, the NAME_INSENSITIVE_COMPARATOR should do what you want.

List<File> filenames = Arrays.asList(new File[] {        new File("Filename01.jpg"),         new File("Filename1.jpg"),         new File("filename.jpg"),        new File("filename2.jpg"),        new File("filename03.jpg"),        new File("filename3.jpg")});Collections.sort(filenames, NameFileComparator.NAME_INSENSITIVE_COMPARATOR);for (File f : filenames) {    System.out.println(f);}

Output:

filename.jpgFilename01.jpgfilename03.jpgFilename1.jpgfilename2.jpgfilename3.jpg


Switch the signs of the first -1 and 1 in the compare method:

if (Character.isDigit(ch1)){    result = Character.isDigit(ch2) ? compareNumbers() : 1;}else if (Character.isLetter(ch1)){    result = Character.isLetter(ch2) ? compareOther(true) : 1;}

These determine the ordering when the first string has a number but the second one does not, or the first one does not but the second one does.