
Sort order with Hadoop MapRed


If you are using the older API (mapred.*), then set the OutputKeyComparatorClass in the job conf:

jobConf.setOutputKeyComparatorClass(ReverseComparator.class);

ReverseComparator can be something like this:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
// Nested inside your job class; sorts Text keys in descending order.
static class ReverseComparator extends WritableComparator {
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();
    public ReverseComparator() {
        super(Text.class);
    }
    // Raw comparison on the serialized bytes: negate the natural Text order.
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -TEXT_COMPARATOR.compare(b1, s1, l1, b2, s2, l2);
    }
    // Object comparison: negate compareTo when both keys are Text.
    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        if (a instanceof Text && b instanceof Text) {
            return -((Text) a).compareTo((Text) b);
        }
        return super.compare(a, b);
    }
}
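If it helps to see what the raw compare(byte[], ...) overload is doing, here is a rough plain-Java sketch of the same idea with no Hadoop on the classpath. Everything in it (ReverseBytesSketch, compareBytes, compareDescending, sortDescending) is a made-up name for illustration, and it assumes the keys are bare UTF-8 bytes. A real serialized Text also carries a length prefix, which Text.Comparator skips for you, so treat this as a picture of the negation trick and not as a replacement for the comparator above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a raw-bytes reverse comparator; not Hadoop API.
final class ReverseBytesSketch {

    // Lexicographic comparison of two byte ranges, treating bytes as unsigned.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }

    // Same signature shape as the raw compare above, with the sign flipped.
    static int compareDescending(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -compareBytes(b1, s1, l1, b2, s2, l2);
    }

    // Encodes the keys as UTF-8 (an assumption), sorts the bytes in reverse, decodes again.
    static List<String> sortDescending(List<String> keys) {
        List<byte[]> raw = new ArrayList<>();
        for (String k : keys) {
            raw.add(k.getBytes(StandardCharsets.UTF_8));
        }
        raw.sort((a, b) -> compareDescending(a, 0, a.length, b, 0, b.length));
        List<String> out = new ArrayList<>();
        for (byte[] r : raw) {
            out.add(new String(r, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [cherry, banana, apple]
        System.out.println(sortDescending(Arrays.asList("apple", "cherry", "banana")));
    }
}
```

Working on the bytes directly is the reason the raw overload exists: the framework can order keys without deserializing them first.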

In the new API (mapreduce.*), use the Job.setSortComparatorClass() method instead.


This one is almost the same as the comparator above, but a bit simpler:

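class MyKeyComparator extends WritableComparator {
    // Passing true makes WritableComparator create key instances, so the
    // object-based compare below is called with deserialized keys.
    protected MyKeyComparator() {
        super(Text.class, true);
    }
    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
        Text key1 = (Text) w1;
        Text key2 = (Text) w2;
        // Negate the natural order to sort descending.
        return -1 * key1.compareTo(key2);
    }
}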

Then add it to the job:

job.setSortComparatorClass(MyKeyComparator.class);

The comparator assumes Text keys in these two lines:

Text key1 = (Text) w1;
Text key2 = (Text) w2;

Change Text here (and in the super(Text.class, true) call) to whatever key type your job uses.
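To check the descending logic for another key type without running a job, something like the following plain-Java sketch may help. It uses Integer and String as stand-ins for Writable keys and java.util sorting in place of the shuffle, so none of it is Hadoop API. DescendingKeySketch and its methods are hypothetical names. It swaps the operands where the comparator above multiplies by -1; the resulting order is the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a descending key comparator; not Hadoop API.
final class DescendingKeySketch {

    // Reverse of the natural order: compare k2 to k1 instead of k1 to k2.
    static <K extends Comparable<K>> int compareDescending(K k1, K k2) {
        return k2.compareTo(k1);
    }

    // Returns a sorted copy of the keys, largest first; the input list is left untouched.
    static <K extends Comparable<K>> List<K> sortDescending(List<K> keys) {
        List<K> copy = new ArrayList<>(keys);
        copy.sort((a, b) -> compareDescending(a, b));
        return copy;
    }

    public static void main(String[] args) {
        // prints [10, 7, 3]
        System.out.println(sortDescending(Arrays.asList(3, 10, 7)));
        // prints [c, b, a]
        System.out.println(sortDescending(Arrays.asList("b", "a", "c")));
    }
}
```

Swapping the operands also sidesteps the corner case where compareTo returns Integer.MIN_VALUE, which negation cannot flip.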