Which key class is suitable for secondary sort?

java sorting hadoop mapreduce

I kept running into this situation and got tired of writing custom composite key classes, so I wrote a generic Tuple class, which is a list of objects that can act as a composite key. The list may contain an arbitrary number of objects of Java primitive wrapper types. It implements WritableComparable. The source can be viewed here:

https://github.com/pranab/chombo/blob/master/src/main/java/org/chombo/util/Tuple.java
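
To illustrate the idea of a list-backed composite key, here is a minimal sketch in plain Java. It is not the chombo Tuple class: `SimpleTuple` is a hypothetical name, it has no Hadoop dependency, and it only shows the field-by-field comparison. A real Hadoop key would also need the `write` and `readFields` methods that WritableComparable requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified composite key: an ordered list of comparable fields.
final class SimpleTuple implements Comparable<SimpleTuple> {
    private final List<Comparable<?>> fields;

    SimpleTuple(Comparable<?>... values) {
        this.fields = new ArrayList<>(Arrays.asList(values));
    }

    Object get(int index) {
        return fields.get(index);
    }

    int size() {
        return fields.size();
    }

    @Override
    @SuppressWarnings({"unchecked", "rawtypes"})
    public int compareTo(SimpleTuple other) {
        int common = Math.min(fields.size(), other.fields.size());
        for (int i = 0; i < common; i++) {
            // Assumption: both tuples hold the same type at each position.
            int cmp = ((Comparable) fields.get(i)).compareTo(other.fields.get(i));
            if (cmp != 0) {
                return cmp;
            }
        }
        // If all shared positions are equal, the shorter tuple sorts first.
        return Integer.compare(fields.size(), other.fields.size());
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SimpleTuple && fields.equals(((SimpleTuple) o).fields);
    }

    @Override
    public int hashCode() {
        return fields.hashCode();
    }

    @Override
    public String toString() {
        return fields.toString();
    }
}
```

With this sketch, `new SimpleTuple("user1", 5)` sorts before `new SimpleTuple("user1", 9)`, which in turn sorts before `new SimpleTuple("user2", 1)`, because the fields are compared left to right.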

I don't fully understand the question, but I do have a working SecondarySort example, which prints the max value from the list of values:

https://github.com/kapild/hadoop-examples/tree/master/src/SecondarySort

You need to change the way keys are partitioned and grouped. This basically means that you put more than one data type in the key, while overriding the comparison logic used for partitioning and grouping.

- You can serialize/deserialize your keys and treat the input data as objects or beans if you want strongly typed, robust code for secondary sorting.

- For simpler scenarios, just put a "#" sign between the values!
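
The partitioning and grouping idea above can be sketched in plain Java without any Hadoop dependency. This is a simulation under stated assumptions, not Hadoop's API: `CompositeKey`, `SecondarySortSketch`, `partition`, and `shuffle` are hypothetical names. The sort order uses the whole key, while partitioning and grouping look only at the natural key, so each group receives its values already sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical composite key: the natural key plus the value to sort by.
final class CompositeKey {
    final String naturalKey;
    final int value;

    CompositeKey(String naturalKey, int value) {
        this.naturalKey = naturalKey;
        this.value = value;
    }
}

final class SecondarySortSketch {
    // Sort order: natural key first, then the secondary value.
    static final Comparator<CompositeKey> SORT =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.value);

    // Grouping order: natural key only, so all values of one key share a group.
    static final Comparator<CompositeKey> GROUP =
            Comparator.comparing((CompositeKey k) -> k.naturalKey);

    // Partitioning: natural key only, so one key never spans two partitions.
    static int partition(CompositeKey key, int numPartitions) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Simulates sort-then-group for the keys of a single partition.
    static Map<String, List<Integer>> shuffle(List<CompositeKey> keys) {
        List<CompositeKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        CompositeKey previous = null;
        List<Integer> current = null;
        for (CompositeKey key : sorted) {
            // Start a new group whenever the natural key changes.
            if (previous == null || GROUP.compare(previous, key) != 0) {
                current = new ArrayList<>();
                groups.put(key.naturalKey, current);
            }
            current.add(key.value);
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = Arrays.asList(
                new CompositeKey("b", 3), new CompositeKey("a", 7),
                new CompositeKey("a", 2), new CompositeKey("b", 1));
        for (Map.Entry<String, List<Integer>> e : shuffle(mapOutput).entrySet()) {
            List<Integer> values = e.getValue();
            // Values arrive in ascending order, so the last one is the max.
            System.out.println(e.getKey() + " max=" + values.get(values.size() - 1));
        }
    }
}
```

In an actual Hadoop job these three pieces would correspond to a sort comparator, a grouping comparator, and a partitioner configured on the job; the sketch only shows why each one looks at a different part of the key.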

There is a great high-level article on this here:

http://pkghosh.wordpress.com/2011/04/13/map-reduce-secondary-sort-does-it-all/
