MapReduce (Python) - How to sort reducer output for Top-N list?
I believe you can use the collections.Counter class here:
Example: (modified from your code)
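    #!/usr/bin/python
    import sys
    import collections

    counter = collections.Counter()

    # Each input line is "key<TAB>count"; sum the counts per key.
    for line in sys.stdin:
        k, v = line.strip().split("\t", 1)
        counter[k] += int(v)

    # Print the ten keys with the largest totals.
    print(counter.most_common(10))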
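To try the idea without running a full job, the same logic can be wrapped in a function and fed a few lines by hand. This is only a minimal sketch in Python 3: the function name `top_n` and the sample data are made up for illustration, and it assumes every input line is a key and an integer count separated by a single tab.

```python
import collections


def top_n(lines, n=10):
    """Sum the counts per key and return the n largest as (key, total) pairs."""
    counter = collections.Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t", 1)
        counter[key] += int(value)
    return counter.most_common(n)


# Hypothetical mapper output: the same key may appear more than once.
sample = ["apple\t3", "pear\t1", "apple\t2", "plum\t4", "pear\t1"]
for key, total in top_n(sample, n=2):
    print("%s\t%d" % (key, total))
```

In a real reducer you would pass `sys.stdin` instead of `sample`. Note that this keeps every distinct key in memory, so it presumably only suits jobs where the number of keys per reducer is manageable.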
The collections.Counter class implements this exact use case, along with many other common tasks that involve counting things and collecting statistics.
From the Python documentation (section 8.3.1, "Counter objects"): "A counter tool is provided to support convenient and rapid tallies."
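As a quick illustration of such tallies (a sketch of my own, not the example from the documentation), a Counter can be built directly from any iterable and queried afterwards. The word list here is invented sample data.

```python
from collections import Counter

words = ["red", "blue", "red", "green", "blue", "red"]
tally = Counter(words)           # count occurrences of each element

tally.update(["blue", "blue"])   # add more observations later

top_two = tally.most_common(2)   # the two largest counts, highest first
missing = tally["purple"]        # a missing key counts as 0, no KeyError

print(top_two)
print(missing)
```

Because missing keys default to zero, the `counter[k] += int(v)` pattern in the reducer works without first checking whether the key exists.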