What is the cleanest way to do a sort plus uniq on a Python list?

What is the cleanest way to do a sort plus uniq on a Python list?


my_list = sorted(set(my_list))


# Python ≥ 2.4
# because of (generator expression) and itertools.groupby, sorted
import itertools

def sort_uniq(sequence):
    return (x[0] for x in itertools.groupby(sorted(sequence)))

Faster:

import itertools, operator
import sys

if sys.hexversion < 0x03000000:
    mapper = itertools.imap  # 2.4 ≤ Python < 3
else:
    mapper = map  # Python ≥ 3

def sort_uniq(sequence):
    return mapper(
        operator.itemgetter(0),
        itertools.groupby(sorted(sequence)))

Both versions return a lazy iterator rather than a list, so you might want to pass the result to list:

sequence = list(sort_uniq(sequence))

Note that, unlike sorted(set(...)), this will work with non-hashable items too, as long as they can be ordered:

>>> list(sort_uniq([[0],[1],[0]]))
[[0], [1]]
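
To make the difference concrete, here is a minimal sketch (written for Python 3) that tries both approaches on a list of lists. sort_uniq is redefined so the snippet runs on its own; the names data, set_worked and result are just illustrative.

```python
import itertools

def sort_uniq(sequence):
    # Sorting puts equal items next to each other; groupby then
    # yields one key per run of equal neighbours.
    return (key for key, _group in itertools.groupby(sorted(sequence)))

data = [[0], [1], [0]]

try:
    sorted(set(data))
    set_worked = True
except TypeError:
    # Lists are unhashable, so they cannot be placed in a set.
    set_worked = False

result = list(sort_uniq(data))
```

The set-based one-liner fails here because set() needs hashable elements, while the groupby version only needs the elements to be comparable with each other.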


The straightforward solution is provided by Ignacio—sorted(set(foo)).

If you have unique data, there's a reasonable chance you don't just want to do sorted(set(...)) but rather to store a set all the time and occasionally pull out a sorted version of the values. (At that point, it starts sounding like the sort of thing people often use a database for, too.)
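
A rough sketch of that pattern, assuming hashable values: keep the data in a set while it changes and sort only when an ordered view is needed. The names seen and snapshot are made up for the example.

```python
seen = set()

# Adding to a set silently ignores values that are already present.
for value in [3, 1, 2, 3, 1]:
    seen.add(value)

# Build a sorted list only at the point where order matters.
snapshot = sorted(seen)
```

Membership tests and inserts on the set are cheap on average, and the cost of sorting is paid only when a sorted copy is actually requested.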

If you have a sorted list and you want to check membership in logarithmic time and add an item in worst-case linear time, you can use the bisect module.
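
Something along these lines would do it. This is only a sketch: contains and add_unique are hypothetical helper names, not part of the bisect module, which provides just the bisect_left lookup used inside them.

```python
import bisect

def contains(sorted_list, item):
    # Binary search: O(log n) comparisons to find where item would go.
    i = bisect.bisect_left(sorted_list, item)
    return i < len(sorted_list) and sorted_list[i] == item

def add_unique(sorted_list, item):
    # Finding the slot is O(log n); list.insert shifts elements,
    # so the whole operation is O(n) in the worst case.
    i = bisect.bisect_left(sorted_list, item)
    if i == len(sorted_list) or sorted_list[i] != item:
        sorted_list.insert(i, item)

values = [1, 3, 5]
add_unique(values, 4)  # inserted between 3 and 5
add_unique(values, 3)  # already present, list is unchanged
```

The list stays sorted and duplicate-free after every call, so there is no need to re-run sorted(set(...)) on it.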

If you want to keep this condition all the time and you want to simplify things or make some operations perform better, you might consider blist.sortedset.