Is it possible to remove duplicates from a sorted list in less than O(n) time? Is it possible to remove duplicates from a sorted list in less than O(n) time? arrays arrays

Is it possible to remove duplicates from a sorted list in less than O(n) time?


In general, no. Imagine a list of N duplicates. You would have to make N-1 removals, hence O(N).

If you specify a particular data structure with better than O(1) removal of elements, then there might better way for certain sorts of inputs.

Even if you can efficiently remove a range of elements in O(1), and it takes O(1) time to find a duplicate - imagine a list where there are N/2 pairs of duplicates. You'll still have to do N/2 searches and remove N/2 ranges, both of which are O(N).

(there's also a bit of ambiguity as the question title is 'remove duplicates', but the body is specific to removing one range)

If the list resulting from your sort has the following representation - each node has a value, and an occurrence count for that, then removing the duplications for one value will trivially set the count to 1 for that node. ( A skip list probably has similar characteristics, assuming a decent garbage collected environment where there's no cost to reclaiming memory), so that would be O(1) for one duplication. If you need to remove all duplicates from the list, it would still be O(N).


In general there is not, because you can always construct a case where you have O(n) (a list with no duplicates). If you start making assumptions on the data however (for instance that there are at most log n distinct elements), you may get something better (I'm not sure in this particular case though).

This does of course assume that you have some way of doing efficient "bulk removes", meaning that you can remove any range of equal elements in O(1), regardless of its size.


There cant be

as for comparing all the elements with the other we need to do n*(n-1) = n2-n comparisions...`