uniquify an array/list with a tolerance in python (uniquetol equivalent)


With A as the input array and tol as the tolerance value, we can use a vectorized approach based on NumPy broadcasting:

A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]

Sample run:

In [20]: A = np.array([2.1,  1.3 , 1.9 , 1.1 , 2.0 , 2.5 , 2.9])
In [21]: tol = 0.3
In [22]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[22]: array([ 2.1,  1.3,  2.5,  2.9])

Notice that 1.9 is gone because 2.1, which comes earlier, is within the tolerance of 0.3. Likewise, 1.1 is dropped because of 1.3, and 2.0 because of 2.1.
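To make the one-liner easier to follow, here is a minimal sketch that splits it into named steps. The function name unique_with_tolerance and the intermediate variable names are my own, chosen for illustration; the logic is meant to be the same as the expression above.

```python
import numpy as np

def unique_with_tolerance(A, tol):
    """Drop every element that lies within tol of any earlier element."""
    A = np.asarray(A, dtype=float)
    # close[i, j] is True when A[i] and A[j] differ by at most tol.
    close = np.abs(A[:, None] - A) <= tol
    # Keep only the pairs with i < j, so each element is compared
    # against the elements that come before it in the input.
    earlier_close = np.triu(close, 1)
    # Column j contains a True if some earlier element is within tol of A[j].
    is_duplicate = earlier_close.any(axis=0)
    return A[~is_duplicate]

A = np.array([2.1, 1.3, 1.9, 1.1, 2.0, 2.5, 2.9])
print(unique_with_tolerance(A, 0.3).tolist())  # [2.1, 1.3, 2.5, 2.9]
```

Note that the broadcasted comparison builds an n x n boolean matrix, so memory grows quadratically with the length of A.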

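Please note that this performs a "chained-closeness" check: each element is compared against every earlier element of the input, not only against the elements that end up being kept. As an example (with the same tol = 0.3):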

In [91]: A = np.array([ 1.1,  1.3,  1.5,  2. ,  2.1,  2.2, 2.35, 2.5,  2.9])
In [92]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[92]: array([ 1.1,  2. ,  2.9])

Thus, 1.3 is gone because of 1.1, and 1.5 is gone because of 1.3, although 1.5 is more than 0.3 away from 1.1.
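If the chaining is not wanted, one possible alternative (not part of the approach above, just a sketch) is a plain loop that compares each element only against the values already kept. The function name unique_with_tolerance_unchained is hypothetical.

```python
import numpy as np

def unique_with_tolerance_unchained(A, tol):
    """Keep an element only if it is more than tol away from every kept element."""
    A = np.asarray(A, dtype=float)
    kept = []
    for value in A:
        # Compare against kept values only, so dropped elements
        # cannot cause further elements to be dropped.
        if all(abs(value - k) > tol for k in kept):
            kept.append(value)
    return np.array(kept)

A = np.array([1.1, 1.3, 1.5, 2.0, 2.1, 2.2, 2.35, 2.5, 2.9])
print(unique_with_tolerance_unchained(A, 0.3).tolist())
# [1.1, 1.5, 2.0, 2.35, 2.9]
```

Here 1.5 survives because it is compared with 1.1 only, not with the dropped 1.3. The trade-off is a Python-level loop instead of a single vectorized expression.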


In pure Python 2, I wrote the following:

a = [1.1, 1.3, 1.9, 2.0, 2.5, 2.9]
# Per http://fr.mathworks.com/help/matlab/ref/uniquetol.html
tol = max(map(lambda x: abs(x), a)) * 0.3
a.sort()
results = [a.pop(0), ]
for i in a:
    # Skip items within tolerance.
    if abs(results[-1] - i) <= tol:
        continue
    results.append(i)
print a
print results

Which results in:

[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 2.0, 2.9]

This seems to agree with the spec, which scales the tolerance by the largest absolute value in the input, but it isn't consistent with your example.

If I just set the tolerance to 0.3 instead of max(map(lambda x: abs(x), a)) * 0.3, I get:

[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 1.9, 2.5, 2.9]

...which is consistent with your example.
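For reference, the same sort-and-skip idea can be sketched as a Python 3 function that supports both tolerance conventions. The name uniquetol_sorted and the relative flag are hypothetical, introduced only for this illustration; unlike the snippet above, it leaves the input list unmodified.

```python
def uniquetol_sorted(values, tol, relative=False):
    """Sort, then keep a value only if it is more than tol from the last kept value."""
    ordered = sorted(values)
    if not ordered:
        return []
    if relative:
        # Scale the tolerance by the largest magnitude in the input,
        # as the first variant above does.
        tol = tol * max(abs(v) for v in ordered)
    results = [ordered[0]]
    for v in ordered[1:]:
        # Skip items within tolerance of the last kept value.
        if abs(results[-1] - v) <= tol:
            continue
        results.append(v)
    return results

a = [1.1, 1.3, 1.9, 2.0, 2.5, 2.9]
print(uniquetol_sorted(a, 0.3))                 # [1.1, 1.9, 2.5, 2.9]
print(uniquetol_sorted(a, 0.3, relative=True))  # [1.1, 2.0, 2.9]
```

Because each value is compared with the last kept value rather than with its immediate neighbour, this version does not chain.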