uniquify an array/list with a tolerance in python (uniquetol equivalent)
With A as the input array and tol as the tolerance value, we could have a vectorized approach with NumPy broadcasting, like so -

A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Sample run -

In [20]: A = np.array([2.1, 1.3, 1.9, 1.1, 2.0, 2.5, 2.9])

In [21]: tol = 0.3

In [22]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[22]: array([ 2.1,  1.3,  2.5,  2.9])
Notice that 1.9 is gone because we had 2.1 within the tolerance of 0.3. Then, 1.1 is gone because of 1.3, and 2.0 because of 2.1.
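To unpack the one-liner, here is a step-by-step sketch using the same A and tol as the sample run (the intermediate names close, upper and mask are just for illustration):

```python
import numpy as np

A = np.array([2.1, 1.3, 1.9, 1.1, 2.0, 2.5, 2.9])
tol = 0.3

# Pairwise absolute differences via broadcasting: close[i, j] is True
# when A[i] and A[j] are within tol of each other. Shape (7, 7).
close = np.abs(A[:, None] - A) <= tol

# Keep only the strict upper triangle (k=1), so a True at [i, j]
# means an *earlier* element i is within tol of element j.
upper = np.triu(close, 1)

# Column j has any True -> element j is close to some earlier element
# and should be dropped.
mask = upper.any(0)

result = A[~mask]
print(result)  # -> [2.1 1.3 2.5 2.9]
```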
Please note that this would create a unique array with a "chained-closeness" check. As an example:

In [91]: A = np.array([ 1.1, 1.3, 1.5, 2. , 2.1, 2.2, 2.35, 2.5, 2.9])

In [92]: A[~(np.triu(np.abs(A[:,None] - A) <= tol,1)).any(0)]
Out[92]: array([ 1.1,  2. ,  2.9])
Thus, 1.3 is gone because of 1.1, and 1.5 is gone because of 1.3.
In pure Python 2, I wrote the following:
a = [1.1, 1.3, 1.9, 2.0, 2.5, 2.9]

# Per http://fr.mathworks.com/help/matlab/ref/uniquetol.html
tol = max(map(lambda x: abs(x), a)) * 0.3

a.sort()
results = [a.pop(0), ]
for i in a:
    # Skip items within tolerance.
    if abs(results[-1] - i) <= tol:
        continue
    results.append(i)
print a
print results
Which results in

[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 2.0, 2.9]
This is what the spec seems to call for, but it isn't consistent with your example.
If I just set the tolerance to 0.3 instead of max(map(lambda x: abs(x), a)) * 0.3, I get:

[1.3, 1.9, 2.0, 2.5, 2.9]
[1.1, 1.9, 2.5, 2.9]

...which is consistent with your example.
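For reference, the same fixed-tolerance logic can be sketched in Python 3 (the function name uniquetol_sorted is my own; the behaviour matches the snippet above with tol = 0.3):

```python
def uniquetol_sorted(values, tol):
    """Sort the input, then keep a value only if it is farther than
    tol from the last kept value (the "chained-closeness" behaviour)."""
    remaining = sorted(values)
    results = [remaining.pop(0)]
    for x in remaining:
        # Skip items within tolerance of the last kept value.
        if abs(results[-1] - x) <= tol:
            continue
        results.append(x)
    return results

print(uniquetol_sorted([1.1, 1.3, 1.9, 2.0, 2.5, 2.9], 0.3))
# -> [1.1, 1.9, 2.5, 2.9]
```

Note that sorting first makes the result independent of the input order, unlike the NumPy one-liner, which keeps the first occurrence in the original order.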