Fastest way to zero out low values in array?


This is a typical job for NumPy, which is very fast for these kinds of operations:

    import numpy

    array_np = numpy.asarray(array)
    low_values_flags = array_np < lowValY  # Where values are low
    array_np[low_values_flags] = 0  # All low values set to 0

Now, if you only need the highCountX largest elements, you can even "forget" the small elements (instead of setting them to 0 and sorting them) and only sort the list of large elements:

    array_np = numpy.asarray(array)
    print(numpy.sort(array_np[array_np >= lowValY])[-highCountX:])

Of course, sorting the whole array if you only need a few elements might not be optimal. Depending on your needs, you might want to consider the standard heapq module.
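As a rough sketch of the heapq route: `heapq.nlargest` keeps only the requested number of elements in its heap, so the full array is never sorted. The function name `largest_above` and the sample data are made up for illustration; `low_val` and `count` stand in for `lowValY` and `highCountX`.

```python
import heapq

def largest_above(values, low_val, count):
    """Return the `count` largest values that are >= low_val, in ascending order."""
    # Generator expression: small values are skipped, never stored or sorted.
    candidates = (v for v in values if v >= low_val)
    # nlargest returns the results largest-first; reverse to get ascending
    # order, matching what the numpy.sort version produces.
    return heapq.nlargest(count, candidates)[::-1]

# Hypothetical sample data
data = [0.06, 0.25, 0.0, 0.15, 0.5, 0.0, 0.0, 0.04, 0.0, 0.0]
top_two = largest_above(data, low_val=0.1, count=2)
print(top_two)
```

This tends to pay off when `count` is small compared to the array length; for large `count`, a full sort is likely just as good.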


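    # Note: scipy.stats.threshold was deprecated in SciPy 0.17 and removed in 1.0,
    # so this only works with older SciPy releases.
    from scipy.stats import threshold
    thresholded = threshold(array, 0.5)  # values below 0.5 become 0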
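Where `scipy.stats.threshold` is not available, much the same effect can be had with plain NumPy. The following is a minimal sketch using `numpy.where`, not a drop-in replacement; the helper name `zero_below` and the sample data are hypothetical.

```python
import numpy as np

def zero_below(values, low_val):
    """Return a new array in which entries below low_val are replaced by 0."""
    arr = np.asarray(values, dtype=float)
    # np.where picks 0.0 where the condition holds and the original value elsewhere.
    return np.where(arr < low_val, 0.0, arr)

# Hypothetical sample data
data = [0.06, 0.25, 0.0, 0.15, 0.5, 0.04]
result = zero_below(data, 0.5)
print(result)
```

Unlike boolean-mask assignment, `numpy.where` builds a new array and leaves the input untouched.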

:)


There's a special MaskedArray class in NumPy that does exactly that. You can "mask" elements based on any condition. This represents your need better than assigning zeroes: NumPy operations ignore masked values when appropriate (for example, when computing the mean).

    >>> from numpy import ma
    >>> x = ma.array([.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0])
    >>> x1 = ma.masked_inside(x, 0, 0.1) # mask everything in 0..0.1 range
    >>> x1
    masked_array(data = [-- 0.25 -- 0.15 0.5 -- -- -- -- --],
             mask = [ True False True False False True True True True True],
       fill_value = 1e+20)
    >>> print(x1.filled(0)) # Fill with zeroes
    [ 0 0.25 0 0.15 0.5 0 0 0 0 0 ]
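To illustrate the point about operations ignoring masked values, here is a small sketch comparing the mean of the masked array with the mean of its zero-filled version. It reuses the sample data from the session above; the variable names `masked_mean` and `zeroed_mean` are only for illustration.

```python
from numpy import ma

x = ma.array([0.06, 0.25, 0.0, 0.15, 0.5, 0.0, 0.0, 0.04, 0.0, 0.0])
x1 = ma.masked_inside(x, 0, 0.1)  # mask everything in the 0..0.1 range

# Only the three unmasked values (0.25, 0.15, 0.5) take part in the mean.
masked_mean = x1.mean()

# After filling with zeroes, all ten entries count, which drags the mean down.
zeroed_mean = x1.filled(0).mean()

print(masked_mean)
print(zeroed_mean)
```

The masked mean comes out at roughly 0.3, the zero-filled one at roughly 0.09, so the two approaches are not interchangeable once statistics are involved.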

As an added benefit, masked arrays are well supported in the matplotlib visualisation library if you need this.

Docs on masked arrays in numpy