How can I ignore zeros when I take the median on columns of an array?


Masked arrays are always handy, but slow:

    In [14]: %timeit np.ma.median(y, axis=0).filled(0)
    1000 loops, best of 3: 1.73 ms per loop

    In [15]: %%timeit
        ...: ans = np.apply_along_axis(lambda v: np.median(v[v!=0]), 0, x)
        ...: ans[np.isnan(ans)] = 0.
        ...:
    1000 loops, best of 3: 402 µs per loop

    In [16]: ans = np.apply_along_axis(lambda v: np.median(v[v!=0]), 0, x)
        ...: ans[np.isnan(ans)] = 0.; ans
    Out[16]: array([ 9.,  9.,  9.,  0.])

Indexing with np.nonzero is slightly faster still:

    In [25]: %%timeit
        ...: ans = np.apply_along_axis(lambda v: np.median(v[np.nonzero(v)]), 0, x)
        ...: ans[np.isnan(ans)] = 0.
        ...:
    1000 loops, best of 3: 384 µs per loop
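If you'd rather not rely on patching NaNs afterwards (np.median on an empty slice returns nan and emits a warning), the same idea can be wrapped in a small helper. This is only a sketch: the names `nonzero_column_median` and `empty_value` are made up for illustration and are not part of NumPy.

```python
import numpy as np

def nonzero_column_median(arr, empty_value=0.0):
    """Median of the non-zero entries of each column.

    Columns with no non-zero entries get `empty_value` (hypothetical
    parameter, chosen here to match the 0 used in the answer above).
    """
    def _median(col):
        nz = col[np.nonzero(col)]
        # Skip np.median on an empty slice, which would return nan with a warning.
        return np.median(nz) if nz.size else empty_value

    return np.apply_along_axis(_median, 0, arr)

x = np.array([[10, 0, 10, 0], [1, 1, 0, 0], [9, 9, 9, 0], [0, 10, 1, 0]])
result = nonzero_column_median(x)
print(result)
```

For the sample array this should give 9 for the first three columns and 0 for the all-zero last column, same as the one-liner above.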


Mask the zeros and use np.ma.median(y, axis=0).filled(0) to get the medians of the columns. The filled(0) call turns the masked result for an all-zero column into 0.

    In [1]: x = np.array([[10, 0, 10, 0], [1, 1, 0, 0], [9, 9, 9, 0], [0, 10, 1, 0]])

    In [2]: y = np.ma.masked_where(x == 0, x)

    In [3]: x
    Out[3]:
    array([[10,  0, 10,  0],
           [ 1,  1,  0,  0],
           [ 9,  9,  9,  0],
           [ 0, 10,  1,  0]])

    In [4]: y
    Out[4]:
    masked_array(data =
     [[10 -- 10 --]
     [1 1 -- --]
     [9 9 9 --]
     [-- 10 1 --]],
                 mask =
     [[False  True False  True]
     [False False  True  True]
     [False False False  True]
     [ True False False  True]],
           fill_value = 999999)

    In [6]: np.median(x, axis=0)
    Out[6]: array([ 5.,  5.,  5.,  0.])

    In [7]: np.ma.median(y, axis=0).filled(0)
    Out[7]: array([ 9.,  9.,  9.,  0.])
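To see what filled(0) is actually doing in the session above, here is a short sketch that looks at the masked result before filling it. The variable names (`med`, `empty_cols`, `as_zero`, `as_nan`) are just for illustration, and using NaN as the fill is an alternative I'm suggesting, not something from the answer itself.

```python
import numpy as np

x = np.array([[10, 0, 10, 0], [1, 1, 0, 0], [9, 9, 9, 0], [0, 10, 1, 0]])
y = np.ma.masked_where(x == 0, x)

# Column medians as a masked array; a column with no non-zero
# entries has nothing to take a median of, so it stays masked.
med = np.ma.median(y, axis=0)

# Boolean array marking the columns that had no non-zero entries.
empty_cols = np.ma.getmaskarray(med)

as_zero = med.filled(0)       # what the session above does
as_nan = med.filled(np.nan)   # alternative: keep "no data" distinct from a real 0

print(empty_cols)
print(as_zero)
print(as_nan)
```

Which fill value to use depends on whether a 0 in the output should be read as "median is zero" or "no non-zero data in this column".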


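You can use masked arrays. Note that the example below takes the median over the whole array; pass axis=0 to np.ma.median to get one median per column.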

    a = np.array([[10, 0, 10, 0], [1, 1, 0, 0], [9, 9, 9, 0], [0, 10, 1, 0]])
    m = np.ma.masked_equal(a, 0)

    In [44]: np.median(a)
    Out[44]: 1.0

    In [45]: np.ma.median(m)
    Out[45]: 9.0

    In [46]: m
    Out[46]:
    masked_array(data =
     [[10 -- 10 --]
     [1 1 -- --]
     [9 9 9 --]
     [-- 10 1 --]],
                 mask =
     [[False  True False  True]
     [False False  True  True]
     [False False False  True]
     [ True False False  True]],
           fill_value = 0)
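Since the question asks about columns, here is a minimal sketch of the same masked_equal approach with axis=0. The names `overall` and `per_column` are mine, and filling the all-zero column with 0 is an assumption about what you want there.

```python
import numpy as np

a = np.array([[10, 0, 10, 0], [1, 1, 0, 0], [9, 9, 9, 0], [0, 10, 1, 0]])
m = np.ma.masked_equal(a, 0)   # mask every entry equal to 0

# Median of all non-zero entries in the whole array.
overall = np.ma.median(m)

# One median per column; the all-zero column is masked in the result,
# so fill it with 0 (assumed choice) to get a plain ndarray back.
per_column = np.ma.median(m, axis=0).filled(0)

print(overall)
print(per_column)
```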