mask a 2D numpy array based on values in one column


    import numpy as np

    a = np.array([[1, 5, 6],
                  [2, 4, 1],
                  [3, 1, 5]])

    np.ma.MaskedArray(a, mask=(np.ones_like(a)*(a[:,0]==1)).T)
    # Returns:
    # masked_array(data =
    #  [[-- -- --]
    #  [2 4 1]
    #  [3 1 5]],
    #              mask =
    #  [[ True  True  True]
    #  [False False False]
    #  [False False False]])


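You can create the desired mask with np.repeat, which repeats each row's test result once per column:

    mask = np.repeat(a[:, 0] == 1, a.shape[1])

The result is a flat boolean array of length a.size. np.ma.array reshapes a mask of matching size to the shape of the data, so the masked array is:

    masked_a = np.ma.array(a, mask=np.repeat(a[:, 0] == 1, a.shape[1]))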
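
As a quick sanity check of the np.repeat approach, the sketch below reshapes the flat mask explicitly and tries it on a non-square array. The 3x4 array and the names flat_mask and mask_2d are made up for illustration; they are not part of the original answer.

```python
import numpy as np

# Hypothetical non-square example (3 rows, 4 columns).
a = np.array([[1, 5, 6, 7],
              [2, 4, 1, 8],
              [1, 1, 5, 9]])

# One boolean per row, each repeated once per column: flat, length a.size.
flat_mask = np.repeat(a[:, 0] == 1, a.shape[1])

# Reshape to the data's shape so the mask can be inspected row by row.
mask_2d = flat_mask.reshape(a.shape)

masked_a = np.ma.array(a, mask=mask_2d)
print(masked_a)
```

Rows 0 and 2 start with 1, so every element of those rows is masked and only row 1 stays visible.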


You could simply create an empty (all-False) mask and combine it with the row condition using NumPy broadcasting (as @eumiro showed), but with the element-wise bitwise "or" operator | instead of a multiplication:

    >>> a = np.array([[1, 5, 6], [2, 4, 1], [3, 1, 5]])
    >>> mask = np.zeros(a.shape, bool) | (a[:, 0] == 1)[:, None]
    >>> np.ma.array(a, mask=mask)
    masked_array(data =
     [[-- -- --]
     [2 4 1]
     [3 1 5]],
                 mask =
     [[ True  True  True]
     [False False False]
     [False False False]],
           fill_value = 999999)

A bit more explanation, step by step:

    >>> # select first column
    >>> a[:, 0]
    array([1, 2, 3])
    >>> # where the first column is 1
    >>> a[:, 0] == 1
    array([ True, False, False], dtype=bool)
    >>> # added dimension so that it correctly broadcasts to the empty mask
    >>> (a[:, 0] == 1)[:, None]
    array([[ True],
           [False],
           [False]], dtype=bool)
    >>> # create the final mask
    >>> np.zeros(a.shape, bool) | (a[:, 0] == 1)[:, None]
    array([[ True,  True,  True],
           [False, False, False],
           [False, False, False]], dtype=bool)

One further advantage of this approach is that it avoids potentially expensive multiplications and np.repeat, so it should be quite fast.
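
For reuse, the broadcasting steps above could be wrapped in a small helper. This is only a sketch: the function name mask_rows_where and its column and value parameters are hypothetical, not something NumPy provides.

```python
import numpy as np


def mask_rows_where(a, column, value):
    """Return a masked array over `a` with every row masked where a[:, column] == value.

    Hypothetical helper for illustration; assumes `a` is a 2D array.
    """
    # Column vector of shape (n_rows, 1) so it broadcasts across all columns.
    row_hits = (a[:, column] == value)[:, None]
    # All-False mask OR-ed with the broadcast row condition.
    mask = np.zeros(a.shape, dtype=bool) | row_hits
    return np.ma.array(a, mask=mask)


a = np.array([[1, 5, 6],
              [2, 4, 1],
              [3, 1, 5]])

masked = mask_rows_where(a, column=0, value=1)
col_means = masked.mean(axis=0)  # column means over the unmasked rows only
print(col_means)
```

Because the first row is fully masked, reductions such as mean(axis=0) only see the remaining two rows.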