Getting the row index for a 2D NumPy array when multiple column values are known
    In [80]: a = np.array([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])

    In [81]: a
    Out[81]:
    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])
a==2 returns a boolean NumPy array showing where the condition is True:

    In [82]: a==2
    Out[82]:
    array([[False,  True, False],
           [False, False, False],
           [False, False, False]], dtype=bool)
You can find any columns where this is True by using np.any(..., axis=0):

    In [83]: np.any(a==2,axis=0)
    Out[83]: array([False,  True, False], dtype=bool)

    In [84]: np.any(a==5,axis=0)
    Out[84]: array([False,  True, False], dtype=bool)
You can find where both conditions are simultaneously true by using &:

    In [85]: np.any(a==2,axis=0) & np.any(a==5,axis=0)
    Out[85]: array([False,  True, False], dtype=bool)
Finally, you can find the index of the columns where the conditions are simultaneously True using np.where:

    In [86]: np.where(np.any(a==2,axis=0) & np.any(a==5,axis=0))
    Out[86]: (array([1]),)
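The steps above can be collected into one small function. This is only a sketch: the helper name columns_containing_all is made up for illustration and is not part of NumPy, and it assumes the values may appear anywhere within a column.

```python
import numpy as np

def columns_containing_all(a, values):
    """Return indices of the columns of `a` that contain every value in `values`.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    # One boolean mask per value: True for columns where that value occurs somewhere.
    masks = [np.any(a == v, axis=0) for v in values]
    # A column qualifies only if every per-value mask is True for it.
    combined = np.logical_and.reduce(masks)
    return np.flatnonzero(combined)

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
cols = columns_containing_all(a, [2, 5])
print(cols)  # [1]
```

np.flatnonzero returns the index array directly, whereas np.where with a single argument wraps it in a one-element tuple.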
Here are ways to handle conditions on columns or rows, inspired by the Zen of Python.
    In []: import this
    The Zen of Python, by Tim Peters

    Beautiful is better than ugly.
    Explicit is better than implicit.
    ...
So, following the second piece of advice:
a) conditions on column(s), applied to row(s):
    In []: a= arange(12).reshape(3, 4)

    In []: a
    Out[]:
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])

    In []: a[2, logical_and(1== a[0, :], 5== a[1, :])]+= 12

    In []: a
    Out[]:
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8, 21, 10, 11]])
b) conditions on row(s), applied to column(s):
    In []: a= a.T

    In []: a
    Out[]:
    array([[ 0,  4,  8],
           [ 1,  5, 21],
           [ 2,  6, 10],
           [ 3,  7, 11]])

    In []: a[logical_and(1== a[:, 0], 5== a[:, 1]), 2]+= 12

    In []: a
    Out[]:
    array([[ 0,  4,  8],
           [ 1,  5, 33],
           [ 2,  6, 10],
           [ 3,  7, 11]])
I hope this shows why it makes sense to always be explicit when accessing columns and rows. Code is typically read by people with various backgrounds.
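The sessions above call arange and logical_and without a prefix, which presumably relies on a pylab-style star import. A minimal standalone sketch of both cases using the np namespace, with the masks given names (col_mask, row_mask and b are names chosen here for illustration), might look like this:

```python
import numpy as np

# (a) Conditions on rows 0 and 1 select columns; the update goes to row 2.
a = np.arange(12).reshape(3, 4)
col_mask = np.logical_and(a[0, :] == 1, a[1, :] == 5)
a[2, col_mask] += 12   # only column 1 matches, so a[2, 1] becomes 9 + 12

# (b) Conditions on columns 0 and 1 select rows; the update goes to column 2.
b = np.arange(12).reshape(3, 4).T
row_mask = np.logical_and(b[:, 0] == 1, b[:, 1] == 5)
b[row_mask, 2] += 12   # only row 1 matches, so b[1, 2] becomes 9 + 12

print(a[2])     # [ 8 21 10 11]
print(b[:, 2])  # [ 8 21 10 11]
```

Naming the mask keeps the selection step and the update step separate, which is in the same spirit as writing out both indices explicitly.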
Doing np.where(np.any(a==2,axis=0) & np.any(a==5,axis=0)), as unutbu suggested, does not use the information that 2 is in the 0th column (a[0] here) and 5 is in the 1st (a[1]). So, for a = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]]), it will mistakenly return (array([0, 1]),), because both values occur somewhere in each of those two positions.
Instead, you can use np.where((a[0]==2) & (a[1]==5)) to get the correct result, (array([1]),).
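To make the difference concrete, here is a short sketch that runs both expressions on the array from this answer. The variable names anywhere and positional are just labels chosen for the example.

```python
import numpy as np

a = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]])

# Value-anywhere test: only asks whether 2 and 5 each occur somewhere along axis 0.
anywhere = np.where(np.any(a == 2, axis=0) & np.any(a == 5, axis=0))[0]

# Positional test: requires 2 in a[0] and 5 in a[1] at the same index.
positional = np.where((a[0] == 2) & (a[1] == 5))[0]

print(anywhere)    # [0 1]
print(positional)  # [1]
```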
Furthermore, if you want to edit the 2nd column of that particular row, you can skip the np.where and just reference it with a[2][(a[0]==2) & (a[1]==5)]. This also works for assignments, for example a[2][(a[0]==2) & (a[1]==5)] = 11.
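A brief sketch of that assignment follows. The chained form works because a[2] is basic indexing and returns a view of the original array; the single expression a[2, mask] should be equivalent and avoids relying on that. The names mask and b are introduced here only for the example.

```python
import numpy as np

a = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]])
mask = (a[0] == 2) & (a[1] == 5)   # [False, True, False]

# Chained form from the answer: a[2] is a view, so the write reaches `a`.
a[2][mask] = 11

# Same update written as one indexing expression on a fresh copy of the data.
b = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]])
b[2, mask] = 11

print(a[2])  # [ 7 11  9]
```

Note that chaining is only safe when the first index is a plain integer or slice. If the first index were itself a boolean or integer array, it would produce a copy and the assignment would not reach the original.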