Most efficient way to pull specified rows from a 2-d array?


EDIT: Deleted my original answer since it was a misunderstanding of the question. Instead try:

ii = np.where((a[:,0] - b.reshape(-1,1)) == 0)[1]
c = a[ii,:]

What I'm doing is using broadcasting to subtract each element of b from the first column of a, and then searching the resulting array for zeros, which indicate a match. This should work, but be careful when comparing floats, especially if b is not an array of ints.
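To make the broadcasting step concrete, here is a minimal sketch on made-up data (the arrays `a` and `b` below are hypothetical; the question does not give sample values). It shows the intermediate difference array and why taking element `[1]` of the `np.where` result yields row indices into `a`.

```python
import numpy as np

# Hypothetical sample data: the first column of `a` holds the keys.
a = np.array([[10, 1.0],
              [20, 2.0],
              [30, 3.0],
              [40, 4.0]])
b = np.array([30, 10])

# a[:, 0] has shape (4,), b.reshape(-1, 1) has shape (2, 1).
# Broadcasting gives a (2, 4) array with diff[i, j] = a[j, 0] - b[i].
diff = a[:, 0] - b.reshape(-1, 1)

# np.where returns (row_indices, column_indices) in row-major order.
# The column indices are positions in `a`, listed in the order of `b`.
ii = np.where(diff == 0)[1]
c = a[ii, :]

print(ii)  # [2 0]
print(c)   # rows with keys 30 and 10, in that order
```

Note that the intermediate array has `len(b) * len(a)` elements, so memory use grows with the product of the two lengths.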

EDIT 2: Thanks to Sven's suggestion, you can try this slightly modified version instead:

ii = np.where(a[:,0] == b.reshape(-1,1))[1]
c = a[ii,:]

It's a bit faster than my original implementation.

EDIT 3: The fastest solution by far (~10x faster than Sven's second solution for large arrays) is:

c = a[np.searchsorted(a[:,0],b),:]

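This assumes that a[:,0] is sorted and that every value of b appears in a[:,0]. If a value of b is missing, np.searchsorted still returns an insertion index, so you silently get the wrong row (or an IndexError if the index falls past the end).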
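Since the searchsorted approach depends on those two assumptions, it may be worth checking them explicitly. The following is only a sketch: the helper name `rows_by_sorted_key` and the sample arrays are made up for illustration, and the checks add some overhead to the fast path.

```python
import numpy as np

def rows_by_sorted_key(a, b):
    """Return the rows of `a` whose first column matches each value in `b`.

    Hypothetical helper: assumes a[:, 0] is sorted ascending and that
    every value of b occurs in it, and raises ValueError otherwise.
    """
    keys = a[:, 0]
    if np.any(keys[:-1] > keys[1:]):
        raise ValueError("a[:, 0] must be sorted in ascending order")
    idx = np.searchsorted(keys, b)
    # searchsorted can return len(keys) for values beyond the last key,
    # so clip before comparing to avoid an out-of-bounds index.
    idx_clipped = np.minimum(idx, len(keys) - 1)
    if np.any(keys[idx_clipped] != b):
        raise ValueError("some values of b do not appear in a[:, 0]")
    return a[idx, :]

a = np.array([[10, 1.0],
              [20, 2.0],
              [30, 3.0],
              [40, 4.0]])

c = rows_by_sorted_key(a, np.array([30, 10]))
print(c)  # rows with keys 30 and 10, in the order given by b
```

Without the check, a lookup such as `a[np.searchsorted(a[:,0], [25]), :]` would return the row with key 30, because 2 is the insertion point for 25.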


A slightly more concise way to do this is

c = a[(a[:,0] == b[:,None]).any(0)]

The usual caveats for floating point comparisons apply.
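As an illustration of that caveat, the sketch below uses made-up float keys and swaps the `==` test for `np.isclose`, which compares within a tolerance. Whether the default tolerances of `np.isclose` suit your data is an assumption you would need to check.

```python
import numpy as np

# Hypothetical data: 0.1 + 0.2 is not exactly equal to 0.3 in floating point.
a = np.array([[0.1 + 0.2, 1.0],
              [0.5,       2.0],
              [0.7,       3.0]])
b = np.array([0.3, 0.7])

# Exact comparison misses the first row.
exact = (a[:, 0] == b[:, None]).any(0)

# Tolerance-based comparison (default rtol=1e-05, atol=1e-08) finds it.
close = np.isclose(a[:, 0], b[:, None]).any(0)

print(exact)  # [False False  True]
print(close)  # [ True False  True]

c = a[close]
```

Because the result is a boolean mask over `a`, the selected rows come back in the order they appear in `a`, not in the order of `b`.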

Edit: If b is not too small, the following slightly quirky solution performs better:

b.sort()
c = a[b[np.searchsorted(b, a[:, 0]) - len(b)] == a[:,0]]
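The quirky part is the `- len(b)`, so here is a step-by-step sketch on made-up data. It uses `np.sort` to work on a copy instead of sorting `b` in place, and compares the result against `np.isin` purely as a sanity check; both choices are mine, not part of the answer above.

```python
import numpy as np

# Hypothetical sample data.
a = np.array([[10, 1.0],
              [20, 2.0],
              [30, 3.0],
              [40, 4.0],
              [50, 5.0]])
b = np.array([30, 10])

b_sorted = np.sort(b)          # copy; b.sort() would modify b in place
keys = a[:, 0]

# Insertion points lie in 0..len(b); len(b) itself would be out of bounds.
pos = np.searchsorted(b_sorted, keys)

# Subtracting len(b) maps 0..len(b)-1 to the equivalent negative indices
# and maps len(b) to 0, so every index is valid. A key larger than all of
# b is then compared with b_sorted[0], which can never equal it.
candidates = b_sorted[pos - len(b_sorted)]

mask = candidates == keys
c = a[mask]

print(pos)   # [0 1 1 2 2]
print(mask)  # [ True False  True False False]
```

For comparison, `np.isin(a[:, 0], b)` produces the same mask on this data. As with the `.any(0)` version, the rows come back in the order of `a`.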