Find common elements in 2D numpy arrays


If you have these arrays:

    import numpy as np

    array1 = np.array([[1, 100.0, 0.0, 0.0], [2, 110.0, 0.0, 0.0], [3, 120.0, 0.0, 0.0]])
    array2 = np.array([[1, 101.0, 0.0, 0.0], [3, 119, 0.0, 0.0]])

As you said, you can use np.intersect1d to get the intersection; the only thing remaining is to index the arrays:

    intersect = np.intersect1d(array1[:, 0], array2[:, 0])
    array1_matches = array1[np.any(array1[:, 0] == intersect[:, None], axis=0)]
    array2_matches = array2[np.any(array2[:, 0] == intersect[:, None], axis=0)]

And then you can subtract them:

    >>> array1_matches - array2_matches
    array([[ 0., -1.,  0.,  0.],
           [ 0.,  1.,  0.,  0.]])

This assumes that your times are unique and sorted. If they are unsorted, you can sort both arrays by the time column first:

    >>> array1 = array1[np.argsort(array1[:, 0])]
    >>> array2 = array2[np.argsort(array2[:, 0])]

If the times are not unique, I have no idea how you want to handle that, so I can't advise you there.
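
As a possible alternative to the masking and sorting steps above, np.intersect1d also accepts return_indices=True (available since NumPy 1.15), which returns the positions of the common values in each input. The sketch below reuses the sample data from this answer but shuffles array1 on purpose, to show that the returned indices line the rows up without a separate sort. It still assumes the times are unique: with duplicates, only the first occurrence of each time is picked.

```python
import numpy as np

# Same sample data as above, but with array1 deliberately out of order.
array1 = np.array([[3, 120.0, 0.0, 0.0], [1, 100.0, 0.0, 0.0], [2, 110.0, 0.0, 0.0]])
array2 = np.array([[1, 101.0, 0.0, 0.0], [3, 119.0, 0.0, 0.0]])

# common holds the shared times in sorted order; idx1 and idx2 are the row
# positions of those times in array1 and array2 respectively.
common, idx1, idx2 = np.intersect1d(array1[:, 0], array2[:, 0], return_indices=True)

# Indexing with idx1/idx2 pairs up the matching rows before subtracting.
difference = array1[idx1] - array2[idx2]
print(difference)
```

Here difference has one row per common time, ordered by increasing time.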


You want to use numpy.in1d.

    array1 = array1[np.in1d(array1[:,0], array2[:,0], assume_unique=True)]
    array2 = array2[np.in1d(array2[:,0], array1[:,0], assume_unique=True)]

Or if you don't want to change your originals:

    array3 = array1[np.in1d(array1[:,0], array2[:,0], assume_unique=True)]
    array4 = array2[np.in1d(array2[:,0], array3[:,0], assume_unique=True)]

Notice that in both cases I'm using the reduced array as the target of the second in1d to reduce search time. If you want to optimize even more, you can wrap it in an if statement to ensure the smaller array is the target of the first in1d.

Then just compute array3 - array4. Putting it all together in a function:

    def common_subtract(a1, a2, i=0, unique=True):
        a1, a2 = np.array(a1), np.array(a2)
        if a1.shape[0] > a2.shape[0]:
            a1 = a1[np.in1d(a1[:, i], a2[:, i], assume_unique=unique)]
            a2 = a2[np.in1d(a2[:, i], a1[:, i], assume_unique=unique)]
        else:
            a2 = a2[np.in1d(a2[:, i], a1[:, i], assume_unique=unique)]
            a1 = a1[np.in1d(a1[:, i], a2[:, i], assume_unique=unique)]
        return a1 - a2
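
The NumPy documentation recommends np.isin over np.in1d for new code, so a variant of common_subtract written against np.isin might look like the sketch below. The name common_subtract_isin is made up for illustration, and the sample arrays are the ones from the first answer. Like the original, it assumes the key column holds the same values in the same order in both arrays once they are filtered.

```python
import numpy as np

def common_subtract_isin(a1, a2, i=0, unique=True):
    """Keep only rows whose key column i appears in both arrays, then subtract."""
    a1, a2 = np.asarray(a1), np.asarray(a2)
    if a1.shape[0] > a2.shape[0]:
        # Filter the larger array first, so the second lookup searches fewer rows.
        a1 = a1[np.isin(a1[:, i], a2[:, i], assume_unique=unique)]
        a2 = a2[np.isin(a2[:, i], a1[:, i], assume_unique=unique)]
    else:
        a2 = a2[np.isin(a2[:, i], a1[:, i], assume_unique=unique)]
        a1 = a1[np.isin(a1[:, i], a2[:, i], assume_unique=unique)]
    return a1 - a2

array1 = np.array([[1, 100.0, 0.0, 0.0], [2, 110.0, 0.0, 0.0], [3, 120.0, 0.0, 0.0]])
array2 = np.array([[1, 101.0, 0.0, 0.0], [3, 119.0, 0.0, 0.0]])

result = common_subtract_isin(array1, array2)
print(result)
```

If the key column may contain repeated values, pass unique=False, since assume_unique=True can give wrong results on non-unique input.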


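I found intersect1d to be a clearer way to find common elements in 2D numpy arrays. Note that it flattens its inputs, so the result is a sorted 1-D array of the unique values common to both. In this example, recent_books and coding_books have already been defined.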

    import time
    import numpy as np

    start = time.time()
    recent_coding_books = np.intersect1d([recent_books], [coding_books])
    print(len(recent_coding_books))
    print('Duration: {} seconds'.format(time.time() - start))
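
Since recent_books and coding_books are not shown in this answer, here is a small self-contained sketch with made-up ID arrays standing in for them. It illustrates that np.intersect1d treats 2D inputs as flat collections of values: it returns the common scalars, not the common rows.

```python
import numpy as np

# Hypothetical stand-ins for the arrays used in the answer (made-up book IDs).
recent_books = np.array([[101, 102], [103, 104]])
coding_books = np.array([[104, 105], [101, 106]])

# Both inputs are flattened before comparison; the result is sorted and unique.
recent_coding_books = np.intersect1d(recent_books, coding_books)
print(recent_coding_books)
print(len(recent_coding_books))
```

If matching whole rows is what you need, this flattening behaviour is not sufficient on its own, and a key-column approach like the ones in the other answers is probably a better fit.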