Nearest Neighbor Search in Python without k-d tree Nearest Neighbor Search in Python without k-d tree numpy numpy

Nearest Neighbor Search in Python without k-d tree


If concise is your goal, you can do this one-liner:

In [14]: X = scipy.randn(10,2)In [15]: XOut[15]: array([[ 0.85831163,  1.45039761],       [ 0.91590236, -0.64937523],       [-1.19610431, -1.07731673],       [-0.48454195,  1.64276509],       [ 0.90944798, -0.42998205],       [-1.17765553,  0.20858178],       [-0.29433563, -0.8737285 ],       [ 0.5115424 , -0.50863231],       [-0.73882547, -0.52016481],       [-0.14366935, -0.96248649]])In [16]: q = scipy.array([0.91, -0.43])In [17]: scipy.argmin([scipy.inner(q-x,q-x) for x in X])Out[17]: 4

If you have several query points:

In [18]: Q = scipy.array([[0.91, -0.43], [-0.14, -0.96]])In [19]: [scipy.argmin([scipy.inner(q-x,q-x) for x in X]) for q in Q]Out[19]: [4, 9]


Broadcasting is very useful for this kind of thing. I'm not sure if this is what you need, but here I use broadcasting to find the displacement between p (one point in 3 space) and X (a set of 10 points in 3-space).

import numpy as npdef closest(X, p):    disp = X - p    return np.argmin((disp*disp).sum(1))X = np.random.random((10, 3))p = np.random.random(3)print X#array([[ 0.68395953,  0.97882991,  0.68826511],#       [ 0.57938059,  0.24713904,  0.32822283],#       [ 0.06070267,  0.06561339,  0.62241713],#       [ 0.93734468,  0.73026772,  0.33755815],#       [ 0.29370809,  0.76298588,  0.68728743],#       [ 0.66248449,  0.6023311 ,  0.76704199],#       [ 0.53490144,  0.96555923,  0.43994738],#       [ 0.23780428,  0.75525843,  0.46067472],#       [ 0.84240565,  0.82573202,  0.56029917],#       [ 0.66751884,  0.31561133,  0.19244683]])print p#array([ 0.587416 ,  0.4181857,  0.2539029])print closest(X, p)#9


You can compute all distances scipy.spatial.distance.cdist( X, Y ) or use RTree for dynamic data: http://gispython.org/rtree/docs/class.html .