Identifying points with the smallest Euclidean distance
Try scipy.spatial.distance.pdist(myArr)
. This will give you a condensed distance matrix. You can use argmin
on it and find the index of the smallest value. This can be converted into the pair information.
There's a whole Wikipedia page on just this problem, see: http://en.wikipedia.org/wiki/Closest_pair_of_points
Executive summary: you can achieve O(n log n) with a recursive divide and conquer algorithm (outlined on the Wiki page, above).
You could take advantage of the latest version of SciPy's (v0.9) Delaunay triangulation tools. You can be sure that the closest two points will be an edge of a simplex in the triangulation, which is a much smaller subset of pairs than doing every combination.
Here's the code (updated for general N-D):
import numpyfrom scipy import spatialdef closest_pts(pts): # set up the triangluataion # let Delaunay do the heavy lifting mesh = spatial.Delaunay(pts) # TODO: eliminate reduncant edges (numpy.unique?) edges = numpy.vstack((mesh.vertices[:,:dim], mesh.vertices[:,-dim:])) # the rest is easy x = mesh.points[edges[:,0]] y = mesh.points[edges[:,1]] dists = numpy.sum((x-y)**2, 1) idx = numpy.argmin(dists) return edges[idx] #print 'distance: ', dists[idx] #print 'coords:\n', pts[closest_verts]dim = 3N = 1000*dimpts = numpy.random.random(N).reshape(N/dim, dim)
Seems closely O(n):