Identifying points with the smallest Euclidean distance Identifying points with the smallest Euclidean distance numpy numpy

Identifying points with the smallest Euclidean distance


Try scipy.spatial.distance.pdist(myArr). This will give you a condensed distance matrix. You can use argmin on it and find the index of the smallest value. This can be converted into the pair information.


There's a whole Wikipedia page on just this problem, see: http://en.wikipedia.org/wiki/Closest_pair_of_points

Executive summary: you can achieve O(n log n) with a recursive divide and conquer algorithm (outlined on the Wiki page, above).


You could take advantage of the latest version of SciPy's (v0.9) Delaunay triangulation tools. You can be sure that the closest two points will be an edge of a simplex in the triangulation, which is a much smaller subset of pairs than doing every combination.

Here's the code (updated for general N-D):

import numpyfrom scipy import spatialdef closest_pts(pts):    # set up the triangluataion    # let Delaunay do the heavy lifting    mesh = spatial.Delaunay(pts)    # TODO: eliminate reduncant edges (numpy.unique?)    edges = numpy.vstack((mesh.vertices[:,:dim], mesh.vertices[:,-dim:]))    # the rest is easy    x = mesh.points[edges[:,0]]    y = mesh.points[edges[:,1]]    dists = numpy.sum((x-y)**2, 1)    idx = numpy.argmin(dists)    return edges[idx]    #print 'distance: ', dists[idx]    #print 'coords:\n', pts[closest_verts]dim = 3N = 1000*dimpts = numpy.random.random(N).reshape(N/dim, dim)

Seems closely O(n):

enter image description here