Finding two most far away points in plot with many points in Python Finding two most far away points in plot with many points in Python numpy numpy

Finding two most far away points in plot with many points in Python


You can avoid computing all pairwise distances by observing that the two points which are furthest apart will occur as vertices in the convex hull. You can then compute pairwise distances between fewer points.

For example, with 100,000 points distributed uniformly in a unit square, there are only 22 points in the convex hull in my instance.

import numpy as npfrom scipy import spatial# test pointspts = np.random.rand(100_000, 2)# two points which are fruthest apart will occur as vertices of the convex hullcandidates = pts[spatial.ConvexHull(pts).vertices]# get distances between each pair of candidate pointsdist_mat = spatial.distance_matrix(candidates, candidates)# get indices of candidates that are furthest aparti, j = np.unravel_index(dist_mat.argmax(), dist_mat.shape)print(candidates[i], candidates[j])# e.g. [  1.11251218e-03   5.49583204e-05] [ 0.99989971  0.99924638]

enter image description here


If your data is 2-dimensional, you can compute the convex hull in O(N*log(N)) time where N is the number of points. By concentration of measure, this method deteriorates in performance for many common distributions as the number of dimensions grows.