Efficiently finding the closest coordinate pair from a set in Python
Using a k-dimensional tree:
>>> from scipy import spatial>>> airports = [(10,10),(20,20),(30,30),(40,40)]>>> tree = spatial.KDTree(airports)>>> tree.query([(21,21)])(array([ 1.41421356]), array([1]))
Where 1.41421356 is the distance between the queried point and the nearest neighbour and 1 is the index of the neighbour.
If your coordinates are unsorted, your search can only be improved slightly assuming it is (latitude,longitude)
by filtering on latitude first as for earth
1 degree of latitude on the sphere is 111.2 km or 69 miles
but that would not give a huge speedup.
If you sort the airports by latitude first then you can use a binary search for finding the first airport that could match (airport_lat >= point_lat-tolerance
) and then only compare up to the last one that could match (airport_lat <= point_lat+tolerance
) - but take care of 0 degrees equaling 360. While you cannot use that library directly, the sources of bisect are a good start for implementing a binary search.
While technically this way the search is still O(n), you have much fewer actual distance calculations (depending on tolerance) and few latitude comparisons. So you will have a huge speedup.
From this SO question:
import numpy as npdef closest_node(node, nodes): nodes = np.asarray(nodes) deltas = nodes - node dist_2 = np.einsum('ij,ij->i', deltas, deltas) return np.argmin(dist_2)
where node
is a tuple with two values (x, y) and nodes
is an array of tuples with two values ([(x_1, y_1), (x_2, y_2),]
)