Finding k closest numbers to a given number
The short answer
The heapq.nsmallest() function will do this neatly and efficiently:
>>> from heapq import nsmallest>>> s = [1,2,3,4,5,6,7]>>> nsmallest(3, s, key=lambda x: abs(x - 6.5))[6, 7, 5]
Essentially this says, "Give me the three input values that have the smallest absolute difference from the number 6.5".
Optimizing for repeated lookups
In the comments, @Phylliida, asked how to optimize for repeated lookups with differing start points. One approach would be to pre-sort the data and then use bisect to locate the center of a small search segment:
from bisect import bisectdef k_nearest(k, center, sorted_data): 'Return *k* members of *sorted_data* nearest to *center*' i = bisect(sorted_data, center) segment = sorted_data[max(i-k, 0) : i+k] return nsmallest(k, segment, key=lambda x: abs(x - center))
For example:
>>> s.sort()>>> k_nearest(3, 6.5, s)[6, 7, 5]>>> k_nearest(3, 0.5, s)[1, 2, 3]>>> k_nearest(3, 4.5, s) [4, 5, 3]>>> k_nearest(3, 5.0, s)[5, 4, 6]
You could compute distances, and sort:
[n for d, n in sorted((abs(x-myNumber), x) for x in myList)[:k]]
This does the following:
- Create a sequence of tuples
(d, x)
whered
is the distance to your target - Select the first
k
elements of that list - Extract just the number values from the result, discarding the distance
Both answers were good, and Greg was right, Raymond's answer is more high level and easier to implement, but I built upon Greg's answer because it was easier to manipulate to fit my need.
In case anyone is searching for a way to find the n closest values from a list of dicts.
My dict looks like this, where npi is just an identifier that I need along with the value:
mydict = {u'fnpi': u'1982650024', u'snpi': {u'npi': u'1932190360', u'value': 2672}, u'snpis': [{u'npi': u'1831289255', u'value': 20}, {u'npi': u'1831139799', u'value': 20}, {u'npi': u'1386686137', u'value': 37}, {u'npi': u'1457355257', u'value': 45}, {u'npi': u'1427043645', u'value': 53}, {u'npi': u'1477548675', u'value': 53}, {u'npi': u'1851351514', u'value': 57}, {u'npi': u'1366446171', u'value': 60}, {u'npi': u'1568460640', u'value': 75}, {u'npi': u'1326046673', u'value': 109}, {u'npi': u'1548281124', u'value': 196}, {u'npi': u'1912989989', u'value': 232}, {u'npi': u'1336147685', u'value': 284}, {u'npi': u'1801894142', u'value': 497}, {u'npi': u'1538182779', u'value': 995}, {u'npi': u'1932190360', u'value': 2672}, {u'npi': u'1114020336', u'value': 3264}]}value = mydict['snpi']['value'] #value i'm working with belownpi = mydict['snpi']['npi'] #npi (identifier) i'm working with belowsnpis = mydict['snpis'] #dict i'm working with below
To get an [id, value]
list (not just a list of values) , I use this:
[[id,val] for diff, val, id in sorted((abs(x['value']-value), x['value'], x['npi']) for x in snpis)[:6]]
Which produces this:
[[u'1932190360', 2672], [u'1114020336', 3264], [u'1538182779', 995], [u'1801894142', 497], [u'1336147685', 284], [u'1912989989', 232]]
EDIT
I actually found it pretty easy to manipulate Raymond's answer too, if you're dealing with a dict (or list of lists).
from heapq import nsmallest[[i['npi'], i['value']] for i in nsmallest(6, snpis, key=lambda x: abs(x['value']-value))]
This will produce the same as the above output.
And this
nsmallest(6, snpis, key=lambda x: abs(x['value']-value))
will produce a dict instead.