Faster than binary search for ordered list Faster than binary search for ordered list arrays arrays

Faster than binary search for ordered list


You can do better than O(log n) if the values are integers, in which case the best worst-case running time you can achieve, in terms of n, is O(sqrt(log n)). Otherwise, there is no way to beat O(log n) unless there are patterns in the input sequence. There are two approaches used to beat O(log n) in the case of integers.

First, you can use y-fast trees which work by storing in a hash table all prefixes for which you are storing at least one integer with that prefix. This enables you to perform a binary search to find the length of the longest matching prefix. This enables you to find the successor of an element for which you are searching in time O(log w) where w is the number of bits in a word. There are some details to work though to make this work and use only linear space, but they aren't too bad (see the link below).

Second, you can use fusion trees, which use bit tricks to enable you to perform w^O(1) comparisons in just a constant number of instructions, yielding a running time of O(log n / log w).

The optimum tradeoff between these two data structures occurs when log w = sqrt(log n), giving a running time of O(sqrt(log n)).

For details on the above, see lectures 12 and 13 of Erik Demaine's course: http://courses.csail.mit.edu/6.851/spring07/lec.html


One possibility is to treat it like finding the roots of a function. Basically, finding:

a[i] <= i <= a[i + 1]

Is equivalent to:

a[i] - i <= 0 <= a[i + 1] - i

Then you could try something like Newton's method and so on. These kinds of algorithms frequently converge faster than a binary search when they work, but I don't know of one that is guaranteed to converge for all input.

http://en.wikipedia.org/wiki/Root-finding_algorithm


If the values in the list are evenly distributed then you could try a weighted split instead of a binary split, e.g. if the desired value is a third of the way from the current lower limit to the current value then you could try the element that is also a third of the way. This could suffer badly on lists where values are bunched up though.