Given 2 sorted arrays of integers, find the nth largest number in sublinear time [duplicate] Given 2 sorted arrays of integers, find the nth largest number in sublinear time [duplicate] arrays arrays

Given 2 sorted arrays of integers, find the nth largest number in sublinear time [duplicate]


I think this is two concurrent binary searches on the subarrays A[0..n-1] and B[0..n-1], which is O(log n).

  • Given sorted arrays, you know that the nth largest will appear somewhere before or at A[n-1] if it is in array A, or B[n-1] if it is in array B
  • Consider item at index a in A and item at index b in B.
  • Perform binary search as follows (pretty rough pseudocode, not taking in account 'one-off' problems):
    • If a + b > n, then reduce the search set
      • if A[a] > B[b] then b = b / 2, else a = a / 2
    • If a + b < n, then increase the search set
      • if A[a] > B[b] then b = 3/2 * b, else a = 3/2 * a (halfway between a and previous a)
    • If a + b = n then the nth largest is max(A[a], B[b])

I believe worst case O(ln n), but in any case definitely sublinear.


I believe that you can solve this problem using a variant on binary search. The intuition behind this algorithm is as follows. Let the two arrays be A and B and let's assume for the sake of simplicity that they're the same size (this isn't necessary, as you'll see). For each array, we can construct parallel arrays Ac and Bc such that for each index i, Ac[i] is the number of elements in the two arrays that are no larger than A[i] and Bc[i] is the number of elements in the two arrays that are no larger than B[i]. If we could construct these arrays efficiently, then we could find the kth smallest element efficiently by doing binary searches on both Ac and Bc to find the value k. The corresponding entry of A or B for that entry is then the kth largest element. The binary search is valid because the two arrays Ac and Bc are sorted, which I think you can convince yourself of pretty easily.

Of course, this solution doesn't work in sublinear time because it takes O(n) time to construct the arrays Ac and Bc. The question then is - is there some way that we can implicitly construct these arrays? That is, can we determine the values in these arrays without necessarily constructing each element? I think that the answer is yes, using this algorithm. Let's begin by searching array A to see if it has the kth smallest value. We know for a fact that the kth smallest value can't appear in the array in array A after position k (assuming all the elements are distinct). So let's focus just on the the first k elements of array A. We'll do a binary search over these values as follows. Start at position k/2; this is the k/2th smallest element in array A. Now do a binary search in array B to find the largest value in B smaller than this value and look at its position in the array; this is the number of elements in B smaller than the current value. If we add up the position of the elements in A and B, we have the total number of elements in the two arrays smaller than the current element. If this is exactly k, we're done. If this is less than k, then we recurse in the upper half of the first k elements of A, and if this is greater than k we recurse in the lower half of the first elements of k, etc. Eventually, we'll either find that the kth largest element is in array A, in which case we're done. Otherwise, repeat this process on array B.

The runtime for this algorithm is as follows. The search of array A does a binary search over k elements, which takes O(lg k) iterations. Each iteration costs O(lg n), since we have to do a binary search in B. This means that the total time for this search is O(lg k lg n). The time to do this in array B is the same, so the net runtime for the algorithm is O(lg k lg n) = O(lg2 n) = o(n), which is sublinear.


This is quite similar answer to Kirk's.

Let Find( nth, A, B ) be function that returns nth number, and |A| + |B| >= n. This is simple pseudo code without checking if one of array is small, less than 3 elements. In case of small array one or 2 binary searches in larger array is enough to find needed element.

Find( nth, A, B )  If A.last() <= B.first():    return B[nth - A.size()]  If B.last() <= A.first():    return A[nth - B.size()]  Let a and b indexes of middle elements of A and B  Assume that A[a] <= B[b] (if not swap arrays)  if nth <= a + b:    return Find( nth, A, B.first_half(b) )  return Find( nth - a, A.second_half(a), B )

It is log(|A|) + log(|B|), and because input arrays can be made to have n elements each it is log(n) complexity.