Numpy: find first index of value fast Numpy: find first index of value fast python python

Numpy: find first index of value fast


There is a feature request for this scheduled for Numpy 2.0.0: https://github.com/numpy/numpy/issues/2269


Although it is way too late for you, but for future reference:Using numba (1) is the easiest way until numpy implements it. If you use anaconda python distribution it should already be installed.The code will be compiled so it will be fast.

@jit(nopython=True)def find_first(item, vec):    """return the index of the first occurence of item in vec"""    for i in xrange(len(vec)):        if item == vec[i]:            return i    return -1

and then:

>>> a = array([1,7,8,32])>>> find_first(8,a)2


I've made a benchmark for several methods:

  • argwhere
  • nonzero as in the question
  • .tostring() as in @Rob Reilink's answer
  • python loop
  • Fortran loop

The Python and Fortran code are available. I skipped the unpromising ones like converting to a list.

The results on log scale. X-axis is the position of the needle (it takes longer to find if it's further down the array); last value is a needle that's not in the array. Y-axis is the time to find it.

benchmark results

The array had 1 million elements and tests were run 100 times. Results still fluctuate a bit, but the qualitative trend is clear: Python and f2py quit at the first element so they scale differently. Python gets too slow if the needle is not in the first 1%, whereas f2py is fast (but you need to compile it).

To summarize, f2py is the fastest solution, especially if the needle appears fairly early.

It's not built in which is annoying, but it's really just 2 minutes of work. Add this to a file called search.f90:

subroutine find_first(needle, haystack, haystack_length, index)    implicit none    integer, intent(in) :: needle    integer, intent(in) :: haystack_length    integer, intent(in), dimension(haystack_length) :: haystack!f2py intent(inplace) haystack    integer, intent(out) :: index    integer :: k    index = -1    do k = 1, haystack_length        if (haystack(k)==needle) then            index = k - 1            exit        endif    enddoend

If you're looking for something other than integer, just change the type. Then compile using:

f2py -c -m search search.f90

after which you can do (from Python):

import searchprint(search.find_first.__doc__)a = search.find_first(your_int_needle, your_int_array)