Numba code slower than pure python
The problem is that numba can't intuit the type of lookup
. If you put a print nb.typeof(lookup)
in your method, you'll see that numba is treating it as an object, which is slow. Normally I would just define the type of lookup
in a locals dict, but I was getting a strange error. Instead I just created a little wrapper, so that I could explicitly define the input and output types.
@nb.jit(nb.f8[:](nb.f8[:]))def numba_cumsum(x): return np.cumsum(x)@nb.autojitdef numba_resample2(qs, xs, rands): n = qs.shape[0] #lookup = np.cumsum(qs) lookup = numba_cumsum(qs) results = np.empty(n) for j in range(n): for i in range(n): if rands[j] < lookup[i]: results[j] = xs[i] break return results
Then my timings are:
print "Timing Numba Function:"%timeit numba_resample(qs, xs, rands)print "Timing Revised Numba Function:"%timeit numba_resample2(qs, xs, rands)
Timing Numba Function:100 loops, best of 3: 8.1 ms per loopTiming Revised Numba Function:100000 loops, best of 3: 15.3 µs per loop
You can go even a little faster still if you use jit
instead of autojit
:
@nb.jit(nb.f8[:](nb.f8[:], nb.f8[:], nb.f8[:]))
For me that lowers it from 15.3 microseconds to 12.5 microseconds, but it's still impressive how well autojit does.
Faster numpy
version (10x speedup compared to numpy_resample
)
def numpy_faster(qs, xs, rands): lookup = np.cumsum(qs) mm = lookup[None,:]>rands[:,None] I = np.argmax(mm,1) return xs[I]