Faster for-loops with arrays in Python
This is basically the idea behind Numba. It is not as fast as C, but it can get close. It uses a JIT compiler to compile Python code to machine code, and it supports many NumPy functions (the docs have all the details).
    import numpy as np
    from numba import njit

    @njit
    def f(N, M):
        # Random values and random target row indices in [0, N)
        a = np.random.uniform(0, 1, (N, M))
        k = np.random.randint(0, N, (N, M))
        out = np.zeros((N, M))
        # Scatter-add each a[i, j] into row k[i, j] of column j
        for i in range(N):
            for j in range(M):
                out[k[i, j], j] += a[i, j]
        return out

    def f_python(N, M):
        # Same body as f, but without the @njit decorator
        a = np.random.uniform(0, 1, (N, M))
        k = np.random.randint(0, N, (N, M))
        out = np.zeros((N, M))
        for i in range(N):
            for j in range(M):
                out[k[i, j], j] += a[i, j]
        return out
Pure Python:
    %%timeit
    N, M = 100, 4000
    # Arguments are passed as (M, N), so the arrays are 4000 x 100
    f_python(M, N)
338 ms ± 12.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
With Numba:
    %%timeit
    N, M = 100, 4000
    # Arguments are passed as (M, N), so the arrays are 4000 x 100
    f(M, N)
12 ms ± 534 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
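One thing to keep in mind when timing outside of `%%timeit`: Numba compiles the function on its first call, so that call includes the compilation time. Here is a minimal sketch of how you could check that the compiled loop gives the same result as the pure-Python one and see the warm-up effect. The names `scatter_add_python`, `scatter_add_numba` and `timed` are made up for this example, the arrays are passed in as arguments (unlike the functions above) so both versions see the same data, and the `ImportError` fallback is only an assumption so the snippet still runs without Numba installed.

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, fall back to plain Python.
    def njit(func):
        return func


def scatter_add_python(a, k):
    """Pure-Python reference: out[k[i, j], j] += a[i, j]."""
    n, m = a.shape
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            out[k[i, j], j] += a[i, j]
    return out


# Same loop body, compiled by Numba the first time it is called.
scatter_add_numba = njit(scatter_add_python)


def timed(func, *args):
    """Hypothetical helper: return (result, elapsed seconds) for one call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


rng = np.random.default_rng(0)
n, m = 400, 100
a = rng.uniform(0, 1, (n, m))
k = rng.integers(0, n, (n, m))  # target row for each element

expected, t_python = timed(scatter_add_python, a, k)
_, t_first = timed(scatter_add_numba, a, k)      # includes compilation
result, t_warm = timed(scatter_add_numba, a, k)  # compiled code only

print(f"pure Python:      {t_python:.4f} s")
print(f"Numba first call: {t_first:.4f} s")
print(f"Numba warm call:  {t_warm:.4f} s")
print("results match:", np.allclose(result, expected))
```

`%%timeit` runs the cell many times, so the one-off compilation cost mostly disappears from its reported mean; with a single manual measurement you should call the function once beforehand.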