Vectorize iterative addition in NumPy arrays

python loops numpy optimization vectorization

You could speed it up a little by using np.add.at, which correctly handles the case of duplicate indices:

def interadd(U, idx):    grids = np.zeros((U,U))          for i in range(len(idx)):        grids[idx[i,0],idx[i,1]] += 1    return gridsdef interadd2(U, idx):    grids = np.zeros((U,U))    np.add.at(grids, idx.T.tolist(), 1)    return gridsdef interadd3(U, idx):    # YXD suggestion    grids = np.zeros((U,U))    np.add.at(grids, (idx[:,0], idx[:,1]), 1)    return grids

which gives

>>> U = 100>>> idx = np.floor(np.random.random(size=(5000,2))*U).astype(np.int)>>> (interadd(U, idx) == interadd2(U, idx)).all()True>>> %timeit interadd(U, idx)100 loops, best of 3: 8.48 ms per loop>>> %timeit interadd2(U, idx)100 loops, best of 3: 2.62 ms per loop

And YXD's suggestion:

>>> (interadd(U, idx) == interadd3(U, idx)).all()True>>> %timeit interadd3(U, idx)1000 loops, best of 3: 1.09 ms per loop

python loops numpy optimization vectorization

Divakar's answer lead me to try the following, which looks to be the fastest method yet:

lin_idx = idx[:,0]*U + idx[:,1]grids = np.bincount(lin_idx, minlength=U**2).reshape(U, U)

Timings:

In [184]: U = 100      ...: input = np.random.random(size=(5000,2)) * U     ...: idx = np.floor(input).astype(np.int)In [185]: %timeit interadd3(U, idx)  # By DSM / XYD1000 loops, best of 3: 1.68 ms per loopIn [186]: %timeit unique_counts(U, idx)  # By Divakar1000 loops, best of 3: 676 µs per loopIn [187]: %%timeit     ...: lin_idx = idx[:,0]*U + idx[:,1]     ...: grids = np.bincount(lin_idx, minlength=U*U).reshape(U, U)     ...: 10000 loops, best of 3: 97.5 µs per loop

python loops numpy optimization vectorization

You can convert R,C indices from idx to linear indices, then find out the unique ones alongwith their counts and finally store them in the output grids as the final output. Here's the implementation to achieve the same -

# Calculate linear indices corressponding to idxlin_idx = idx[:,0]*U + idx[:,1]# Get unique linear indices and their countsunq_lin_idx,idx_counts = np.unique(lin_idx,return_counts=True)# Setup output array and store index counts in raveled/flattened versiongrids = np.zeros((U,U))  grids.ravel()[unq_lin_idx] = idx_counts

Runtime tests -

Here are the runtime tests covering all the approaches (including @DSM's approaches) and using the same definitions as listed in that solution -

In [63]: U = 100    ...: idx = np.floor(np.random.random(size=(5000,2))*U).astype(np.int)    ...: In [64]: %timeit interadd(U, idx)100 loops, best of 3: 7.57 ms per loopIn [65]: %timeit interadd2(U, idx)100 loops, best of 3: 2.59 ms per loopIn [66]: %timeit interadd3(U, idx)1000 loops, best of 3: 1.24 ms per loopIn [67]: def unique_counts(U, idx):    ...:     lin_idx = idx[:,0]*U + idx[:,1]    ...:     unq_lin_idx,idx_counts = np.unique(lin_idx,return_counts=True)    ...:     grids = np.zeros((U,U))      ...:     grids.ravel()[unq_lin_idx] = idx_counts    ...:     return grids    ...: In [68]: %timeit unique_counts(U, idx)1000 loops, best of 3: 595 µs per loop

Runtimes suggest that the proposed np.unique based approach is more than 50% faster than the second fastest approach.

CodeHunter

Vectorize iterative addition in NumPy arrays

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last