Fastest way to calculate the sum of specific regions of an array

That's easy with np.split:

result = [part.sum() for part in np.split(a, np.cumsum(b))[:-1]]
print(result)
# [36, 19, 37]
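To see why the trailing [:-1] is needed, here is a small sketch. It assumes a and b are the arrays used in the example further down this thread; the names parts and sizes are only for illustration.

```python
import numpy as np

# Assumed example data (same values as the later answer in this thread).
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
b = np.array([8, 2, 3])  # lengths of the consecutive regions

# Split points are the cumulative sums 8, 10, 13.
parts = np.split(a, np.cumsum(b))

# The last split point equals len(a), so np.split adds an empty tail piece.
sizes = [len(part) for part in parts]
print(sizes)  # [8, 2, 3, 0]

# Dropping that empty piece leaves one sum per region.
result = [int(part.sum()) for part in parts[:-1]]
print(result)  # [36, 19, 37]
```

If b did not cover all of a, the last piece would hold the leftover elements instead of being empty, and [:-1] would drop those.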

python python-2.7 performance numpy sum

A much faster way than np.split is:

np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])

What this does:

It builds an array of ascending start indices from b, one for each range you want to sum over. For simplicity, you can assign c = np.r_[0, np.cumsum(b)[:-1]], which for your example is array([0, 8, 10]): a 0 followed by all but the last element of the cumulative sum of b (np.cumsum(b) -> array([8, 10, 13])). The final 13 has to be dropped because every index passed to reduceat must be smaller than len(a), and here len(a) is 13.
np.ufunc.reduceat(a, c) reduces a with the ufunc (in this case, add) over the ranges c[i]:c[i+1]. For the last index, where i+1 would run past the end of c, it instead reduces over c[i]:, that is, through the end of a.
reduce just condenses an array to a single value. For example, np.add.reduce(a) gives the same result as np.sum(a) or a.sum(). Because reduceat moves the for loop in the answer by @jdehsa out of Python and into NumPy's compiled C code, it is much faster.
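As a rough illustration of the steps above, the sketch below builds c for the example arrays and compares reduceat with an explicit Python loop over the same ranges. The names stops, manual and fast are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
b = np.array([8, 2, 3])  # lengths of the consecutive regions

# Start index of each region: 0, then the cumulative sum without its last entry.
c = np.r_[0, np.cumsum(b)[:-1]]
print(c)  # [ 0  8 10]

# What reduceat does for increasing indices, written as a Python loop:
# sum a[c[i]:c[i+1]], and for the last index sum from c[-1] to the end of a.
stops = list(c[1:]) + [len(a)]
manual = [int(a[start:stop].sum()) for start, stop in zip(c, stops)]

# The same reduction done inside NumPy.
fast = np.add.reduceat(a, c)

print(manual)         # [36, 19, 37]
print(fast.tolist())  # [36, 19, 37]
```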

Speed test:

b = np.random.randint(1,10,(10000,))
a = np.random.randint(1,10,(np.sum(b),))

%timeit np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])
1000 loops, best of 3: 293 µs per loop

%timeit [part.sum() for part in np.split(a, np.cumsum(b))[:-1]]
10 loops, best of 3: 44.6 ms per loop

It also has the added benefit of not building a temporary list of split pieces of a.
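If you want to repeat the comparison outside IPython, a minimal sketch with the standard timeit module could look like this. The fixed seed, the function names and the repeat count are assumptions made for the example, and the timings will depend on your machine and NumPy version.

```python
import timeit
import numpy as np

# Fixed seed only so the run is repeatable (an assumption, not in the original test).
rng = np.random.RandomState(0)
b = rng.randint(1, 10, 10000)    # region lengths between 1 and 9
a = rng.randint(1, 10, b.sum())  # data covering all regions exactly

def with_reduceat():
    return np.add.reduceat(a, np.r_[0, np.cumsum(b)[:-1]])

def with_split():
    return np.array([part.sum() for part in np.split(a, np.cumsum(b))[:-1]])

# Check that both approaches agree before timing them.
same = np.array_equal(with_reduceat(), with_split())

t_reduceat = timeit.timeit(with_reduceat, number=20)
t_split = timeit.timeit(with_split, number=20)
print(same, t_reduceat, t_split)
```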


You can use the reduceat method of the np.add ufunc. You just need to add a zero in front of your indices and discard the last index (if it covers the complete array):

>>> import numpy as np
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,14])
>>> b = np.array([8,2,3])
>>> np.add.reduceat(a, np.append([0], np.cumsum(b)[:-1]))
array([36, 19, 37], dtype=int32)

The [:-1] discards the last index, and np.append([0], ...) adds a zero in front of the indices.

Note that this is a slightly adapted variant of DanielF's answer.

If you don't like the append you could also create a new array yourself containing the indices:

>>> b_sum = np.zeros_like(b)
>>> np.cumsum(b[:-1], out=b_sum[1:])  # insert the cumsum in the b_sum array directly
>>> np.add.reduceat(a, b_sum)
array([36, 19, 37], dtype=int32)
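If you need this in several places, the same idea can be wrapped in a small helper. This is only a sketch: the name region_sums and its input checks are hypothetical additions, not part of NumPy. It slices a to the covered length first, so the last region does not run on to the end of a when the lengths do not cover the complete array.

```python
import numpy as np

def region_sums(a, lengths):
    """Sum consecutive regions of `a`; `lengths` gives the size of each region.

    Hypothetical helper built on np.add.reduceat.
    """
    a = np.asarray(a)
    lengths = np.asarray(lengths)
    if lengths.size == 0:
        return np.zeros(0, dtype=a.dtype)
    # Assumption: every region holds at least one element.
    if (lengths <= 0).any():
        raise ValueError("every region length must be positive")
    total = int(lengths.sum())
    if total > a.size:
        raise ValueError("regions extend past the end of the array")
    # Start indices: a zero followed by the cumulative sum without its last entry.
    starts = np.zeros_like(lengths)
    np.cumsum(lengths[:-1], out=starts[1:])
    # Slice to the covered part so the last region stops where it should.
    return np.add.reduceat(a[:total], starts)

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14])
print(region_sums(a, [8, 2, 3]).tolist())            # [36, 19, 37]
print(region_sums(np.arange(10), [2, 3]).tolist())   # [1, 9]
```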
