Python vectorizing nested for loops

python numpy for-loop optimization vectorization

Approach #1

Here's a vectorized approach -

    m,n,r = volume.shape
    x,y,z = np.mgrid[0:m,0:n,0:r]
    X = x - roi[0]
    Y = y - roi[1]
    Z = z - roi[2]
    mask = X**2 + Y**2 + Z**2 < radius**2

Possible improvement: the last step can probably be sped up with the numexpr module -

    import numexpr as ne
    mask = ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
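The loop being replaced is not shown in this answer, so the following is only a sketch for checking Approach #1 on a small volume. The `loop_mask` function is a hypothetical triple-loop reference (an assumption, not the original `_make_mask`), and the shape, `roi` and `radius` values are made up for illustration.

```python
import numpy as np

def loop_mask(shape, roi, radius):
    """Hypothetical triple-loop reference; an assumption, not the original _make_mask."""
    m, n, r = shape
    mask = np.zeros(shape, dtype=bool)
    for i in range(m):
        for j in range(n):
            for k in range(r):
                d2 = (i - roi[0])**2 + (j - roi[1])**2 + (k - roi[2])**2
                mask[i, j, k] = d2 < radius**2
    return mask

def mgrid_mask(shape, roi, radius):
    """Same test as the loop, done on full index meshes from np.mgrid."""
    m, n, r = shape
    x, y, z = np.mgrid[0:m, 0:n, 0:r]
    return (x - roi[0])**2 + (y - roi[1])**2 + (z - roi[2])**2 < radius**2

# Illustrative inputs only
shape = (6, 7, 5)
roi = np.array([2.5, 3.0, 1.5])
radius = 2.0

reference = loop_mask(shape, roi, radius)
vectorized = mgrid_mask(shape, roi, radius)
print(np.array_equal(reference, vectorized))
```

If the two masks agree on a small input like this, the vectorized version can be timed on the full-size volume with more confidence.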

Approach #2

We can also build the three ranges corresponding to the shape parameters and subtract the matching element of roi from each on the fly, without actually creating the full meshes as done earlier with np.mgrid. Broadcasting then combines the three squared 1D arrays into the full 3D result, which is where the efficiency comes from. The implementation would look like this -

    m,n,r = volume.shape
    vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
           ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
    mask = vals < radius**2

Simplified version: Thanks to @Bi Rico for suggesting an improvement here: np.ogrid can perform those operations a bit more concisely, like so -

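    m,n,r = volume.shape
    x,y,z = np.ogrid[0:m,0:n,0:r]
    # roi is subtracted per axis; `np.ogrid[...] - roi` needs a ragged object
    # array, which newer NumPy releases no longer build implicitly
    mask = ((x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2) < radius**2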
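To make the broadcasting in Approach #2 visible, here is a small sketch that prints the shape of each squared range and of their sum. The sizes and the `roi` values are illustrative assumptions, and the names `dx2`, `dy2`, `dz2` are introduced only for this example.

```python
import numpy as np

m, n, r = 4, 3, 2                    # illustrative sizes
roi = np.array([1.0, 1.0, 0.0])      # illustrative centre

dx2 = ((np.arange(m) - roi[0])**2)[:, None, None]   # shape (m, 1, 1)
dy2 = ((np.arange(n) - roi[1])**2)[:, None]         # shape (n, 1)
dz2 = (np.arange(r) - roi[2])**2                    # shape (r,)

# Shapes are aligned from the right, so the sum broadcasts to (m, n, r)
vals = dx2 + dy2 + dz2
print(dx2.shape, dy2.shape, dz2.shape, vals.shape)
# (4, 1, 1) (3, 1) (2,) (4, 3, 2)
```

Only the final sum is a full 3D array; the three inputs stay one-dimensional in terms of the data they hold.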

Runtime test

Function definitions -

    import numpy as np
    import numexpr as ne

    def vectorized_app1(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return X**2 + Y**2 + Z**2 < radius**2

    def vectorized_app1_improved(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')

    def vectorized_app2(volume, roi, radius):
        m,n,r = volume.shape
        vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
               ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
        return vals < radius**2

    def vectorized_app2_simplified(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.ogrid[0:m,0:n,0:r]
        # roi is subtracted per axis; `np.ogrid[...] - roi` needs a ragged object
        # array, which newer NumPy releases no longer build implicitly
        return ((x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2) < radius**2

Timings -

    In [106]: # Setup input arrays
         ...: volume = np.random.rand(90,110,100) # Half of original input sizes
         ...: roi = np.random.rand(3)
         ...: radius = 3.4
         ...:

    In [107]: %timeit _make_mask(volume, roi, radius)
    1 loops, best of 3: 41.4 s per loop

    In [108]: %timeit vectorized_app1(volume, roi, radius)
    10 loops, best of 3: 62.3 ms per loop

    In [109]: %timeit vectorized_app1_improved(volume, roi, radius)
    10 loops, best of 3: 47 ms per loop

    In [110]: %timeit vectorized_app2(volume, roi, radius)
    100 loops, best of 3: 4.26 ms per loop

    In [139]: %timeit vectorized_app2_simplified(volume, roi, radius)
    100 loops, best of 3: 4.36 ms per loop

So, as always, broadcasting shows its magic: an almost 10,000x speedup over the original code, and the on-the-fly broadcasted operations are more than 10x faster than creating the meshes!
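One likely contributor to the gap between the mesh-based and broadcast-based versions is how many index values each has to materialize. The sketch below only counts elements for the volume shape used in the timings; it does not measure time, and the variable names are assumptions made for this example.

```python
import numpy as np

m, n, r = 90, 110, 100               # volume shape from the timings above

dense = np.mgrid[0:m, 0:n, 0:r]      # one array holding three full (m, n, r) meshes
sparse = np.ogrid[0:m, 0:n, 0:r]     # three open-grid arrays, one per axis

dense_elems = dense.size                       # 3 * m * n * r
sparse_elems = sum(a.size for a in sparse)     # m + n + r
print(dense_elems, sparse_elems)
# 2970000 300
```

The final mask is a full-size array in both cases, so this accounts for the setup and subtraction steps rather than the whole runtime.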


Say you first build an xyz array:

    import itertools
    xyz = [np.array(p) for p in itertools.product(range(volume.shape[0]), range(volume.shape[1]), range(volume.shape[2]))]

Now, using numpy.linalg.norm,

np.linalg.norm(xyz - roi, axis=1) < radius

checks whether the distance for each tuple from roi is smaller than radius.

Finally, just reshape the result to the dimensions you need.
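Putting those steps together, a minimal sketch might look like the following. The small `shape`, `roi` and `radius` values are illustrative assumptions, and the coordinates are stacked into a single `(N, 3)` array rather than kept as a list. Because `itertools.product` varies the last index fastest, a plain C-order `reshape` lines the result up with the volume's axes.

```python
import itertools
import numpy as np

# Illustrative inputs only
shape = (4, 5, 3)
roi = np.array([1.5, 2.0, 1.0])
radius = 1.6

# All (i, j, k) index triples, last index varying fastest: shape (N, 3)
xyz = np.array(list(itertools.product(*(range(s) for s in shape))))

# Euclidean distance of every voxel from roi, compared against radius
inside = np.linalg.norm(xyz - roi, axis=1) < radius

# Back to the volume's dimensions
mask = inside.reshape(shape)
print(mask.shape)
# (4, 5, 3)
```

Note that building every index triple in Python scales with the number of voxels, so this route is mainly useful for moderate volume sizes.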
