Vectorize a for loop in numpy to calculate duct-tape overlapping
There is no need for any looping at all here. You effectively have two different line_mask functions. Neither needs to be looped explicitly, but you would probably get a significant speedup just from rewriting it with a pair of for loops in an if and else, rather than an if and else in a for loop, which gets evaluated many, many times.
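To make that intermediate step concrete, here is a minimal sketch of moving the branch out of the loops. Your original loop body is not reproduced here, so both function names, the boolean output, and the exact comparisons are assumptions chosen only to illustrate the restructuring.

```python
import numpy as np

def line_mask_branch_in_loop(drum, coef, intercept, upper=True):
    """Hypothetical loop version: the branch is re-evaluated for every cell."""
    out = np.zeros(drum.shape, dtype=bool)
    for r in range(drum.shape[0]):
        for c in range(drum.shape[1]):
            if upper:
                out[r, c] = c < r * coef + intercept
            else:
                out[r, c] = c >= r * coef + intercept
    return out

def line_mask_branch_hoisted(drum, coef, intercept, upper=True):
    """Same result, but the branch is decided once, outside the loops."""
    out = np.zeros(drum.shape, dtype=bool)
    if upper:
        for r in range(drum.shape[0]):
            for c in range(drum.shape[1]):
                out[r, c] = c < r * coef + intercept
    else:
        for r in range(drum.shape[0]):
            for c in range(drum.shape[1]):
                out[r, c] = c >= r * coef + intercept
    return out
```

Both versions still visit every cell in Python, so this only removes the repeated branch; it does not replace proper vectorization.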
The really numpythonic thing to do is to properly vectorize your code to operate on entire arrays without any loops. Here is a vectorized version of line_mask:
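    def line_mask(drum, coef, intercept, upper=True, accuracy=accuracy):
        """Masks a half of the array"""
        # accuracy is unused here; it is kept so the signature stays the same
        r = np.arange(drum.shape[0]).reshape(-1, 1)  # row indices, shape (m, 1)
        c = np.arange(drum.shape[1]).reshape(1, -1)  # column indices, shape (1, n)
        # Pick the comparison once, then apply it to the whole array
        comp = c.__lt__ if upper else c.__ge__
        return comp(r * coef + intercept)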
Setting up the shapes of r and c to be (m, 1) and (1, n) so that the result is (m, n) is called broadcasting, and is the staple of vectorization in numpy.
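A small sketch of the shapes involved may help. The sizes, slope, and intercept below are arbitrary illustrative values, not taken from your program.

```python
import numpy as np

m, n = 3, 4  # small illustrative sizes
r = np.arange(m).reshape(-1, 1)  # column vector, shape (m, 1)
c = np.arange(n).reshape(1, -1)  # row vector, shape (1, n)

# Comparing a (1, n) array with an (m, 1) array broadcasts to (m, n):
# each column index is compared against each row's line position.
mask = c < r * 0.5 + 1
print(r.shape, c.shape, mask.shape)  # (3, 1) (1, 4) (3, 4)
```

No (m, n) index arrays are ever built explicitly; only the result has the full shape.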
The result of the updated line_mask is a boolean mask (as the name implies) rather than a float array. This makes it smaller, and hopefully bypasses float operations entirely. You can now rewrite get_band to use masking instead of addition:
    def get_band(drum, coef, intercept, bandwidth):
        """Calculate a ribbon path on the drum"""
        t1 = line_mask(drum, coef, intercept + bandwidth / 2, upper=True)
        t2 = line_mask(drum, coef, intercept - bandwidth / 2, upper=False)
        return t1 & t2
The remainder of the program should stay the same, since these functions preserve all the interfaces.
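As a quick sanity check, here is a self-contained sketch that applies the boolean band to a tiny drum. The drum size and band parameters are made up, the accuracy argument is dropped so the snippet runs on its own, and the accumulation step is only a guess at how the rest of your program counts overlaps.

```python
import numpy as np

def line_mask(drum, coef, intercept, upper=True):
    """Boolean mask of the cells on one side of the line."""
    r = np.arange(drum.shape[0]).reshape(-1, 1)
    c = np.arange(drum.shape[1]).reshape(1, -1)
    line = r * coef + intercept
    return c < line if upper else c >= line

def get_band(drum, coef, intercept, bandwidth):
    """Boolean mask of the cells between the two edges of the band."""
    t1 = line_mask(drum, coef, intercept + bandwidth / 2, upper=True)
    t2 = line_mask(drum, coef, intercept - bandwidth / 2, upper=False)
    return t1 & t2

drum = np.zeros((4, 10))  # hypothetical small float drum
band = get_band(drum, coef=0.5, intercept=4, bandwidth=2)

# Assumed accumulation step: count how many bands cover each cell.
drum += band                            # booleans are added as 0.0 / 1.0
drum[get_band(drum, 0.5, 5, 2)] += 1    # same effect via boolean indexing
print(drum.max())  # 2.0 where the two bands overlap
```

Adding the mask to a float array still works, which is why code that previously summed float bands should not need to change.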
If you want, you can rewrite most of your program in three (still somewhat legible) lines:
    coeff = 1/10
    intercept = 130
    bandwidth = 15
    r, c = np.ogrid[:drum.shape[0], :drum.shape[1]]
    check = r * coeff + intercept
    single_band = ((check + bandwidth / 2 > c) & (check - bandwidth / 2 <= c))
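In case np.ogrid is unfamiliar: it produces the same broadcastable index arrays as the arange and reshape calls in line_mask. A brief sketch with an arbitrary shape:

```python
import numpy as np

shape = (3, 5)  # illustrative size only
r, c = np.ogrid[:shape[0], :shape[1]]

# The equivalent arrays built by hand
r_manual = np.arange(shape[0]).reshape(-1, 1)
c_manual = np.arange(shape[1]).reshape(1, -1)

print(r.shape, c.shape)  # (3, 1) (1, 5)
same = np.array_equal(r, r_manual) and np.array_equal(c, c_manual)
print(same)  # True
```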