Python 2.7: looping over 1D fibers in a multidimensional Numpy array
I think you might be looking for numpy.apply_along_axis
```python
In [10]: def my_func(x):
   ....:     return x**2 + x

In [11]: np.apply_along_axis(my_func, 2, A)   # A = np.arange(27).reshape(3, 3, 3)
Out[11]:
array([[[  0,   2,   6],
        [ 12,  20,  30],
        [ 42,  56,  72]],

       [[ 90, 110, 132],
        [156, 182, 210],
        [240, 272, 306]],

       [[342, 380, 420],
        [462, 506, 552],
        [600, 650, 702]]])
```
Although many NumPy functions (including `sum`) have their own `axis` argument to specify which axis to use:
```python
In [12]: np.sum(A, axis=2)
Out[12]:
array([[ 3, 12, 21],
       [30, 39, 48],
       [57, 66, 75]])
```
`numpy` provides a number of different ways of looping over one or more dimensions.
Your example:
```python
func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print fiber_index
    print A[fiber_index]
```
produces something like:
```
(0, 0)
[0 1 2]
(0, 1)
[3 4 5]
(0, 2)
[6 7 8]
...
```
`ndindex` generates all the index combinations over the first two dimensions, giving your function the 1D fiber on the last.
Look at the code for `ndindex`. It's instructive. I tried to extract its essence in https://stackoverflow.com/a/25097271/901925.
It uses `as_strided` to generate a dummy matrix over which an `nditer` iterates. It uses the `'multi_index'` mode to generate an index set, rather than the elements of that dummy. The iteration itself is done with a `__next__` method. This is the same style of indexing that is currently used in `numpy` compiled code.
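That mechanism can be sketched in a few lines (a simplified illustration of what `ndindex` does, not its actual code):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# ndindex-style iteration: a zero-stride dummy array costs no memory,
# and an nditer in 'multi_index' mode yields index tuples over it
# instead of the (meaningless) element values.
shape = (2, 3)
dummy = as_strided(np.zeros(1), shape=shape, strides=(0,) * len(shape))
it = np.nditer(dummy, flags=['multi_index'], order='C')
indices = []
while not it.finished:
    indices.append(it.multi_index)
    it.iternext()
# indices now matches list(np.ndindex(2, 3))
```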
Iterating Over Arrays (http://docs.scipy.org/doc/numpy-dev/reference/arrays.nditer.html) has a good explanation, including an example of doing so in `cython`.
Many functions, among them `sum`, `max`, and `product`, let you specify which axis (or axes) you want to iterate over. Your example, with `sum`, can be written as:
```python
np.sum(A, axis=-1)
np.sum(A, axis=(1, 2))   # sum over 2 axes
```
An equivalent is:

```python
np.add.reduce(A, axis=-1)
```
`np.add` is a `ufunc`, and `reduce` specifies an iteration mode. There are many other `ufunc`s, and other iteration modes - `accumulate`, `reduceat`. You can also define your own `ufunc`.
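A quick sketch of those `ufunc` methods, using `np.frompyfunc` for the define-your-own case:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# reduce: repeated application along an axis (same as A.sum(axis=-1))
r = np.add.reduce(A, axis=-1)          # array([ 3, 12])

# accumulate: running totals along an axis (same as np.cumsum)
acc = np.add.accumulate(A, axis=-1)    # array([[ 0,  1,  3], [ 3,  7, 12]])

# reduceat: reduce over index ranges given by a list of start indices
x = np.arange(8)
ra = np.add.reduceat(x, [0, 4])        # array([ 6, 22]) - sums of x[0:4], x[4:8]

# a (slow) user-defined ufunc built from a Python function;
# it gets the same methods, but works on Python objects
myadd = np.frompyfunc(lambda a, b: a + b, 2, 1)
s = myadd.reduce(x)                    # 28
```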
xnx suggests:

```python
np.apply_along_axis(np.sum, 2, A)
```
It's worth digging through `apply_along_axis` to see how it steps through the dimensions of `A`. In your example, it steps over all possible `i`, `j` in a `while` loop, calculating:

```python
outarr[(i, j)] = np.sum(A[(i, j, slice(None))])
```
Including `slice` objects in the indexing tuple is a nice trick. Note that it builds up a list, and then converts it to a tuple for indexing. That's because tuples are immutable.
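In isolation the trick looks like this (the index values here are just for illustration):

```python
import numpy as np

A = np.arange(27).reshape(3, 3, 3)

# Build an index tuple the way apply_along_axis does: fix i, j and keep
# the last axis whole with a slice(None) (the object form of ':').
i, j = 1, 2
idx = [i, j]
idx.append(slice(None))   # lists are mutable, so assemble the index here...
idx = tuple(idx)          # ...then convert, since indexing wants a tuple
fiber = A[idx]            # same as A[1, 2, :]
# fiber is array([15, 16, 17])
```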
Your iteration can be applied along any axis by rolling that axis to the end. This is a 'cheap' operation since it just changes the strides.
```python
def with_ndindex(A, func, ax=-1):
    # apply func along axis ax
    A = np.rollaxis(A, ax, A.ndim)   # roll ax to end (changes strides)
    shape = A.shape[:-1]
    B = np.empty(shape, dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = func(A[ii])
    return B
```
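A quick sanity check of that approach (the function is restated here so the snippet runs on its own):

```python
import numpy as np

def with_ndindex(A, func, ax=-1):
    # apply func along axis ax: roll that axis to the end, then loop
    A = np.rollaxis(A, ax, A.ndim)
    shape = A.shape[:-1]
    B = np.empty(shape, dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = func(A[ii])
    return B

A = np.arange(27).reshape(3, 3, 3)

# matches the direct axis-aware call, for the default and another axis
assert np.array_equal(with_ndindex(A, np.sum), np.sum(A, axis=-1))
assert np.array_equal(with_ndindex(A, np.sum, 0), np.sum(A, axis=0))
```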
I did some timings on 3x3x3, 10x10x10 and 100x100x100 `A` arrays. This `np.ndindex` approach is consistently a third faster than the `apply_along_axis` approach. Direct use of `np.sum(A, -1)` is much faster.
So if `func` is limited to operating on a 1D fiber (unlike `sum`), then the `ndindex` approach is a good choice.