Numpy: Beginner nditer
It really helps to break things down by printing out what's going on along the way.
First, let's replace your whole loop with this:
i = 0while not it.finished: i += 1print i
It'll print 20, not 5. That's because you're doing a 5x4 iteration, not 5x1.
So, why is this even close to working? Well, let's look at the loop more carefully:
while not it.finished: print '>', it.operands[0], it[0] indices = np.arange(it[0],(it[0]+4), dtype=int) info = array.take(indices) info = info + 81 it.operands[1][...]=info print '<', it.operands[1], it[1]
You'll see that the first five loops go through [0 4 8 12 16]
five times, generating [[81 82 83 84]]
, then [[85 86 87 88]]
, etc. And then the next five loops do the same thing, and again and again.
This is also why your c_index
solutions didn't work—because it.index
is going to range from 0 to 19, and you don't have 20 of anything in it.operands[1]
.
If you did the multi_index right and ignored the columns, you could make this work… but still, you'd be doing a 5x4 iteration, just to repeat each step 4 times, instead of doing the 5x1 iteration you want.
Your it.operands[1][...]=info
replaces the entire output with a 5x1 row each time through the loop. Generally, you shouldn't ever have to do anything to it.operands[1]
—the whole point of nditer
is that you just take care of each it[1]
, and the final it.operands[1]
is the result.
Of course a 5x4 iteration over rows makes no sense. Either do a 5x4 iteration over individual values, or a 5x1 iteration over rows.
If you want the former, the easiest way to do it is to reshape the input array, then just iterate that:
it = np.nditer([array.reshape(5, -1), None], op_flags=[['readonly'], ['readwrite','allocate']])for a, b in it: b[...] = a + 81return it.operands[1]
But of course that's silly—it's just a slower and more complicated way of writing:
return array+81
And it would be a bit silly to suggest that "the way to write your own reshape
is to first call reshape
, and then…"
So, you want to iterate over rows, right?
Let's simplify things a bit by getting rid of the allocate
and explicitly creating a 5x4 array to start with:
outarray = np.zeros((5,4), dtype=array.dtype)offset = np.array([0, 4, 8, 12, 16])it = np.nditer([offset, outarray], flags=['reduce_ok'], op_flags=[['readonly'], ['readwrite']], op_axes=[None, [0]], itershape=[5])while not it.finished: indices = np.arange(it[0],(it[0]+4), dtype=int) info = array.take(indices) '''Just for fun, we'll perform an operation on data.\ Let's shift it to 100''' info = info + 81 it.operands[1][it.index][...]=info it.iternext()return it.operands[1]
This is a bit of an abuse of nditer
, but at least it does the right thing.
Since you're just doing a 1D iteration over the source and basically ignoring the second, there's really no good reason to use nditer
here. If you need to do lockstep iteration over multiple arrays, for a, b in nditer([x, y], …)
is cleaner than iterating over x
and using the index to access y
—just like for a, b in zip(x, y)
outside of numpy
. And if you need to iterate over multi-dimensional arrays, nditer
is usually cleaner than the alternatives. But here, all you're really doing is iterating over [0, 4, 8, 16, 20]
, doing something with the result, and copying it into another array
.
Also, as I mentioned in the comments, if you find yourself using iteration in numpy
, you're usually doing something wrong. All of the speed benefits of numpy
come from letting it execute the tight loops in native C/Fortran or lower-level vector operations. Once you're looping over array
s, you're effectively just doing slow Python numerics with a slightly nicer syntax:
import numpy as npimport timeitdef add10_numpy(array): return array + 10def add10_nditer(array): it = np.nditer([array, None], [], [['readonly'], ['writeonly', 'allocate']]) for a, b in it: np.add(a, 10, b) return it.operands[1]def add10_py(array): x, y = array.shape outarray = array.copy() for i in xrange(x): for j in xrange(y): outarray[i, j] = array[i, j] + 10 return out arraymyArray = np.arange(100000).reshape(250,-1)for f in add10_numpy, add10_nditer, add10_py: print '%12s: %s' % (f.__name__, timeit.timeit(lambda: f(myArray), number=1))
On my system, this prints:
add10_numpy: 0.000458002090454add10_nditer: 0.292730093002 add10_py: 0.127345085144
That shows you the cost of using nditer
unnecessarily.