Numpy: Beginner nditer

It really helps to break things down by printing out what's going on along the way.

First, let's replace your whole loop with this:

i = 0while not it.finished:    i += 1print i

It'll print 20, not 5. That's because you're doing a 5x4 iteration, not 5x1.

So, why is this even close to working? Well, let's look at the loop more carefully:

while not it.finished:    print '>', it.operands[0], it[0]    indices = np.arange(it[0],(it[0]+4), dtype=int)    info = array.take(indices)    info = info + 81    it.operands[1][...]=info    print '<', it.operands[1], it[1]

You'll see that the first five loops go through [0 4 8 12 16] five times, generating [[81 82 83 84]], then [[85 86 87 88]], etc. And then the next five loops do the same thing, and again and again.

This is also why your c_index solutions didn't work—because it.index is going to range from 0 to 19, and you don't have 20 of anything in it.operands[1].

If you did the multi_index right and ignored the columns, you could make this work… but still, you'd be doing a 5x4 iteration, just to repeat each step 4 times, instead of doing the 5x1 iteration you want.

Your it.operands[1][...]=info replaces the entire output with a 5x1 row each time through the loop. Generally, you shouldn't ever have to do anything to it.operands[1]—the whole point of nditer is that you just take care of each it[1], and the final it.operands[1] is the result.

Of course a 5x4 iteration over rows makes no sense. Either do a 5x4 iteration over individual values, or a 5x1 iteration over rows.

If you want the former, the easiest way to do it is to reshape the input array, then just iterate that:

it = np.nditer([array.reshape(5, -1), None],               op_flags=[['readonly'],                         ['readwrite','allocate']])for a, b in it:    b[...] = a + 81return it.operands[1]

But of course that's silly—it's just a slower and more complicated way of writing:

return array+81

And it would be a bit silly to suggest that "the way to write your own reshape is to first call reshape, and then…"

So, you want to iterate over rows, right?

Let's simplify things a bit by getting rid of the allocate and explicitly creating a 5x4 array to start with:

outarray = np.zeros((5,4), dtype=array.dtype)offset = np.array([0, 4, 8, 12, 16])it = np.nditer([offset, outarray],               flags=['reduce_ok'],               op_flags=[['readonly'],                         ['readwrite']],               op_axes=[None, [0]],               itershape=[5])while not it.finished:    indices = np.arange(it[0],(it[0]+4), dtype=int)    info = array.take(indices)    '''Just for fun, we'll perform an operation on data.\       Let's shift it to 100'''    info = info + 81    it.operands[1][it.index][...]=info    it.iternext()return it.operands[1]

This is a bit of an abuse of nditer, but at least it does the right thing.

Since you're just doing a 1D iteration over the source and basically ignoring the second, there's really no good reason to use nditer here. If you need to do lockstep iteration over multiple arrays, for a, b in nditer([x, y], …) is cleaner than iterating over x and using the index to access y—just like for a, b in zip(x, y) outside of numpy. And if you need to iterate over multi-dimensional arrays, nditer is usually cleaner than the alternatives. But here, all you're really doing is iterating over [0, 4, 8, 16, 20], doing something with the result, and copying it into another array.

Also, as I mentioned in the comments, if you find yourself using iteration in numpy, you're usually doing something wrong. All of the speed benefits of numpy come from letting it execute the tight loops in native C/Fortran or lower-level vector operations. Once you're looping over arrays, you're effectively just doing slow Python numerics with a slightly nicer syntax:

import numpy as npimport timeitdef add10_numpy(array):    return array + 10def add10_nditer(array):    it = np.nditer([array, None], [],                   [['readonly'], ['writeonly', 'allocate']])    for a, b in it:        np.add(a, 10, b)    return it.operands[1]def add10_py(array):    x, y = array.shape    outarray = array.copy()    for i in xrange(x):        for j in xrange(y):            outarray[i, j] = array[i, j] + 10    return out arraymyArray = np.arange(100000).reshape(250,-1)for f in add10_numpy, add10_nditer, add10_py:    print '%12s: %s' % (f.__name__, timeit.timeit(lambda: f(myArray), number=1))

On my system, this prints:

 add10_numpy: 0.000458002090454add10_nditer: 0.292730093002    add10_py: 0.127345085144

That shows you the cost of using nditer unnecessarily.

CodeHunter

Numpy: Beginner nditer

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last