Why can I use the same name for iterator and sequence in a Python for loop?


What does dis tell us:

    Python 3.4.1 (default, May 19 2014, 13:10:29)
    [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from dis import dis
    >>> dis("""x = [1,2,3,4,5]
    ... for x in x:
    ...     print(x)
    ... print(x)""")
      1           0 LOAD_CONST               0 (1)
                  3 LOAD_CONST               1 (2)
                  6 LOAD_CONST               2 (3)
                  9 LOAD_CONST               3 (4)
                 12 LOAD_CONST               4 (5)
                 15 BUILD_LIST               5
                 18 STORE_NAME               0 (x)

      2          21 SETUP_LOOP              24 (to 48)
                 24 LOAD_NAME                0 (x)
                 27 GET_ITER
            >>   28 FOR_ITER                16 (to 47)
                 31 STORE_NAME               0 (x)

      3          34 LOAD_NAME                1 (print)
                 37 LOAD_NAME                0 (x)
                 40 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 43 POP_TOP
                 44 JUMP_ABSOLUTE           28
            >>   47 POP_BLOCK

      4     >>   48 LOAD_NAME                1 (print)
                 51 LOAD_NAME                0 (x)
                 54 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 57 POP_TOP
                 58 LOAD_CONST               5 (None)
                 61 RETURN_VALUE

The key bits are the blocks for source lines 2 and 3: we load the value out of x (24 LOAD_NAME 0 (x)), get its iterator (27 GET_ITER), and start iterating over it (28 FOR_ITER). From then on, each pass jumps back to FOR_ITER at offset 28; Python never goes back to load x to obtain the iterator again.

Aside: It wouldn't make any sense to do so, since it already has the iterator, and, as Abhijit points out in his answer, Section 7.3 of Python's specification actually requires this behavior.

When the name x gets overwritten to point at each value inside of the list formerly known as x, Python doesn't have any problems finding the iterator, because it never needs to look at the name x again to finish the iteration protocol.
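The same point can be sketched without bytecode. This is a minimal illustration, assuming Python 3; the names `data`, `it`, and `remaining` are made up for the example and do not come from the question. Once `iter()` has returned, the iterator holds its own reference to the list, so rebinding the name does not disturb it.

```python
data = [1, 2, 3, 4, 5]
it = iter(data)        # the iterator keeps its own reference to the list

data = None            # rebind the name; the list object itself is untouched

remaining = list(it)   # the iterator still walks the original list
print(remaining)       # [1, 2, 3, 4, 5]
```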


Using your example code as the core reference

    x = [1,2,3,4,5]
    for x in x:
        print(x)
    print(x)

I would like to refer you to section 7.3, "The for statement", in the manual.

Excerpt 1

The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list.

This means that your variable x, a symbolic name for the list object [1,2,3,4,5], is evaluated once to produce an iterable object. Even if the variable (the symbolic reference) later changes its allegiance, the expression list is not evaluated again, so the iterator that has already been created is unaffected.

Note

  • Everything in Python is an object with an identity, attributes, and methods.
  • A variable is a symbolic name: a reference to one and only one object at any given instant.
  • A variable can change its allegiance at run time, i.e. it can be rebound to refer to some other object.
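The notes above can be sketched in a few lines. This is only an illustration, assuming Python 3; `alias`, `same_before`, and `same_after` are hypothetical names introduced here to keep the original list reachable after x is rebound.

```python
x = [1, 2, 3, 4, 5]
alias = x                  # a second name for the same list object
same_before = alias is x   # True: both names refer to one object

x = x[0]                   # rebind x to the int 1; the list is not modified
same_after = alias is x    # False: x now refers to a different object

print(same_before, same_after)  # True False
print(alias, x)                 # [1, 2, 3, 4, 5] 1
```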

Excerpt 2

The suite is then executed once for each item provided by the iterator, in the order of ascending indices.

Here the suite is driven by the iterator, not by the expression list. On each iteration the iterator is asked for its next item; the original expression list is not consulted again.


It is necessary for it to work this way, if you think about it. The expression for the sequence of a for loop could be anything:

    binaryfile = open("file", "rb")
    for byte in binaryfile.read(5):
        ...

We can't query the sequence on each pass through the loop, or here we'd end up reading from the next batch of 5 bytes the second time. Naturally Python must in some way store the result of the expression privately before the loop begins.
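One way to observe this is to count how often the loop expression runs. The sketch below assumes Python 3 and swaps the file for an in-memory `io.BytesIO` stream so it is self-contained; `read_chunk`, `calls`, `seen`, and `rest` are hypothetical names used only for this illustration.

```python
import io

calls = []

def read_chunk(stream, size):
    """Hypothetical helper: record each call, then read from the stream."""
    calls.append(size)
    return stream.read(size)

stream = io.BytesIO(b"abcdefghij")
seen = []
for byte in read_chunk(stream, 5):   # the expression is evaluated once
    seen.append(byte)                # iterating over bytes yields ints

rest = stream.read()                 # what the loop left unread

print(len(calls))    # 1
print(bytes(seen))   # b'abcde'
print(rest)          # b'fghij'
```

If the expression were re-evaluated on every pass, `calls` would grow with each iteration and the loop would consume the second batch of bytes as well.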


Are they in different scopes?

No. To confirm this you could keep a reference to the original scope dictionary (locals()) and notice that you are in fact using the same variables inside the loop:

    x = [1,2,3,4,5]
    loc = locals()
    for x in x:
        print(locals() is loc)  # True
        print(loc["x"])  # 1
        break
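The same holds inside a function. As a rough sketch, assuming CPython 3 (the `__code__.co_varnames` attribute is a CPython implementation detail, and `loop_with_shared_name` is a hypothetical name), the function below has a single local variable, and it is still bound to the last item after the loop ends:

```python
def loop_with_shared_name():
    x = [1, 2, 3, 4, 5]
    for x in x:
        pass
    # The loop created no new scope, so x is still bound to the last item.
    return x

result = loop_with_shared_name()
local_names = loop_with_shared_name.__code__.co_varnames

print(result)       # 5
print(local_names)  # ('x',)
```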

What's going on under the hood that allows something like this to work?

Sean Vieira showed exactly what is going on under the hood, but to describe it in more readable Python code, your for loop is essentially equivalent to this while loop:

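    it = iter(x)  # the only time the original x is consulted
    while True:
        try:
            x = next(it)  # rebind x to the next item
        except StopIteration:
            break  # the iterator is exhausted
        print(x)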

This is different from the traditional indexing approach to iteration you would see in older versions of Java, for example:

    for (int index = 0; index < x.length; index++) {
        x = x[index];
        ...
    }

This approach would fail when the item variable and the sequence variable are the same, because the sequence x would no longer be available to look up the next index after the first time x was reassigned to the first item.

With the former approach, however, the first line (it = iter(x)) requests an iterator object which is what is actually responsible for providing the next item from then on. The sequence that x originally pointed to no longer needs to be accessed directly.
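To see that the sequence is asked for an iterator exactly once, here is a small sketch, assuming Python 3. `LoggingList` is a hypothetical wrapper written for this illustration, not a standard class; it counts calls to `__iter__`.

```python
class LoggingList:
    """Hypothetical wrapper that counts how often an iterator is requested."""

    def __init__(self, items):
        self.items = items
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.items)

source = LoggingList([1, 2, 3])
x = source             # x names the sequence ...
visited = []
for x in x:            # ... and is then rebound to each item in turn
    visited.append(x)

print(visited)             # [1, 2, 3]
print(source.iter_calls)   # 1
print(x)                   # 3
```

The extra name `source` exists only so the counter can be inspected after x has been rebound; the loop itself never needs it.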