Unexpected output from list(generator)


This behaviour has been fixed in Python 3. In Python 2, a list comprehension such as [i(0) + i(1) for a in alist] defines a in its surrounding scope, where i can reach it. In a new session, list(i(0) + i(1) for a in alist) raises a NameError:

>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> list(i(0) + i(1) for a in alist)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
  File "<stdin>", line 1, in <lambda>
NameError: global name 'a' is not defined

A list comprehension is not a generator; see "Generator expressions and list comprehensions":

Generator expressions are surrounded by parentheses (“()”) and list comprehensions are surrounded by square brackets (“[]”).

In your example, the generator expression passed to list() runs in its own scope. The loop variable a lives only inside that scope, while the lambda i looks a up in the global scope, where it does not exist. Try this in a new session:

>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> [i(0) + i(1) for a in alist]
[3, 7]
>>> a
(3, 4)

Compare it to this in another session:

>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> l = (i(0) + i(1) for a in alist)
>>> l
<generator object <genexpr> at 0x10e60db90>
>>> a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> [x for x in l]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
  File "<stdin>", line 1, in <lambda>
NameError: global name 'a' is not defined

When you run list(i(0) + i(1) for a in alist), you pass the generator (i(0) + i(1) for a in alist) to list, which consumes it to build the list before returning. The loop variable a exists only inside the generator; the lambda function cannot see it, so a has no meaning there.

The name a is local to the generator object <generator object <genexpr> at 0x10e60db90> and is never bound in the global scope. So when list iterates over the generator, the lambda function raises a NameError for the undefined a.
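The scoping described above can be sketched in Python 3 by writing the generator expression out as an ordinary generator function. This is only an approximation of what the compiler does, and the names pick, row, pairs and genexpr_sketch are hypothetical stand-ins for i, a, alist and the hidden generator function:

```python
# Hypothetical stand-ins: `pick` plays the role of `i`, `row` the role of `a`.
pick = lambda index: row[index]  # `row` is resolved as a global when pick runs


def genexpr_sketch(iterable):
    # Approximately what (pick(0) + pick(1) for row in iterable) turns into.
    for row in iterable:  # `row` is a local of this function only
        yield pick(0) + pick(1)


pairs = [(1, 2), (3, 4)]

try:
    outcome = list(genexpr_sketch(pairs))
except NameError as exc:
    # pick() cannot see the generator's local `row`, and no global `row` exists.
    outcome = "NameError: {}".format(exc)

print(outcome)
```

The lambda never receives row as an argument, so the only place it can find that name is the module's global namespace.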

The behaviour of list comprehensions in contrast with generators is also mentioned here:

List comprehensions also "leak" their loop variable into the surrounding scope. This will also change in Python 3.0, so that the semantic definition of a list comprehension in Python 3.0 will be equivalent to list(<generator expression>). Python 2.4 and beyond should issue a deprecation warning if a list comprehension's loop variable has the same name as a variable used in the immediately surrounding scope.

In Python 3:

>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]
>>> [i(0) + i(1) for a in alist]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 1, in <lambda>
NameError: name 'a' is not defined
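As a script-style check of the Python 3 behaviour, the sketch below runs both forms and records what happens. The names get, item and pairs are hypothetical stand-ins for i, a and alist, and the script assumes no global called item exists beforehand:

```python
# Hypothetical stand-ins: `get` plays the role of `i`, `item` the role of `a`.
get = lambda index: item[index]  # `item` is looked up as a global at call time
pairs = [(1, 2), (3, 4)]

# List comprehension: in Python 3 its loop variable is isolated too.
try:
    listcomp_result = [get(0) + get(1) for item in pairs]
except NameError:
    listcomp_result = "NameError"

# Generator expression consumed by list(): same outcome.
try:
    genexpr_result = list(get(0) + get(1) for item in pairs)
except NameError:
    genexpr_result = "NameError"

# Neither form left `item` behind in the module namespace.
item_leaked = "item" in globals()

print(listcomp_result, genexpr_result, item_leaked)
```

Under Python 2 the first try block would succeed and leave the loop variable bound, which is the difference the quoted passage describes.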


The important things to understand here (for Python 2) are:

  1. Generator expressions create function objects internally, but list comprehensions do not.

  2. Both bind the loop variable to each value in turn, and the loop variable is created in the current scope if it does not already exist.

Let's look at the bytecode of the generator expression:

>>> from dis import dis
>>> dis(compile('(i(0) + i(1) for a in alist)', 'string', 'exec'))
  1           0 LOAD_CONST               0 (<code object <genexpr> at ...>)
              3 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (alist)
              9 GET_ITER
             10 CALL_FUNCTION            1
             13 POP_TOP
             14 LOAD_CONST               1 (None)
             17 RETURN_VALUE

It loads the code object and then makes a function out of it. Let's look at the actual code object:

>>> dis(compile('(i(0) + i(1) for a in alist)', 'string', 'exec').co_consts[0])
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                27 (to 33)
              6 STORE_FAST               1 (a)
              9 LOAD_GLOBAL              0 (i)
             12 LOAD_CONST               0 (0)
             15 CALL_FUNCTION            1
             18 LOAD_GLOBAL              0 (i)
             21 LOAD_CONST               1 (1)
             24 CALL_FUNCTION            1
             27 BINARY_ADD
             28 YIELD_VALUE
             29 POP_TOP
             30 JUMP_ABSOLUTE            3
        >>   33 LOAD_CONST               2 (None)
             36 RETURN_VALUE

As you can see, the current value from the iterator is stored in the variable a (STORE_FAST, i.e. a local variable). Because the generator expression is compiled into a function object, a is visible only within the generator expression.
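The same conclusion can be checked without reading the disassembly, by inspecting the code objects directly. The following is a small Python 3 sketch; the variable names source, outer, nested and genexpr_code are made up for illustration, and compile() is used in 'eval' mode so that nothing is executed:

```python
import types

source = "(i(0) + i(1) for a in alist)"
outer = compile(source, "<string>", "eval")

# The generator expression is stored as a nested code object among the
# constants of the enclosing code object.
nested = [const for const in outer.co_consts if isinstance(const, types.CodeType)]
genexpr_code = nested[0]

print(genexpr_code.co_name)      # name of the nested code object
print(genexpr_code.co_varnames)  # its local variables, which include 'a'
print(genexpr_code.co_names)     # names it looks up globally, which include 'i'
print(outer.co_names)            # the enclosing code only refers to 'alist'
```

Here a shows up among the nested code object's locals, while i is listed among the names it resolves globally, which matches the STORE_FAST and LOAD_GLOBAL instructions above.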

But in the case of a list comprehension,

>>> dis(compile('[i(0) + i(1) for a in alist]', 'string', 'exec'))
  1           0 BUILD_LIST               0
              3 LOAD_NAME                0 (alist)
              6 GET_ITER
        >>    7 FOR_ITER                28 (to 38)
             10 STORE_NAME               1 (a)
             13 LOAD_NAME                2 (i)
             16 LOAD_CONST               0 (0)
             19 CALL_FUNCTION            1
             22 LOAD_NAME                2 (i)
             25 LOAD_CONST               1 (1)
             28 CALL_FUNCTION            1
             31 BINARY_ADD
             32 LIST_APPEND              2
             35 JUMP_ABSOLUTE            7
        >>   38 POP_TOP
             39 LOAD_CONST               2 (None)
             42 RETURN_VALUE

There is no explicit function creation, and the variable a is created in the current scope (STORE_NAME). So a leaks into the current scope.


With this understanding, let's approach your problem.

>>> i = lambda x: a[x]
>>> alist = [(1, 2), (3, 4)]

Now, when you create a list with comprehension,

>>> [i(0) + i(1) for a in alist]
[3, 7]
>>> a
(3, 4)

you can see that a is leaked to the current scope and it is still bound to the last value from the iteration.

So, when you iterate the generator expression after the list comprehension, the lambda function uses the leaked a. That is why you are getting [7, 7], since a is still bound to (3, 4).

But if you iterate the generator expression first, a is bound to the values from alist only inside the generator, and it does not leak into the current scope, because the generator expression becomes a function. So when the lambda function tries to access a, it cannot find it anywhere. That is why it fails with the error.
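Python 3 no longer leaks the loop variable, but the [7, 7] result can still be reproduced by imitating the leak with a plain for loop, which does leave its variable bound. This is a sketch of the mechanism rather than the original Python 2 session, and pick, current and pairs are hypothetical names standing in for i, a and alist:

```python
# Hypothetical stand-ins: `pick` plays the role of `i`, `current` the role of `a`.
pick = lambda index: current[index]  # reads the global `current` at call time
pairs = [(1, 2), (3, 4)]

# Imitate the Python 2 list-comprehension leak: after a plain for loop,
# the loop variable stays bound to the last element.
for current in pairs:
    pass

# The generator's own `current` is local to the generator, so pick() keeps
# reading the global one, which is stuck on the last pair.
after_leak = list(pick(0) + pick(1) for current in pairs)

print(current)     # (3, 4)
print(after_leak)  # [7, 7]
```

Every element comes out as 3 + 4 because the lambda only ever sees the leftover global value, never the generator's loop variable.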

Note: The same behaviour cannot be observed in Python 3.x, because the leaking is prevented by creating functions for list comprehensions as well. You might want to read more about this in the History of Python blog's post, From List Comprehensions to Generator Expressions, written by Guido himself.


You should make a a parameter to your lambda function. This works as expected:

In [10]: alist = [(1, 2), (3, 4)]

In [11]: i = lambda a, x: a[x]

In [12]: [i(a, 0) + i(a, 1) for a in alist]
Out[12]: [3, 7]

In [13]: list(i(a, 0) + i(a, 1) for a in alist)
Out[13]: [3, 7]

An alternative way to get the same result would be:

In [14]: [sum(a) for a in alist]
Out[14]: [3, 7]
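The workaround can also be written with a named function, which makes the dependency on the tuple explicit. This is a minimal sketch; pair_sum is a hypothetical helper name, not something from the question:

```python
def pair_sum(pair):
    # The tuple arrives as an argument, so no outer-scope lookup is involved.
    return pair[0] + pair[1]


alist = [(1, 2), (3, 4)]

from_listcomp = [pair_sum(a) for a in alist]
from_genexpr = list(pair_sum(a) for a in alist)

print(from_listcomp)  # [3, 7]
print(from_genexpr)   # [3, 7]
```

Because the value is passed in, both forms give the same result no matter how the loop variable is scoped.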

EDIT: This answer is just a simple workaround and not a real answer to the question. The observed effect is a bit more complex; see my other answer.