Why does the UnboundLocalError occur on the second variable of the flat comprehension?



This behaviour is (implicitly) described in the reference documentation (emphasis mine):

However, aside from the iterable expression in the leftmost for clause, the comprehension is executed in a separate implicitly nested scope. This ensures that names assigned to in the target list don’t “leak” into the enclosing scope.

The iterable expression in the leftmost for clause is evaluated directly in the enclosing scope and then passed as an argument to the implictly [sic] nested scope. Subsequent for clauses and any filter condition in the leftmost for clause cannot be evaluated in the enclosing scope as they may depend on the values obtained from the leftmost iterable. For example: [x*y for x in range(10) for y in range(x, x+10)].

This means that:

    list_ = [(x, y) for x in range(x) for y in range(y)]

is equivalent to:

    def f(iter_):
        for x in iter_:
            for y in range(y):
                yield x, y

    list_ = list(f(iter(range(x))))

Because the name x in the leftmost iterable is read in the enclosing scope rather than the nested scope, there is no conflict between the two uses of x. The same is not true of y: range(y) is evaluated inside the nested scope, where y is a local variable (the loop target) that has not yet been assigned. That is why the UnboundLocalError occurs there.
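The scoping rule can be checked with a small sketch. The function names explicit_equivalent and with_distinct_names are hypothetical, chosen only for illustration. The first spells out the nested scope by hand, the second shows that distinct loop variable names avoid the clash.

```python
def explicit_equivalent(x, y):
    # Hypothetical helper: a hand-written version of the nested scope.
    def f(iter_):
        for x in iter_:
            # y is a local of f (it is a loop target) and is not bound yet,
            # so reading it here fails.
            for y in range(y):
                yield x, y

    # range(x) is evaluated here, in the enclosing scope, so it sees the
    # caller's x.
    return list(f(iter(range(x))))


def with_distinct_names(x, y):
    # Using different names for the loop targets removes the conflict.
    return [(i, j) for i in range(x) for j in range(y)]


try:
    explicit_equivalent(2, 3)
    error = None
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    error = exc

print(type(error).__name__)       # UnboundLocalError
print(with_distinct_names(2, 3))  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

Renaming the loop targets is only one way around the problem. Any approach that stops the inner clause from reading a name it also assigns should work.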

As to why this happens: a list comprehension is more or less syntactic sugar for list(<generator expression>), so it uses the same code path as a generator expression (or at least behaves the same way). Generator expressions evaluate the iterable expression in the leftmost for clause immediately, which makes error handling somewhat saner. Consider the following code:

    y = None                             # line 1
    gen = (x + 1 for x in range(y + 1))  # line 2
    item = next(gen)                     # line 3

y is clearly the wrong type, so the addition raises a TypeError. Because range(y + 1) is evaluated immediately, the error is raised on line 2 rather than line 3, which makes it easier to diagnose where and why the problem occurred. Had it occurred on line 3, you might mistakenly assume that the expression x + 1 caused the error.
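The difference between the leftmost clause and later clauses can be observed directly. The following is a minimal sketch. The names build_generator, raised_at_creation and raised_on_next are made up for this illustration.

```python
def build_generator(y):
    # The leftmost iterable, range(y + 1), is evaluated as soon as the
    # generator expression is created.
    return (x + 1 for x in range(y + 1))


try:
    build_generator(None)  # None + 1 fails here, before any next() call
    raised_at_creation = False
except TypeError:
    raised_at_creation = True

# An invalid iterable in a later for clause is not evaluated on creation.
gen = (a + b for a in range(2) for b in range(None))  # no error yet

try:
    next(gen)  # range(None) is only evaluated now
    raised_on_next = False
except TypeError:
    raised_on_next = True

print(raised_at_creation, raised_on_next)  # True True
```

Only the leftmost iterable gets the early check. Errors in subsequent clauses still surface when the generator is first advanced.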

There is a bug report here that mentions this behaviour. It was resolved as "not a bug" on the grounds that list comprehensions and generator expressions should behave the same way.