Are list comprehensions syntactic sugar for `list(generator expression)` in Python 3?


No, the two work differently. The list comprehension uses the dedicated LIST_APPEND bytecode, which calls PyList_Append directly. This avoids both the attribute lookup of list.append and a function call at the Python level.

    >>> import dis
    >>> def func_lc():
    ...     [x**2 for x in y]
    ...
    >>> dis.dis(func_lc)
      2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>)
                  3 LOAD_CONST               2 ('func_lc.<locals>.<listcomp>')
                  6 MAKE_FUNCTION            0
                  9 LOAD_GLOBAL              0 (y)
                 12 GET_ITER
                 13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 16 POP_TOP
                 17 LOAD_CONST               0 (None)
                 20 RETURN_VALUE
    >>> lc_object = list(dis.get_instructions(func_lc))[0].argval
    >>> lc_object
    <code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>
    >>> dis.dis(lc_object)
      2           0 BUILD_LIST               0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                16 (to 25)
                  9 STORE_FAST               1 (x)
                 12 LOAD_FAST                1 (x)
                 15 LOAD_CONST               0 (2)
                 18 BINARY_POWER
                 19 LIST_APPEND              2
                 22 JUMP_ABSOLUTE            6
            >>   25 RETURN_VALUE

The list() version, on the other hand, simply passes the generator object to list's __init__ method, which calls its extend method internally. Because the object is not a list or tuple, CPython first gets its iterator and then appends items to the list until the iterator is exhausted:

    >>> def func_ge():
    ...     list(x**2 for x in y)
    ...
    >>> dis.dis(func_ge)
      2           0 LOAD_GLOBAL              0 (list)
                  3 LOAD_CONST               1 (<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>)
                  6 LOAD_CONST               2 ('func_ge.<locals>.<genexpr>')
                  9 MAKE_FUNCTION            0
                 12 LOAD_GLOBAL              1 (y)
                 15 GET_ITER
                 16 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 22 POP_TOP
                 23 LOAD_CONST               0 (None)
                 26 RETURN_VALUE
    >>> ge_object = list(dis.get_instructions(func_ge))[1].argval
    >>> ge_object
    <code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>
    >>> dis.dis(ge_object)
      2           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                15 (to 21)
                  6 STORE_FAST               1 (x)
                  9 LOAD_FAST                1 (x)
                 12 LOAD_CONST               0 (2)
                 15 BINARY_POWER
                 16 YIELD_VALUE
                 17 POP_TOP
                 18 JUMP_ABSOLUTE            3
            >>   21 LOAD_CONST               1 (None)
                 24 RETURN_VALUE
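The exact offsets and opcode layout in these listings depend on the CPython version, so here is a minimal sketch that checks only for the two distinguishing opcodes instead of comparing full disassemblies. The helper name `opnames` is hypothetical. It walks nested code objects recursively because CPython 3.12 and later inline list comprehensions (PEP 709) rather than compiling them to a separate `<listcomp>` code object.

```python
import dis
import types


def opnames(code):
    """Collect opcode names from a code object and every code object nested in it."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


def func_lc(y):
    return [x**2 for x in y]


def func_ge(y):
    return list(x**2 for x in y)


lc_ops = opnames(func_lc.__code__)
ge_ops = opnames(func_ge.__code__)

# The comprehension appends with LIST_APPEND; the generator expression yields.
print("LIST_APPEND in list comprehension:", "LIST_APPEND" in lc_ops)
print("YIELD_VALUE in list(genexpr):     ", "YIELD_VALUE" in ge_ops)
```

Both functions still return the same list for the same input; only the way the list gets built differs.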

Timing comparisons:

    >>> %timeit [x**2 for x in range(10**6)]
    1 loops, best of 3: 453 ms per loop
    >>> %timeit list(x**2 for x in range(10**6))
    1 loops, best of 3: 478 ms per loop
    >>> %%timeit
    out = []
    for x in range(10**6):
        out.append(x**2)
    ...
    1 loops, best of 3: 510 ms per loop

The plain loop is slightly slower because out.append is looked up as an attribute on every iteration. Caching the bound method and timing again:

    >>> %%timeit
    out = []; append = out.append
    for x in range(10**6):
        append(x**2)
    ...
    1 loops, best of 3: 467 ms per loop
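The %timeit magics above only work in IPython. As a rough sketch, the same four variants can be compared with the standard timeit module. The function names and the smaller N are my own choices to keep the run short, and the absolute numbers will vary by machine and Python version, so none are shown here.

```python
import timeit

N = 10_000  # assumption: much smaller than the 10**6 used above


def listcomp():
    return [x**2 for x in range(N)]


def list_of_genexpr():
    return list(x**2 for x in range(N))


def loop_append():
    out = []
    for x in range(N):
        out.append(x**2)  # attribute lookup on every iteration
    return out


def loop_cached_append():
    out = []
    append = out.append  # look the bound method up once
    for x in range(N):
        append(x**2)
    return out


timings = {}
for fn in (listcomp, list_of_genexpr, loop_append, loop_cached_append):
    # Best of 3 repeats of 10 calls each, reported per call.
    timings[fn.__name__] = min(timeit.repeat(fn, number=10, repeat=3)) / 10

for name, seconds in timings.items():
    print(f"{name:20s} {seconds * 1e3:.3f} ms per call")
```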

Apart from the fact that list comprehensions no longer leak their loop variable, another difference in Python 3 is that an unparenthesized tuple is no longer valid as the iterable:

    >>> [x**2 for x in 1, 2, 3] # Python 2
    [1, 4, 9]
    >>> [x**2 for x in 1, 2, 3] # Python 3
      File "<ipython-input-69-bea9540dd1d6>", line 1
        [x**2 for x in 1, 2, 3]
                        ^
    SyntaxError: invalid syntax
    >>> [x**2 for x in (1, 2, 3)] # Add parentheses
    [1, 4, 9]
    >>> for x in 1, 2, 3: # Python 3: For normal loops it still works
    ...     print(x**2)
    ...
    1
    4
    9
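Both Python 3 points can be checked in a script rather than at the prompt. This is a small sketch with illustrative variable names of my own. It uses compile() so that the invalid form raises a catchable SyntaxError at run time instead of stopping the whole file from parsing.

```python
x = "outer"
squares = [x**2 for x in (1, 2, 3)]
after_comprehension = x  # the comprehension's x did not overwrite ours

for x in 1, 2, 3:  # a bare tuple is still fine in a for statement
    pass
after_loop = x  # a plain loop does rebind the name

try:
    compile("[x**2 for x in 1, 2, 3]", "<example>", "eval")
except SyntaxError:
    unparenthesized_ok = False
else:
    unparenthesized_ok = True

print(after_comprehension, after_loop, unparenthesized_ok)
```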


Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.
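To make that distinction concrete, here is a hand-written approximation of the two hidden functions. The names `listcomp_by_hand` and `genexpr_by_hand` are hypothetical stand-ins, not what the compiler emits. In particular, the real comprehension appends with the LIST_APPEND opcode rather than calling result.append.

```python
import types


def listcomp_by_hand(iterable):
    # Stand-in for the comprehension: builds and returns the list itself.
    result = []
    for x in iterable:
        result.append(x)  # the real code uses LIST_APPEND here
    return result


def genexpr_by_hand(iterable):
    # Stand-in for the generator expression: only yields items;
    # the list is built later by whoever consumes the generator.
    for x in iterable:
        yield x


data = [3, 1, 2]
from_listcomp = [x for x in data]
from_genexpr = list(x for x in data)

gen = genexpr_by_hand(data)
is_generator = isinstance(gen, types.GeneratorType)

print(listcomp_by_hand(data) == from_listcomp)
print(list(gen) == from_genexpr)
```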

The following code prints the disassembly of the anonymous functions for an example comprehension and the corresponding generator expression passed to list:

    import dis

    def f():
        [x for x in []]

    def g():
        list(x for x in [])

    dis.dis(f.__code__.co_consts[1])
    dis.dis(g.__code__.co_consts[1])

The output for the comprehension is

      4           0 BUILD_LIST               0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                12 (to 21)
                  9 STORE_FAST               1 (x)
                 12 LOAD_FAST                1 (x)
                 15 LIST_APPEND              2
                 18 JUMP_ABSOLUTE            6
            >>   21 RETURN_VALUE

The output for the genexp is

      7           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                11 (to 17)
                  6 STORE_FAST               1 (x)
                  9 LOAD_FAST                1 (x)
                 12 YIELD_VALUE
                 13 POP_TOP
                 14 JUMP_ABSOLUTE            3
            >>   17 LOAD_CONST               0 (None)
                 20 RETURN_VALUE


You can actually show that the two can have different outcomes to prove they are inherently different:

    >>> list(next(iter([])) if x > 3 else x for x in range(10))
    [0, 1, 2, 3]
    >>> [next(iter([])) if x > 3 else x for x in range(10)]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <listcomp>
    StopIteration

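The comprehension does not catch the StopIteration raised by its own expression, so the exception propagates to the caller. The list constructor, by contrast, treats a StopIteration coming out of the generator as the normal end of iteration and returns the items collected so far. (Since Python 3.7, PEP 479 converts a StopIteration raised inside a generator into a RuntimeError, so the list(...) form raises there instead of truncating; the two forms still behave differently.)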
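What the list(...) form does here depends on the interpreter version. From Python 3.7 on, PEP 479 turns a StopIteration raised inside a generator into a RuntimeError. The sketch below therefore records what each form does instead of assuming a result. The helper `outcome` is a hypothetical name introduced only for this illustration.

```python
def outcome(thunk):
    """Run thunk and report either its result or the name of the exception raised."""
    try:
        return ("ok", thunk())
    except Exception as exc:
        return (type(exc).__name__, None)


lc_outcome = outcome(
    lambda: [next(iter([])) if x > 3 else x for x in range(10)]
)
ge_outcome = outcome(
    lambda: list(next(iter([])) if x > 3 else x for x in range(10))
)

# The comprehension lets StopIteration escape. The generator form either
# truncates silently (before 3.7) or raises RuntimeError (3.7 and later).
print("list comprehension:", lc_outcome)
print("list(genexpr):     ", ge_outcome)
```

Either way the two outcomes differ, which is the point of the original demonstration.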