Why is itertools.chain faster than a flattening list comprehension?

Here is the C source for itertools.chain.from_iterable. It's not hard to read even if you don't know C, and you can tell that everything happens at the C level (before being used to build a list in your code).

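For reference, the itertools docs give a roughly equivalent pure-Python version of chain.from_iterable. Here is a minimal sketch of it, with a made-up input just to show it matches the real thing:

from itertools import chain

def from_iterable_py(iterables):
    # Pure-Python approximation of chain.from_iterable (per the itertools docs);
    # the real implementation runs this same nested loop entirely in C.
    for it in iterables:
        for element in it:
            yield element

lsts = [[1, 2], [3, 4], [5]]  # hypothetical input
assert list(from_iterable_py(lsts)) == list(chain.from_iterable(lsts)) == [1, 2, 3, 4, 5]
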
The bytecode for the list comprehension looks like this:

import dis

def f(lsts):
    return [x for y in lsts for x in y]

dis.dis(f.__code__.co_consts[1])

  2           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                18 (to 24)
              6 STORE_FAST               1 (y)
              8 LOAD_FAST                1 (y)
             10 GET_ITER
        >>   12 FOR_ITER                 8 (to 22)
             14 STORE_FAST               2 (x)
             16 LOAD_FAST                2 (x)
             18 LIST_APPEND              3
             20 JUMP_ABSOLUTE           12
        >>   22 JUMP_ABSOLUTE            4
        >>   24 RETURN_VALUE

These are all the Python interpreter operations involved in building the list with a comprehension. Having all of the work happen at the C level (in chain) rather than having the interpreter step through each bytecode instruction (in the comprehension) is what gives you the performance boost.

Still, that boost is so small I wouldn't worry about it. This is Python: readability over speed.
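
If you do want to measure it, here is a rough timeit sketch; the input below is made up, and the exact numbers will depend on your data and Python version:

import timeit
from itertools import chain

lsts = [[i, i + 1, i + 2] for i in range(1000)]  # hypothetical input

print(timeit.timeit(lambda: list(chain.from_iterable(lsts)), number=10000))
print(timeit.timeit(lambda: [x for y in lsts for x in y], number=10000))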


Edit:

For a generator expression wrapped in list():

def g(lsts):
    return list(x for y in lsts for x in y)

# the generator expression
dis.dis(g.__code__.co_consts[1])

  2           0 LOAD_FAST                0 (.0)
        >>    2 FOR_ITER                20 (to 24)
              4 STORE_FAST               1 (y)
              6 LOAD_FAST                1 (y)
              8 GET_ITER
        >>   10 FOR_ITER                10 (to 22)
             12 STORE_FAST               2 (x)
             14 LOAD_FAST                2 (x)
             16 YIELD_VALUE
             18 POP_TOP
             20 JUMP_ABSOLUTE           10
        >>   22 JUMP_ABSOLUTE            2
        >>   24 LOAD_CONST               0 (None)
             26 RETURN_VALUE

So the interpreter has a similar number of steps to go through when running the generator expression that list unpacks but, as you would expect, the Python-level overhead of having list consume a generator (as opposed to the C-level LIST_APPEND instruction) is what slows it down.

Comparing the enclosing functions themselves, you can see the extra work g does to load and call list:

dis.dis(f)

  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x000000000FB58B70, file "<ipython-input-33-1d46ced34d66>", line 2>)
              2 LOAD_CONST               2 ('f.<locals>.<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_FAST                0 (lsts)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

dis.dis(g)

  2           0 LOAD_GLOBAL              0 (list)
              2 LOAD_CONST               1 (<code object <genexpr> at 0x000000000FF6F420, file "<ipython-input-40-0334a7cdeb8f>", line 2>)
              4 LOAD_CONST               2 ('g.<locals>.<genexpr>')
              6 MAKE_FUNCTION            0
              8 LOAD_FAST                0 (lsts)
             10 GET_ITER
             12 CALL_FUNCTION            1
             14 CALL_FUNCTION            1
             16 RETURN_VALUE
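
As a rough check of that overhead, you can time the two forms side by side; again the input is made up, so treat this as a sketch rather than a benchmark:

import timeit

lsts = [[i, i + 1, i + 2] for i in range(1000)]  # hypothetical input

print(timeit.timeit(lambda: [x for y in lsts for x in y], number=10000))      # LIST_APPEND in the loop
print(timeit.timeit(lambda: list(x for y in lsts for x in y), number=10000))  # list() resuming a generator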