Why this list comprehension is faster than equivalent generator expression? python-3.x


I believe the difference here is entirely in the cost of the additions that sum has to perform, one per item that passes the filter (about 333K here). Testing with 64-bit Python.org 3.3.0 on Mac OS X:

    In [698]: %timeit len([None for n in range(1, 1000000) if n%3 == 1])
    10 loops, best of 3: 127 ms per loop

    In [699]: %timeit sum(1 for n in range(1, 1000000) if n%3 == 1)
    10 loops, best of 3: 138 ms per loop

    In [700]: %timeit sum([1 for n in range(1, 1000000) if n%3 == 1])
    10 loops, best of 3: 139 ms per loop

So, it's not that the comprehension is faster than the genexp; they both take about the same time. But calling len on a list is effectively instant, while summing the roughly 333K values adds about another 9% to the total time.

Throwing a few different numbers at it, this seems to hold up unless the list is very tiny (in which case it does seem to get faster), or large enough that memory allocation starts to become a significant factor (which it isn't yet, at 333K).
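For anyone who wants to repeat the measurement outside IPython, here is a minimal sketch using the standard timeit module. The function names and the `limit` parameter are my own, chosen for illustration, and the absolute timings will differ by machine and Python version.

```python
import timeit

LIMIT = 1000000  # same upper bound as the range() calls in the session above


def count_with_len(limit=LIMIT):
    # Build a throwaway list, then ask for its length (an O(1) lookup).
    return len([None for n in range(1, limit) if n % 3 == 1])


def count_with_sum_genexp(limit=LIMIT):
    # No intermediate list; sum() performs one addition per matching item.
    return sum(1 for n in range(1, limit) if n % 3 == 1)


def count_with_sum_list(limit=LIMIT):
    # Builds the list AND pays for one addition per matching item.
    return sum([1 for n in range(1, limit) if n % 3 == 1])


if __name__ == "__main__":
    for func in (count_with_len, count_with_sum_genexp, count_with_sum_list):
        # Best of 3 repeats, 3 calls each, reported per call.
        best = min(timeit.repeat(func, repeat=3, number=3)) / 3
        print("%-24s %.1f ms per call" % (func.__name__, best * 1000))
```

Passing a smaller or larger `limit` to the functions is an easy way to check how the gap behaves for tiny or very large inputs.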


Borrowed from this answer, there are two things to consider:

1. A Python list is indexable and stores its own size, so fetching its length takes O(1) time. This means that the speed of calling len() on a list does not depend on its size. A generator, however, does not support len() at all (it raises TypeError); to count its items you have to consume every item it generates, for example with sum(1 for _ in gen), and thus the time complexity is O(n).

2. See the linked answer above. A list comprehension runs as a tight loop that appends each item directly to the list being built, whereas a generator expression creates a generator object that has to be resumed through next() for every item it produces. This adds a layer of per-item overhead for generators. At a small scale, the performance difference between list comprehensions and generators can be safely ignored, but at a larger scale it is worth considering.
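The first point can be checked directly. This is a small sketch, with `count_items` being a hypothetical helper name of mine rather than anything from the standard library: len() works on the list, fails on the generator, and counting the generator's items consumes it.

```python
def count_items(iterable):
    """Count items by walking the whole iterable: O(n), and it exhausts generators."""
    return sum(1 for _ in iterable)


squares_list = [n * n for n in range(10)]
squares_gen = (n * n for n in range(10))

# The list stores its size, so this does not iterate over anything.
list_length = len(squares_list)

# Generators have no __len__, so len() raises TypeError.
try:
    len(squares_gen)
    gen_supports_len = True
except TypeError:
    gen_supports_len = False

# Counting means pulling every item out of the generator...
gen_count = count_items(squares_gen)

# ...after which nothing is left to iterate over.
leftover = list(squares_gen)

print(list_length, gen_supports_len, gen_count, leftover)
```

If the items are needed again after counting, building a list once and calling len() on it is usually the simpler choice.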