A better way for a Python 'for' loop


Using

    for _ in itertools.repeat(None, count):
        do_something()

is the non-obvious way of getting the best of all worlds: tiny constant space requirement, and no new objects created per iteration. Under the covers, the C code for repeat uses a native C integer type (not a Python integer object!) to keep track of the count remaining.

For that reason, the count needs to fit in the platform C ssize_t type, which is generally at most 2**31 - 1 on a 32-bit box, and here on a 64-bit box:

    >>> itertools.repeat(None, 2**63)
    Traceback (most recent call last):
        ...
    OverflowError: Python int too large to convert to C ssize_t
    >>> itertools.repeat(None, 2**63-1)
    repeat(None, 9223372036854775807)

Which is plenty big for my loops ;-)
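As a minimal sketch of the pattern above (the helper names `count_iterations` and `fits_in_repeat` are hypothetical, used only for illustration), the loop body runs exactly `count` times, and an oversized count is rejected up front rather than partway through the loop:

```python
import itertools


def count_iterations(count):
    """Run a trivial loop body `count` times and report how many passes ran."""
    passes = 0
    for _ in itertools.repeat(None, count):
        passes += 1  # stand-in for "do something"
    return passes


def fits_in_repeat(count):
    """Return True if itertools.repeat accepts `count`, False on overflow."""
    try:
        itertools.repeat(None, count)
    except OverflowError:
        return False
    return True


print(count_iterations(1000))
print(fits_in_repeat(2**31 - 1))
print(fits_in_repeat(2**63))
```

Creating the `repeat` object does no looping by itself, so checking a huge count this way is cheap.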


The first method (in Python 3) creates a range object, which can iterate through the range of values. (It's like a generator object, but you can iterate through it several times.) It doesn't take up much memory because it doesn't contain the entire range of values, just the start, stop, and step. Its iterator keeps a current value and increases it by the step size (default 1) until it hits or passes the stop value.

Compare the size of range(0, 1000) to the size of list(range(0, 1000)). The former is very memory efficient; it only takes 48 bytes regardless of the length of the range, whereas the size of the list grows linearly with the number of elements.
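The comparison can be reproduced with `sys.getsizeof`. This is only a sketch: the exact byte counts depend on the CPython version and platform, so none are hard-coded here.

```python
import sys

small_range = range(0, 1000)
large_range = range(0, 1_000_000)

# A range object keeps its bounds and step, not the values themselves,
# so its reported size does not depend on how many values it covers.
print(sys.getsizeof(small_range), sys.getsizeof(large_range))

# A list holds one pointer slot per element, so its size grows with its length.
small_list = list(small_range)
large_list = list(large_range)
print(sys.getsizeof(small_list), sys.getsizeof(large_list))
```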

The second method, although faster, takes up the memory I was talking about in the previous paragraph. (Also, although 0 takes up 24 bytes and None takes 16, lists of 10000 of each have the same size. Interesting. That's because a list stores pointers to its elements, and the reported size covers only those pointers, not the objects they point to.)
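A quick way to check the pointer explanation, assuming CPython's `sys.getsizeof`, which reports a container's own footprint and not the objects it refers to:

```python
import sys

zeros = [0] * 10000
nones = [None] * 10000

# The individual objects differ in size...
print(sys.getsizeof(0), sys.getsizeof(None))

# ...but each list only stores 10000 pointer slots plus a small header,
# so both lists report the same size.
print(sys.getsizeof(zeros), sys.getsizeof(nones))
```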

Interestingly enough, [0] * 10000 is smaller than list(range(10000)) by about 10000 bytes. The elements themselves are not counted in that size, so the gap most likely comes from how the two lists are allocated: [0] * 10000 is sized exactly, while list(range(10000)) can end up with spare capacity.

The third one is also nice because it doesn't require another stack value (whereas calling range requires another spot on the call stack), though since it's 6 times slower, it's not worth it.

The last one might be the fastest just because itertools is cool that way :P I think it uses some C-library optimizations, if I remember correctly.
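To measure this instead of guessing, here is a rough timing sketch using `timeit`. The loop length `COUNT` and the run counts are arbitrary choices for this example, and the results will vary by machine and Python version, so no expected numbers are given.

```python
import itertools
import timeit

COUNT = 10_000  # arbitrary loop length chosen for this sketch


def loop_with_range():
    for _ in range(COUNT):
        pass


def loop_with_repeat():
    for _ in itertools.repeat(None, COUNT):
        pass


# Taking the best of several runs reduces noise from other processes.
range_time = min(timeit.repeat(loop_with_range, number=10, repeat=5))
repeat_time = min(timeit.repeat(loop_with_repeat, number=10, repeat=5))

print(f"range:  {range_time:.4f} s")
print(f"repeat: {repeat_time:.4f} s")
```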


This answer provides a loop construct for convenience. For additional background about looping with itertools.repeat look up Tim Peters' answer above, Alex Martelli's answer here and Raymond Hettinger's answer here.

    # loop.py

    """
    Faster for-looping in CPython for cases where intermediate integers
    from `range(x)` are not needed.

    Example Usage:
    --------------

    from loop import loop

    for _ in loop(10000):
        do_something()

    # or:

    results = [calc_value() for _ in loop(10000)]
    """

    from itertools import repeat
    from functools import partial

    loop = partial(repeat, None)
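A short usage sketch, with `loop` redefined inline so the snippet runs without a separate `loop.py` file. `roll_die` is a hypothetical stand-in for whatever work the loop body does.

```python
from functools import partial
from itertools import repeat
import random

# Same definition as in loop.py: loop(n) is equivalent to repeat(None, n).
loop = partial(repeat, None)


def roll_die(rng):
    """Hypothetical per-iteration work: return one die roll."""
    return rng.randint(1, 6)


rng = random.Random(0)  # seeded so runs are repeatable
rolls = [roll_die(rng) for _ in loop(10)]
print(len(rolls))
```

One thing to watch: calling `loop()` with no argument gives `repeat(None)`, which never stops, so always pass a count.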