Length of generator output [duplicate]


The easiest way is probably just sum(1 for _ in gen) where gen is your generator.
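A minimal sketch of that idiom, wrapped in a small helper (the name `count_items` is hypothetical and not part of any library). Keep in mind that counting exhausts the generator, so its items are no longer available afterwards.

```python
def count_items(iterable):
    """Count items by exhausting the iterable, without storing them."""
    return sum(1 for _ in iterable)


gen = (x * x for x in range(10))
n = count_items(gen)       # this consumes gen
remaining = list(gen)      # nothing is left to iterate over
print(n, remaining)        # prints: 10 []
```

If you still need the items after counting them, you would have to store them somewhere first.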


For those who would like a summary of that discussion, here are the final results for counting a 50-million-item generator expression using:

  • len(list(gen)),
  • len([_ for _ in gen]),
  • sum(1 for _ in gen),
  • ilen(gen) (from more_itertools),
  • reduce(lambda c, i: c + 1, gen, 0),

sorted by execution performance (including memory consumption). The ranking may surprise you:

```
1: test_list.py:8: 0.492 KiB
    gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
('list, sec', 1.9684218849870376)

2: test_list_compr.py:8: 0.867 KiB
    gen = (i for i in data*1000); t0 = monotonic(); len([i for i in gen])
('list_compr, sec', 2.5885991149989422)

3: test_sum.py:8: 0.859 KiB
    gen = (i for i in data*1000); t0 = monotonic(); sum(1 for i in gen); t1 = monotonic()
('sum, sec', 3.441088170016883)

4: more_itertools/more.py:413: 1.266 KiB
    d = deque(enumerate(iterable, 1), maxlen=1)
   test_ilen.py:10: 0.875 KiB
    gen = (i for i in data*1000); t0 = monotonic(); ilen(gen)
('ilen, sec', 9.812256851990242)

5: test_reduce.py:8: 0.859 KiB
    gen = (i for i in data*1000); t0 = monotonic(); reduce(lambda counter, i: counter + 1, gen, 0)
('reduce, sec', 13.436614598002052)
```

So, in these measurements, len(list(gen)) was both the fastest option and the one that consumed the least memory.
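To repeat the comparison on your own machine, something like the following sketch could work. It is not the original test scripts: the names `make_gen`, `ilen_deque`, `COUNTERS` and `benchmark` are made up for illustration, and `ilen_deque` is a stand-in for `ilen` built on the `deque(enumerate(iterable, 1), maxlen=1)` line from the trace above. Timings will vary with Python version and hardware, and this sketch measures only time, not memory (the list-based variants build the whole list before taking its length).

```python
from collections import deque
from functools import reduce
from time import perf_counter


def make_gen(n):
    # A fresh generator is needed for every run, since counting exhausts it.
    return (i for i in range(n))


def ilen_deque(iterable):
    # Keep only the last (index, item) pair produced by enumerate;
    # its index is the number of items seen.
    d = deque(enumerate(iterable, 1), maxlen=1)
    return d[0][0] if d else 0


COUNTERS = {
    "list": lambda g: len(list(g)),
    "list_compr": lambda g: len([i for i in g]),
    "sum": lambda g: sum(1 for _ in g),
    "ilen_deque": ilen_deque,
    "reduce": lambda g: reduce(lambda c, _: c + 1, g, 0),
}


def benchmark(n=200_000):
    """Return {name: (count, seconds)} for each counting strategy."""
    results = {}
    for name, counter in COUNTERS.items():
        t0 = perf_counter()
        count = counter(make_gen(n))
        results[name] = (count, perf_counter() - t0)
    return results


if __name__ == "__main__":
    for name, (count, secs) in benchmark().items():
        print(f"{name:12s} count={count} time={secs:.4f}s")
```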


There isn't one because you can't do it in the general case - what if you have a lazy infinite generator? For example:

    def fib():
        a, b = 0, 1
        while True:
            a, b = b, a + b
            yield a

This never terminates but will generate the Fibonacci numbers. You can get as many Fibonacci numbers as you want by calling next().
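As a small illustration of taking a bounded number of items from such a generator, here is a sketch using `itertools.islice` from the standard library. Calling something like `sum(1 for _ in fib())` would never return, but slicing off a fixed prefix is safe.

```python
from itertools import islice


def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a


g = fib()
first = next(g)                      # pull a single value on demand
first_ten = list(islice(fib(), 10))  # take a fixed-size prefix of a fresh generator
print(first)      # 1
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```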

If you really need to know how many items there are, then a single linear pass over them won't be enough anyway, so just use a different data structure, such as a regular list.
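A short sketch of that suggestion, assuming the generator is finite and its items fit in memory (the `squares` generator is only an illustrative example):

```python
def squares(n):
    # Finite generator used only for illustration.
    for i in range(n):
        yield i * i


items = list(squares(5))   # materialize once
print(len(items))          # 5
print(items)               # [0, 1, 4, 9, 16]

# The list can be iterated again as often as needed.
total = sum(items)
```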