Cartesian product of large iterators (itertools)

python memory itertools cartesian-product

Here's an implementation which calls callables and iterates iterables, which are assumed restartable:

def product(*iterables, **kwargs):    if len(iterables) == 0:        yield ()    else:        iterables = iterables * kwargs.get('repeat', 1)        it = iterables[0]        for item in it() if callable(it) else iter(it):            for items in product(*iterables[1:]):                yield (item, ) + items

Testing:

import itertoolsg = product(lambda: itertools.permutations(xrange(100)),            lambda: itertools.permutations(xrange(100)))print next(g)print sum(1 for _ in g)

python memory itertools cartesian-product

Without "iterator recreation", it may be possible for the first of the factors. But that would save only 1/n space (where n is the number of factors) and add confusion.

So the answer is iterator recreation. A client of the function would have to ensure that the creation of the iterators is pure (no side-effects). Like

def iterProduct(ic):    if not ic:        yield []        return    for i in ic[0]():        for js in iterProduct(ic[1:]):            yield [i] + js# Testx3 = lambda: xrange(3)for i in iterProduct([x3,x3,x3]):    print i

python memory itertools cartesian-product

This can't be done with standard Python generators, because some of the iterables must be cycled through multiple times. You have to use some kind of datatype capable of "reiteration." I've created a simple "reiterable" class and a non-recursive product algorithm. product should have more error-checking, but this is at least a first approach. The simple reiterable class...

class PermutationsReiterable(object):    def __init__(self, value):        self.value = value    def __iter__(self):        return itertools.permutations(xrange(self.value))

And product iteslf...

def product(*reiterables, **kwargs):    if not reiterables:        yield ()        return    reiterables *= kwargs.get('repeat', 1)    iterables = [iter(ri) for ri in reiterables]    try:        states = [next(it) for it in iterables]    except StopIteration:        # outer product of zero-length iterable is empty        return    yield tuple(states)    current_index = max_index = len(iterables) - 1    while True:        try:            next_item = next(iterables[current_index])        except StopIteration:            if current_index > 0:                new_iter = iter(reiterables[current_index])                next_item = next(new_iter)                states[current_index] = next_item                iterables[current_index] = new_iter                current_index -= 1            else:                # last iterable has run out; terminate generator                return        else:            states[current_index] = next_item            current_index = max_index            yield tuple(states)

Tested:

>>> pi2 = PermutationsReiterable(2)>>> list(pi2); list(pi2)[(0, 1), (1, 0)][(0, 1), (1, 0)]>>> list(product(pi2, repeat=2))[((0, 1), (0, 1)), ((0, 1), (1, 0)), ((1, 0), (0, 1)), ((1, 0), (1, 0))]>>> giant_product = product(PermutationsReiterable(100), repeat=5)>>> len(list(itertools.islice(giant_product, 0, 5)))5>>> big_product = product(PermutationsReiterable(10), repeat=2)>>> list(itertools.islice(big_product, 0, 5))[((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)),  ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 7, 9, 8)),  ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 8, 7, 9)),  ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 8, 9, 7)),  ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (0, 1, 2, 3, 4, 5, 6, 9, 7, 8))]

CodeHunter

Cartesian product of large iterators (itertools)

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last