R expand.grid() function in Python


Just use list comprehensions:

>>> [(x, y) for x in range(5) for y in range(5)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]

Convert to a NumPy array if desired:

>>> import numpy as np
>>> x = np.array([(x, y) for x in range(5) for y in range(5)])
>>> x.shape
(25, 2)

I have tested this for sizes up to 10000 x 10000, and Python's performance is comparable to that of expand.grid in R. Using a tuple (x, y) in the comprehension is about 40% faster than using a list [x, y].
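If you want to check the tuple-versus-list difference on your own machine, a rough timing sketch could look like this. The size n and the helper names with_tuples and with_lists are made up for illustration, and the numbers will vary with hardware and Python version.

```python
import timeit

n = 300  # assumed small size so the sketch finishes quickly


def with_tuples():
    # Each grid point is stored as an immutable tuple.
    return [(x, y) for x in range(n) for y in range(n)]


def with_lists():
    # Each grid point is stored as a freshly allocated list.
    return [[x, y] for x in range(n) for y in range(n)]


# Time a few repetitions of each variant and print both totals.
t_tuple = timeit.timeit(with_tuples, number=5)
t_list = timeit.timeit(with_lists, number=5)
print("tuples: {:.3f}s  lists: {:.3f}s".format(t_tuple, t_list))
```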

OR...

Using np.meshgrid is around 3x faster and much less memory intensive:

%timeit np.array(np.meshgrid(range(10000), range(10000))).reshape(2, 100000000).T
1 loops, best of 3: 736 ms per loop

In R:

> system.time(expand.grid(1:10000, 1:10000))
   user  system elapsed
  1.991   0.416   2.424

Keep in mind that R has 1-based arrays whereas Python is 0-based, so 1:5 in R corresponds to range(1, 6) in Python, not range(5).
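To see what the meshgrid approach produces on a small input, here is a minimal sketch that generalizes it to any number of inputs. The helper name expand_grid_array is hypothetical (it is not part of NumPy), and it assumes all inputs can share one dtype. The rows come out with the first input varying fastest, which is the same row order as R's expand.grid.

```python
import numpy as np


def expand_grid_array(*factors):
    """Return a 2-D array with one row per combination of the inputs.

    Hypothetical helper, not a NumPy function. The first input varies
    fastest down the rows, as in R's expand.grid.
    """
    # With indexing="ij" the last axis varies fastest when flattened,
    # so the inputs are passed in reverse order.
    grids = np.meshgrid(*reversed(factors), indexing="ij")
    # Undo the reversal so the columns follow the original input order.
    return np.column_stack([g.ravel() for g in reversed(grids)])


out = expand_grid_array([1, 2, 3], [5, 7])
print(out.shape)
print(out)
```

For two inputs this gives the same rows as the np.array(np.meshgrid(a, b)).reshape(2, -1).T one-liner used in the timing above.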


itertools.product is the key to your solution. It produces the Cartesian product of its inputs.

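from itertools import product
import pandas as pd

def expand_grid(dictionary):
    # One row per combination of the dictionary's value lists
    return pd.DataFrame(list(product(*dictionary.values())),
                        columns=list(dictionary.keys()))

dictionary = {'color': ['red', 'green', 'blue'],
              'vehicle': ['car', 'van', 'truck'],
              'cylinders': [6, 8]}

>>> expand_grid(dictionary)
    color  cylinders vehicle
0     red          6     car
1     red          6     van
2     red          6   truck
3     red          8     car
4     red          8     van
5     red          8   truck
6   green          6     car
7   green          6     van
8   green          6   truck
9   green          8     car
10  green          8     van
11  green          8   truck
12   blue          6     car
13   blue          6     van
14   blue          6   truck
15   blue          8     car
16   blue          8     van
17   blue          8   truck

(The column order shown comes from an older Python, where dicts were unordered. On Python 3.7+ the columns and the iteration order follow the dict's insertion order: color, vehicle, cylinders.)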


Here's an example that gives output similar to what you need:

import itertools

def expandgrid(*itrs):
    product = list(itertools.product(*itrs))
    return {'Var{}'.format(i+1): [x[i] for x in product] for i in range(len(itrs))}

>>> a = [1,2,3]
>>> b = [5,7,9]
>>> expandgrid(a, b)
{'Var1': [1, 1, 1, 2, 2, 2, 3, 3, 3], 'Var2': [5, 7, 9, 5, 7, 9, 5, 7, 9]}

The row order differs from R's expand.grid because in itertools.product the rightmost element advances on every iteration, whereas in expand.grid the first factor varies fastest. If the order matters, you can tweak the function by reordering the product list.
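One possible way to get R's ordering, as a sketch: take the product over the inputs in reverse order, then reverse each resulting tuple. The name expandgrid_r_order is made up for this example.

```python
import itertools


def expandgrid_r_order(*itrs):
    # Taking the product over the reversed inputs makes the original first
    # input the rightmost one, so it advances on every iteration. Reversing
    # each tuple then restores the original column order.
    rows = [combo[::-1] for combo in itertools.product(*reversed(itrs))]
    return {'Var{}'.format(i + 1): [row[i] for row in rows]
            for i in range(len(itrs))}


result = expandgrid_r_order([1, 2, 3], [5, 7, 9])
print(result)
# {'Var1': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'Var2': [5, 5, 5, 7, 7, 7, 9, 9, 9]}
```

This avoids sorting altogether, which also keeps the order of the input values themselves (for example, unsorted strings) intact.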