How can I apply the outer product to tensors without an unnecessary increase in dimensions?
We need to introduce broadcastable dimensions into the two input matrices with dimshuffle, and then let broadcasting take care of the elementwise multiplication, resulting in an outer product between corresponding rows.
Thus, with V and W as the Theano matrices, simply do:

```python
V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)
```
In NumPy, we have np.newaxis to extend dimensions and np.transpose() for permuting dimensions. With Theano, there's dimshuffle to do both of these tasks, using a mix of listed dimension IDs and 'x's for introducing new broadcastable axes.
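For reference, the same broadcasting trick can be sketched in plain NumPy with np.newaxis (the inputs here are small hypothetical arrays, not the ones from the sample run below):

```python
import numpy as np

# Hypothetical small inputs: three pairs of corresponding rows
v = np.arange(6).reshape(3, 2)   # shape (3, 2)
w = np.arange(9).reshape(3, 3)   # shape (3, 3)

# NumPy analogue of the dimshuffle pattern: insert broadcastable axes
# with np.newaxis, then let broadcasting perform the row-wise outer products
out = v[:, :, np.newaxis] * w[:, np.newaxis, :]   # shape (3, 2, 3)

# Same result as looping over rows with np.outer
ref = np.array([np.outer(v[i], w[i]) for i in range(v.shape[0])])
print(np.array_equal(out, ref))   # → True
```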
Sample run
1) Inputs :
```python
# NumPy arrays
In [121]: v = np.random.randint(11,99,(3,4))
     ...: w = np.random.randint(11,99,(3,5))

# Perform outer product on corresponding rows in inputs
In [122]: for i in range(v.shape[0]):
     ...:     print(np.outer(v[i],w[i]))
     ...:
[[2726 1972 1740 2117 1972]
 [8178 5916 5220 6351 5916]
 [7520 5440 4800 5840 5440]
 [8648 6256 5520 6716 6256]]
[[8554 3458 8918 4186 4277]
 [1786  722 1862  874  893]
 [8084 3268 8428 3956 4042]
 [2444  988 2548 1196 1222]]
[[2945 2232 1209  372  682]
 [2565 1944 1053  324  594]
 [7125 5400 2925  900 1650]
 [6840 5184 2808  864 1584]]
```
2) Theano part :
```python
# Get to Theano: get the Theano matrix versions
In [123]: V = T.matrix('v')
     ...: W = T.matrix('w')

# Use proposed code
In [124]: OUT = V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)

# Create a function out of it and then use on input NumPy arrays
In [125]: f = function([V,W], OUT)
```
3) Verify results :
```python
In [126]: f(v,w)  # Verify results against the earlier loopy results
Out[126]:
array([[[ 2726.,  1972.,  1740.,  2117.,  1972.],
        [ 8178.,  5916.,  5220.,  6351.,  5916.],
        [ 7520.,  5440.,  4800.,  5840.,  5440.],
        [ 8648.,  6256.,  5520.,  6716.,  6256.]],

       [[ 8554.,  3458.,  8918.,  4186.,  4277.],
        [ 1786.,   722.,  1862.,   874.,   893.],
        [ 8084.,  3268.,  8428.,  3956.,  4042.],
        [ 2444.,   988.,  2548.,  1196.,  1222.]],

       [[ 2945.,  2232.,  1209.,   372.,   682.],
        [ 2565.,  1944.,  1053.,   324.,   594.],
        [ 7125.,  5400.,  2925.,   900.,  1650.],
        [ 6840.,  5184.,  2808.,   864.,  1584.]]])
```
I can't believe nobody has tried to use np.einsum.
```python
w
array([[1, 8, 9, 2],
       [1, 2, 9, 0],
       [5, 8, 7, 3],
       [2, 9, 8, 2]])
v
array([[1, 4, 5, 9],
       [9, 1, 3, 7],
       [9, 6, 1, 5],
       [4, 9, 7, 0]])

for i in range(w.shape[0]):
    print(np.outer(w[i], v[i]))
[[ 1  4  5  9]
 [ 8 32 40 72]
 [ 9 36 45 81]
 [ 2  8 10 18]]
[[ 9  1  3  7]
 [18  2  6 14]
 [81  9 27 63]
 [ 0  0  0  0]]
[[45 30  5 25]
 [72 48  8 40]
 [63 42  7 35]
 [27 18  3 15]]
[[ 8 18 14  0]
 [36 81 63  0]
 [32 72 56  0]
 [ 8 18 14  0]]

np.einsum('ij,ik->ijk', w, v)
array([[[ 1,  4,  5,  9],
        [ 8, 32, 40, 72],
        [ 9, 36, 45, 81],
        [ 2,  8, 10, 18]],

       [[ 9,  1,  3,  7],
        [18,  2,  6, 14],
        [81,  9, 27, 63],
        [ 0,  0,  0,  0]],

       [[45, 30,  5, 25],
        [72, 48,  8, 40],
        [63, 42,  7, 35],
        [27, 18,  3, 15]],

       [[ 8, 18, 14,  0],
        [36, 81, 63,  0],
        [32, 72, 56,  0],
        [ 8, 18, 14,  0]]])
```
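As a quick sanity check (on hypothetical random inputs, not the arrays above), the einsum one-liner, the broadcasting approach, and the explicit loop all produce the same result:

```python
import numpy as np

# Hypothetical random inputs for the consistency check
rng = np.random.default_rng(0)
w = rng.integers(0, 10, (4, 4))
v = rng.integers(0, 10, (4, 4))

# Three equivalent ways to get per-row outer products
by_einsum = np.einsum('ij,ik->ijk', w, v)
by_broadcast = w[:, :, None] * v[:, None, :]
by_loop = np.array([np.outer(w[i], v[i]) for i in range(w.shape[0])])

print(np.array_equal(by_einsum, by_loop) and
      np.array_equal(by_einsum, by_broadcast))   # → True
```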
It looks like the equivalent Theano function is theano.tensor.batched_dot (which is supposed to be even faster than einsum), but I have no experience with Theano.
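The batched-product idea behind that suggestion can be sketched in NumPy (a rough analogue, not Theano's actual batched_dot): view each row of w as a column vector and each row of v as a row vector, so a batched matrix multiply yields the outer products.

```python
import numpy as np

# Hypothetical small inputs
w = np.array([[1, 8], [5, 8]])
v = np.array([[1, 4], [9, 6]])

# Batched matmul of (b, n, 1) @ (b, 1, m) gives the (b, n, m) outer products
out = np.matmul(w[:, :, None], v[:, None, :])

# Agrees with the einsum formulation
ref = np.einsum('ij,ik->ijk', w, v)
print(np.array_equal(out, ref))   # → True
```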