Memory Efficient L2 norm using Python broadcasting


Here is broadcasting with shapes of the intermediates made explicit:

    m = x.shape[0]  # x has shape (m, d)
    n = y.shape[0]  # y has shape (n, d)
    x2 = np.sum(x**2, axis=1).reshape((m, 1))
    y2 = np.sum(y**2, axis=1).reshape((1, n))
    xy = x.dot(y.T)  # shape is (m, n)
    dists = np.sqrt(x2 + y2 - 2*xy)  # shape is (m, n)

The NumPy documentation on broadcasting has some pretty good examples.
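To check the shapes yourself, here is a small sketch on made-up random data (the sizes 4, 5, 3 and the names `rng` and `direct` are just for illustration). It compares the broadcast result against the direct difference-based computation, which needs an (m, n, d) intermediate:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))  # m = 4 points, d = 3 (illustrative sizes)
y = rng.standard_normal((5, 3))  # n = 5 points, d = 3

m, n = x.shape[0], y.shape[0]
x2 = np.sum(x**2, axis=1).reshape((m, 1))  # shape (4, 1)
y2 = np.sum(y**2, axis=1).reshape((1, n))  # shape (1, 5)
xy = x.dot(y.T)                            # shape (4, 5)

# (4, 1) + (1, 5) broadcasts to (4, 5); no (m, n, d) array is ever built
dists = np.sqrt(x2 + y2 - 2 * xy)

# Direct version for comparison: builds a (4, 5, 3) intermediate
direct = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2))

print(dists.shape)
print(np.allclose(dists, direct))
```

The direct version is fine for small inputs, but its intermediate grows with m * n * d, which is exactly what the expanded form avoids.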


I think what you are asking for already exists in scipy in the form of the cdist function.

    from scipy.spatial.distance import cdist
    res = cdist(test, train, metric='euclidean')
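For reference, a tiny example of what `cdist` returns. The `test` and `train` arrays here are made-up points chosen so the distances are easy to verify by hand:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Made-up data: 2 test points and 3 train points in 2-D
test = np.array([[0.0, 0.0], [3.0, 4.0]])
train = np.array([[0.0, 0.0], [0.0, 4.0], [3.0, 0.0]])

# res[i, j] is the Euclidean distance between test[i] and train[j]
res = cdist(test, train, metric='euclidean')
print(res.shape)  # one row per test point, one column per train point
print(res)
```

The result has shape `(len(test), len(train))`, the same layout as the broadcasting approach.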


Simplified and working version from this answer:

    x, y = test, train
    x2 = np.sum(x**2, axis=1, keepdims=True)
    y2 = np.sum(y**2, axis=1)
    xy = np.dot(x, y.T)
    dist = np.sqrt(x2 - 2*xy + y2)

So the approach you have in mind is correct, but you need to be careful how you apply it.
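One pitfall worth knowing about (my assumption about what can go wrong, not necessarily the only issue): with the expanded form, rounding error can make the squared distance slightly negative for identical or nearly identical points, and `np.sqrt` then returns NaN. A minimal sketch that clamps at zero first; `pairwise_l2` is a hypothetical helper name:

```python
import numpy as np

def pairwise_l2(x, y):
    """Pairwise Euclidean distances between rows of x (m, d) and y (n, d)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x2 = np.sum(x**2, axis=1, keepdims=True)  # shape (m, 1)
    y2 = np.sum(y**2, axis=1)                 # shape (n,), broadcasts over rows
    xy = np.dot(x, y.T)                       # shape (m, n)
    sq = x2 - 2 * xy + y2
    # Assumption: tiny negative values are rounding noise, so clamp them to 0
    sq = np.maximum(sq, 0)
    return np.sqrt(sq)

rng = np.random.default_rng(0)
pts = rng.standard_normal((6, 3)) * 1e3
d = pairwise_l2(pts, pts)  # distances from a set of points to itself
print(np.isnan(d).any())
```

Note that the diagonal may still come out as a very small positive number rather than exactly zero; the clamp only prevents NaN.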

To make your life easier, consider using the tested and proven functions from scipy or scikit-learn.