Memory Efficient L2 norm using Python broadcasting

python numpy euclidean-distance array-broadcasting

Here is broadcasting with shapes of the intermediates made explicit:

    m = x.shape[0]  # x has shape (m, d)
    n = y.shape[0]  # y has shape (n, d)
    x2 = np.sum(x**2, axis=1).reshape((m, 1))
    y2 = np.sum(y**2, axis=1).reshape((1, n))
    xy = x.dot(y.T)  # shape is (m, n)
    dists = np.sqrt(x2 + y2 - 2*xy)  # shape is (m, n)

The documentation on broadcasting has some pretty good examples.
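To see why this is the memory-friendly form, here is a small sketch that compares it with the direct broadcast of `x - y`. The sample arrays and the names `dists_direct` and `dists_expanded` are made up for illustration; the point is only that the direct version builds an `(m, n, d)` intermediate while the expanded version never goes beyond `(m, n)`.

```python
import numpy as np

# Hypothetical sample data: 4 points and 5 points in 3 dimensions.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))  # shape (m, d)
y = rng.standard_normal((5, 3))  # shape (n, d)

# Direct broadcasting: the difference array has shape (m, n, d).
diff = x[:, None, :] - y[None, :, :]
dists_direct = np.sqrt(np.sum(diff**2, axis=2))

# Expanded form ||x||^2 + ||y||^2 - 2 x.y: largest intermediate is (m, n).
x2 = np.sum(x**2, axis=1).reshape(-1, 1)  # shape (m, 1)
y2 = np.sum(y**2, axis=1).reshape(1, -1)  # shape (1, n)
xy = x.dot(y.T)                           # shape (m, n)
dists_expanded = np.sqrt(x2 + y2 - 2 * xy)

print(diff.shape, dists_expanded.shape)
print(np.allclose(dists_direct, dists_expanded))
```

Both results agree up to floating-point rounding, so the choice between them is about memory, not about the answer.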

I think what you are asking for already exists in scipy in the form of the cdist function.

    from scipy.spatial.distance import cdist
    res = cdist(test, train, metric='euclidean')
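A minimal usage sketch, assuming `test` and `train` are 2-D arrays with one point per row; the values below are invented for illustration. Entry `res[i, j]` is the distance between `test[i]` and `train[j]`.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical data: 2 test points and 3 training points in 2 dimensions.
test = np.array([[0.0, 0.0],
                 [1.0, 1.0]])
train = np.array([[3.0, 4.0],
                  [1.0, 1.0],
                  [0.0, 1.0]])

res = cdist(test, train, metric='euclidean')
print(res.shape)  # one row per test point, one column per training point
print(res)
```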

Simplified and working version from this answer:

    x, y = test, train
    x2 = np.sum(x**2, axis=1, keepdims=True)
    y2 = np.sum(y**2, axis=1)
    xy = np.dot(x, y.T)
    dist = np.sqrt(x2 - 2*xy + y2)

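So the approach you have in mind is correct, but you need to be careful about how you apply it: keep the summed axis on x2 (keepdims=True) so that its column of squared row norms broadcasts correctly against the row vector y2 and the matrix xy.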
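One more point worth care, sketched below under my own assumptions: because the expanded formula subtracts nearly equal numbers, rounding can leave a tiny negative value where the true squared distance is zero, and `np.sqrt` then returns `nan`. Clamping at zero before the square root avoids that. The function name `pairwise_l2` and the sample points are hypothetical, not part of the answer above.

```python
import numpy as np

def pairwise_l2(x, y):
    """Euclidean distances between rows of x (m, d) and rows of y (n, d)."""
    x2 = np.sum(x**2, axis=1, keepdims=True)  # shape (m, 1)
    y2 = np.sum(y**2, axis=1)                 # shape (n,), broadcasts as (1, n)
    xy = np.dot(x, y.T)                       # shape (m, n)
    # Rounding can make the squared distance slightly negative; clamp at 0.
    sq = np.maximum(x2 - 2 * xy + y2, 0.0)
    return np.sqrt(sq)

# Hypothetical points; comparing a set with itself puts zeros on the diagonal.
pts = np.array([[0.1, 0.2, 0.3],
                [1000.1, 0.2, 0.3]])
d = pairwise_l2(pts, pts)
print(d)
```

The clamp only affects entries that would otherwise be invalid, so it does not change distances between distinct points.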

To make your life easier, consider using the tested and proven functions from scipy or scikit-learn.
