How to compute "EMD" for 2 numpy arrays i.e. "histogram" using OpenCV?



You have to define your arrays in terms of weights and coordinates. If you have two arrays a = [1,1,0,0,1] and b = [0,1,0,1] that represent one-dimensional histograms, then the numpy arrays should look like this:

    a = [[1 1]
         [1 2]
         [0 3]
         [0 4]
         [1 5]]

    b = [[0 1]
         [1 2]
         [0 3]
         [1 4]]

Notice that the number of rows can be different. The number of columns should be the number of histogram dimensions + 1, so a one-dimensional histogram needs two columns. The first column contains the weights, and the second column contains the coordinates.
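To make the layout concrete, here is a minimal sketch that builds such a signature from a plain one-dimensional histogram. The helper name `histogram_to_signature` is made up for illustration, and the 1-based coordinates are only an assumption chosen to match the arrays shown above.

```python
import numpy as np

def histogram_to_signature(hist):
    """Turn a 1-D histogram into an (n_bins, 2) array of [weight, coordinate] rows.

    Hypothetical helper, not part of OpenCV or numpy.
    """
    weights = np.asarray(hist, dtype=np.float32)
    # Assumption: coordinates are 1-based bin positions, as in the example above.
    coords = np.arange(1, weights.size + 1, dtype=np.float32)
    # First column: weights. Second column: coordinates.
    return np.column_stack((weights, coords))

a_sig = histogram_to_signature([1, 1, 0, 0, 1])
b_sig = histogram_to_signature([0, 1, 0, 1])
print(a_sig)
print(b_sig)
```

The two signatures have different numbers of rows (5 and 4) but the same number of columns, which is what the signature format requires.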

The next step is to convert your arrays to a CV_32FC1 Mat before you input the numpy array as a signature to the CalcEMD2 function. The code would look like this:

    from cv2 import *
    import numpy as np

    # Initialize a and b numpy arrays with coordinates and weights
    a = np.zeros((5,2))

    for i in range(0,5):
        a[i][1] = i+1

    a[0][0] = 1
    a[1][0] = 1
    a[2][0] = 0
    a[3][0] = 0
    a[4][0] = 1

    b = np.zeros((4,2))

    for i in range(0,4):
        b[i][1] = i+1

    b[0][0] = 0
    b[1][0] = 1
    b[2][0] = 0
    b[3][0] = 1

    # Convert from numpy array to CV_32FC1 Mat
    a64 = cv.fromarray(a)
    a32 = cv.CreateMat(a64.rows, a64.cols, cv.CV_32FC1)
    cv.Convert(a64, a32)

    b64 = cv.fromarray(b)
    b32 = cv.CreateMat(b64.rows, b64.cols, cv.CV_32FC1)
    cv.Convert(b64, b32)

    # Calculate Earth Mover's
    print cv.CalcEMD2(a32,b32,cv.CV_DIST_L2)

    # Wait for key
    cv.WaitKey(0)

Notice that the third parameter of CalcEMD2 is the Euclidean Distance CV_DIST_L2. Another option for the third parameter is the Manhattan Distance CV_DIST_L1.

I would also like to mention that I wrote the code to calculate the Earth Mover's distance of two 2D histograms in Python. You can find this code here.


According to the documentation, cv.CalcEMD2 expects arrays that also include a weight for each point.

I would suggest defining your arrays with a weight of 1, like so:

    # Each row is [weight, coordinate]; OpenCV expects the weight first.
    a = array([[1, 1], [1, 2], [1, 3], [1, 4], [1, 5]])
    b = array([[1, 1], [1, 2], [1, 3], [1, 4]])


I know the OP wanted to measure Earth Mover's Distance using OpenCV, but if you'd like to do so using SciPy, you can use the following (Wasserstein Distance is also known as Earth Mover's Distance):

    from scipy.stats import wasserstein_distance
    # Note: scipy.ndimage.imread was removed in SciPy 1.2. On newer versions,
    # load the image with another reader that returns a 2-D grayscale array.
    from scipy.ndimage import imread
    import numpy as np

    def get_histogram(img):
      '''
      Get the histogram of an image. For an 8-bit, grayscale image, the
      histogram will be a 256 unit vector in which the nth value indicates
      the percent of the pixels in the image with the given darkness level.
      The histogram's values sum to 1.
      '''
      h, w = img.shape
      hist = [0.0] * 256
      for i in range(h):
        for j in range(w):
          hist[img[i, j]] += 1
      return np.array(hist) / (h * w)

    a = imread('a.jpg')
    b = imread('b.jpg')
    a_hist = get_histogram(a)
    b_hist = get_histogram(b)
    dist = wasserstein_distance(a_hist, b_hist)
    print(dist)
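One caveat worth checking: `wasserstein_distance` treats its first two arguments as observed values, so passing the histograms directly compares the distributions of the bin heights rather than moving mass between bins. A sketch of the bin-aware variant passes the bin positions as values and the histograms as `u_weights` and `v_weights`. The helper names `gray_histogram` and `histogram_emd` are made up here, and the synthetic images stand in for real files.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def gray_histogram(img):
    """Normalized 256-bin histogram of an 8-bit grayscale image (hypothetical helper)."""
    counts = np.bincount(img.ravel(), minlength=256)
    return counts / img.size

def histogram_emd(hist_a, hist_b):
    """EMD between two histograms over the same bins, with bin index as position."""
    bins = np.arange(len(hist_a))
    return wasserstein_distance(bins, bins, u_weights=hist_a, v_weights=hist_b)

# Synthetic stand-ins for real images: one all-black, one uniformly at level 10.
dark = np.zeros((4, 4), dtype=np.uint8)
brighter = np.full((4, 4), 10, dtype=np.uint8)

print(histogram_emd(gray_histogram(dark), gray_histogram(brighter)))
```

Here every pixel has to move ten gray levels, so the distance reflects how far the mass travels along the intensity axis, which is the usual meaning of EMD for histograms.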