Interpreting negative Word2Vec similarity from gensim
Cosine similarity ranges from -1 to 1, just like the cosine function itself.
As for the source:

    def similarity(self, w1, w2):
        """
        Compute cosine similarity between two words.

        Example::

          >>> trained_model.similarity('woman', 'man')
          0.73723527

          >>> trained_model.similarity('woman', 'woman')
          1.0

        """
        return dot(matutils.unitvec(self[w1]), matutils.unitvec(self[w2]))

As others have said, the cosine similarity can range from -1 to 1 based on the angle between the two vectors being compared. The exact implementation in gensim is a simple dot product of the normalized vectors.

    def similarity(self, w1, w2):
        """
        Compute cosine similarity between two words.

        Example::

          >>> trained_model.similarity('woman', 'man')
          0.73723527

          >>> trained_model.similarity('woman', 'woman')
          1.0

        """
        return dot(matutils.unitvec(self[w1]), matutils.unitvec(self[w2]))

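To see why a dot product of normalized vectors is the cosine similarity, here is a minimal NumPy sketch. It is not gensim's code: `unit_vector` and `cosine_similarity` are hypothetical stand-ins for `matutils.unitvec` and the `similarity` method, and the two vectors are made-up toy values rather than trained word vectors.

```python
import numpy as np


def unit_vector(v):
    """Scale v to length 1 (hypothetical stand-in for gensim's matutils.unitvec)."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    # Leave an all-zero vector unchanged to avoid dividing by zero.
    return v / norm if norm > 0 else v


def cosine_similarity(a, b):
    """Dot product of the two normalized vectors."""
    return float(np.dot(unit_vector(a), unit_vector(b)))


# Toy vectors, not real word embeddings.
a = np.array([1.0, 2.0, 3.0])
b = np.array([-2.0, 0.5, 1.0])

# Textbook formula: dot(a, b) / (|a| * |b|).
direct = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Both routes should agree up to floating-point error.
matches = bool(np.isclose(cosine_similarity(a, b), direct))
```

Normalizing first and then taking the dot product is just a rearrangement of the textbook formula, so `matches` should come out true for any pair of non-zero vectors.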
In terms of interpretation, you can think of these values much as you would correlation coefficients. A value of 1 means the two word vectors point in exactly the same direction (e.g., "woman" compared with "woman"), a value of 0 means the vectors are orthogonal, i.e. there is no relationship between the words, and a value of -1 means the vectors point in exactly opposite directions. A negative similarity therefore just means the angle between the two word vectors is greater than 90 degrees.
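The three reference points can be checked with a small sketch that also recovers the angle behind each similarity value. The helper `cosine_and_angle` and the 2-D vectors are illustrative assumptions, not anything from gensim or a trained model.

```python
import numpy as np


def cosine_and_angle(a, b):
    """Return (cosine similarity, angle in degrees) for two non-zero vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against tiny floating-point overshoots outside [-1, 1].
    cos = float(np.clip(cos, -1.0, 1.0))
    angle = float(np.degrees(np.arccos(cos)))
    return cos, angle


v = np.array([3.0, 4.0])  # toy vector, not a real word embedding

same_direction = cosine_and_angle(v, 3 * v)             # same direction, different length
orthogonal = cosine_and_angle(v, np.array([-4.0, 3.0]))  # perpendicular to v
opposite = cosine_and_angle(v, -v)                       # exactly reversed

for label, (cos, angle) in [("same", same_direction),
                            ("orthogonal", orthogonal),
                            ("opposite", opposite)]:
    print(f"{label}: similarity={cos:.2f}, angle={angle:.1f} degrees")
```

Note that scaling a vector (here `3 * v`) does not change the similarity, since only direction matters; the three cases correspond to angles of roughly 0, 90, and 180 degrees.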