Clustering cosine similarity matrix

python math scikit-learn cluster-analysis data-mining

You can do this with spectral clustering. You can use a ready-made implementation such as the one in sklearn, or implement it yourself; the algorithm is fairly simple.

Here is a piece of code doing it in Python using sklearn:

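    import numpy as np
    from sklearn.cluster import SpectralClustering

    # np.array rather than np.matrix: recent sklearn versions reject np.matrix input
    mat = np.array([[1., .1, .6, .4],
                    [.1, 1., .1, .2],
                    [.6, .1, 1., .7],
                    [.4, .2, .7, 1.]])
    SpectralClustering(2).fit_predict(mat)
    # array([0, 1, 0, 0], dtype=int32)   (the label numbering may be swapped between runs)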

As you can see, it returns the clustering you mentioned.
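One caveat about the call above: with its default settings, SpectralClustering treats each row of the input as a feature vector and builds its own affinity from those rows. If you want sklearn to use the matrix directly as the similarity between items, the `affinity="precomputed"` option is meant for that. A minimal sketch, assuming the same 4x4 matrix (the names `sim` and `labels_precomputed` are only illustrative, and `random_state=0` is there just to make runs repeatable):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Cosine similarity matrix from the question (symmetric, ones on the diagonal)
sim = np.array([[1., .1, .6, .4],
                [.1, 1., .1, .2],
                [.6, .1, 1., .7],
                [.4, .2, .7, 1.]])

# affinity="precomputed" tells sklearn that `sim` already holds pairwise similarities
model = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0)
labels_precomputed = model.fit_predict(sim)

# Items 0, 2 and 3 should share a label and item 1 should get the other one;
# which group is called 0 and which is called 1 is arbitrary.
print(labels_precomputed)
```

For a matrix this small both calls should agree on the grouping, but on real data the two settings can give different results, so it is worth being explicit about which one you mean.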

The algorithm takes the k eigenvectors of the input matrix that correspond to its k largest eigenvalues, uses them as the columns of a new matrix with one row per item, and then runs the k-means algorithm on the rows of that matrix. Here is simple code that does this for your matrix:

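    import numpy as np
    from sklearn.cluster import KMeans

    # eigh returns eigenvalues in ascending order, so the last two columns
    # of eigen_vectors belong to the two largest eigenvalues
    eigen_values, eigen_vectors = np.linalg.eigh(np.asarray(mat))
    KMeans(n_clusters=2, init='k-means++').fit_predict(eigen_vectors[:, 2:4])
    # array([0, 1, 0, 0], dtype=int32)   (the label numbering may be swapped between runs)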

Note that the implementation of the algorithm in the sklearn library may differ from mine. The example I gave is the simplest way of doing it. There are some good tutorials available online that describe the spectral clustering algorithm in depth.
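To give an idea of how implementations can differ, here is a sketch of one common textbook variant: scale the similarity matrix by the node degrees before the eigendecomposition, then normalize each row of the eigenvector matrix to unit length before k-means. This is not a reproduction of what sklearn does internally; the variable names, `n_init=10` and `random_state=0` are my own choices for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

sim = np.array([[1., .1, .6, .4],
                [.1, 1., .1, .2],
                [.6, .1, 1., .7],
                [.4, .2, .7, 1.]])

# Drop self-similarities so that degrees only count links to other items
weights = sim.copy()
np.fill_diagonal(weights, 0.0)

# Symmetric degree normalization: D^(-1/2) W D^(-1/2)
degree = weights.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(degree)
normalized = d_inv_sqrt[:, None] * weights * d_inv_sqrt[None, :]

# eigh sorts eigenvalues in ascending order; keep the eigenvectors of the two largest
_, eigen_vectors = np.linalg.eigh(normalized)
top = eigen_vectors[:, -2:]

# Scale every row to unit length, then cluster the rows
rows = top / np.linalg.norm(top, axis=1, keepdims=True)
labels_normalized = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(rows)
print(labels_normalized)
```

On this matrix the variant should again separate item 1 from items 0, 2 and 3.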

If you want the algorithm to figure out the number of clusters by itself, you can use a density-based clustering algorithm such as DBSCAN:

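    import numpy as np
    from sklearn.cluster import DBSCAN

    # Default eps=0.5 and Euclidean metric; the rows of mat are treated as feature vectors
    DBSCAN(min_samples=1).fit_predict(np.asarray(mat))
    # array([0, 1, 2, 2])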
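The DBSCAN call above measures Euclidean distance between rows of the matrix, which is why it splits item 0 from items 2 and 3. If you would rather have DBSCAN work on the similarities themselves, one option is to convert them to distances as 1 - similarity and pass `metric="precomputed"`. A minimal sketch; the threshold `eps=0.5` is an assumption picked for this toy matrix and would need tuning on real data:

```python
import numpy as np
from sklearn.cluster import DBSCAN

sim = np.array([[1., .1, .6, .4],
                [.1, 1., .1, .2],
                [.6, .1, 1., .7],
                [.4, .2, .7, 1.]])

# Cosine distance: 0 for identical direction, larger for less similar items
dist = 1.0 - sim

# metric="precomputed" makes DBSCAN read `dist` as pairwise distances.
# eps=0.5 links two items when their similarity is at least 0.5 (assumed threshold).
labels_dbscan = DBSCAN(eps=0.5, min_samples=1, metric="precomputed").fit_predict(dist)
print(labels_dbscan)  # [0 1 0 0]
```

Here items 0 and 2 (similarity 0.6) and items 2 and 3 (similarity 0.7) are linked, so 0, 2 and 3 end up in one cluster and item 1 forms its own.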
