Clustering algorithms
Clustering algorithms partition a dataset into useful groups. The goal is to form groups of data points such that points within a cluster are similar to one another, while points in different clusters are dissimilar.
There are two essential elements for clustering algorithms to work:
- Similarity function: This determines how we decide that two points are similar.
- Clustering method: This is the procedure followed to arrive at the clusters.
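To make the split between these two elements concrete, here is a minimal sketch in which the similarity function is passed in as an argument and the clustering method is a deliberately simple greedy threshold rule. The names `negative_euclidean` and `greedy_cluster`, the sample points, and the threshold value are all hypothetical choices made for illustration; this is not a method the text prescribes.

```python
import math


def negative_euclidean(a, b):
    """Similarity function: values closer to 0 mean more similar points."""
    return -math.dist(a, b)


def greedy_cluster(points, similarity, threshold):
    """Clustering method (illustrative only): put each point into the first
    cluster whose first member is at least `threshold` similar to it;
    otherwise start a new cluster."""
    clusters = []
    for p in points:
        for cluster in clusters:
            if similarity(p, cluster[0]) >= threshold:
                cluster.append(p)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([p])
    return clusters


# Hypothetical sample data: two visibly separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.2, 0.1)]
clusters = greedy_cluster(points, negative_euclidean, threshold=-1.0)
```

Swapping in a different similarity function changes which points count as similar without touching the clustering method, which is why the two elements are worth keeping separate.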
A similarity measure provides the basis on which two points are judged similar or dissimilar. Various measures exist; here are a few:
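- Euclidean: $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ for $n$-dimensional points $x$ and $y$. This is a distance, so smaller values mean more similar points.
- Cosine: $\cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$, which compares the directions of two vectors and ignores their magnitudes.
- KL-divergence: $D_{KL}(P \parallel Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}$ for discrete probability distributions $P$ and $Q$. It is not symmetric in $P$ and $Q$.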
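The three measures listed above can be sketched in plain Python as follows. The function names and sample inputs are assumptions made for illustration. The KL-divergence here uses the natural logarithm and assumes both inputs are discrete probability distributions of equal length, with `q[i] > 0` wherever `p[i] > 0`.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats; terms with p[i] == 0 contribute 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


# Hypothetical inputs: y is a scaled copy of x, so the two vectors are far
# apart in Euclidean terms but point in exactly the same direction.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
dist_xy = euclidean_distance(x, y)
cos_xy = cosine_similarity(x, y)

# Two hypothetical distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
```

Note that Euclidean distance and KL-divergence measure dissimilarity (lower means closer), whereas cosine similarity grows as points become more alike, so the direction of comparison depends on the measure chosen. KL-divergence also gives different values for `kl_pq` and `kl_qp`, which is why it is not a true distance metric.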