Today's class covered various clustering methods, comparing and contrasting three prominent ones: K-means, K-medoids, and DBSCAN.
- K-means Clustering:
- K-means is a partitioning method that aims to divide a dataset into K distinct, non-overlapping clusters.
- It is a centroid-based approach, where the data points are assigned to the cluster with the nearest centroid.
- K-means is sensitive to the initial placement of centroids, so different starting points can produce different clusterings, and it may not work well with non-globular clusters.
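The assign-then-update loop behind K-means can be sketched in a few lines of plain Python. This is a minimal illustration of the standard Lloyd-style iteration, not a production implementation. The function name `kmeans`, the `max_iter` parameter, and the sample points are assumptions made for the example, and the initial centroids are passed in explicitly to make the dependence on initialization visible.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Cluster `points` (equal-length tuples) starting from initial `centroids`."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its old centroid
        if updated == centroids:  # nothing moved, so the loop has converged
            break
        centroids = updated
    return centroids, labels


# Hypothetical sample data: two compact, well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

Because the result depends on the `centroids` argument, practical implementations usually run the algorithm from several starting points and keep the best result.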
- K-medoids Clustering:
- K-medoids is another partitioning method, similar to K-means, but it uses medoids instead of centroids.
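- A medoid is the data point within a cluster that minimizes the total dissimilarity to all other points in that cluster. Because a medoid is always an actual data point, it is more robust to outliers than a centroid, which is a mean that extreme values can pull away from the bulk of the cluster.
- K-medoids is less sensitive to outliers than K-means, but like K-means it still depends on the initial choice of medoids and favors compact, globular clusters.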
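The difference between a medoid and a centroid is easiest to see on a small cluster with one outlier. The sketch below is a rough illustration under stated assumptions: the helper names (`medoid`, `centroid`, `kmedoids`), the Euclidean distance, and the sample data are all chosen for the example. The `kmedoids` function uses the simple alternating assign/update scheme rather than the classic PAM swap procedure.

```python
import math


def medoid(cluster):
    """Return the member of `cluster` with the smallest total distance to the rest."""
    return min(cluster, key=lambda c: sum(math.dist(c, p) for p in cluster))


def centroid(cluster):
    """Return the coordinate-wise mean of `cluster`."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def kmedoids(points, medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(medoids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest medoid.
        labels = [
            min(range(len(medoids)), key=lambda j: math.dist(p, medoids[j]))
            for p in points
        ]
        # Update step: each medoid becomes the most central member of its cluster.
        updated = []
        for j, old in enumerate(medoids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(medoid(members) if members else old)
        if updated == medoids:
            break
        medoids = updated
    return medoids, labels


# Hypothetical cluster: four nearby points plus one far-away outlier.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (50.0, 50.0)]
print(medoid(cluster))    # (2.0, 2.0): still one of the nearby points
print(centroid(cluster))  # (11.2, 11.2): dragged toward the outlier
```

The medoid stays inside the dense group while the mean lands where no data point exists, which is the robustness property described above.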
- K-means and K-medoids are more suitable for datasets with well-defined, globular clusters, while DBSCAN is better at handling clusters of irregular shapes.
- The choice of clustering method often depends on the nature of the data, the desired number of clusters, and the tolerance for outliers.
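To make the contrast with DBSCAN concrete, here is a minimal sketch of density-based clustering in plain Python. These notes do not spell out DBSCAN's mechanics, so the sketch follows the commonly described formulation and should be read as an illustration only. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a core point), the label `-1` for noise, and the sample data are assumptions made for the example.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical data: an L-shaped chain, a small compact group, and one stray point.
l_shape = [(float(x), 0.0) for x in range(6)] + [(5.0, float(y)) for y in range(1, 6)]
blob = [(20.0, 20.0), (20.0, 21.0), (21.0, 20.0)]
stray = [(10.0, 15.0)]
labels = dbscan(l_shape + blob + stray, eps=1.5, min_pts=3)
print(labels)  # eleven 0s, three 1s, then -1
```

The L-shaped chain comes out as a single cluster because each point is linked to the next through dense neighborhoods, and the stray point is marked as noise rather than being forced into a cluster. Note that the number of clusters is not specified up front; it follows from `eps` and `min_pts`.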