23rd October

Three widely used clustering methods are k-means, k-medoids, and DBSCAN.

k-means is a centroid-based clustering algorithm. The number of clusters, k, is chosen in advance. The algorithm starts by randomly selecting k centroids, which serve as the initial cluster centers. Each data point is then assigned to its nearest centroid, and each centroid is updated to the mean of the data points assigned to it. These two steps are repeated until the centroids no longer change. The result can depend on the initial choice of centroids, so the algorithm is often run several times with different starting points.
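
The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a tuned implementation: the function names (`k_means`, `squared_distance`, `mean_point`), the `max_iter` cap, and the fixed `seed` are all assumptions made for the example, and points are assumed to be tuples of numbers.

```python
import random

def squared_distance(p, q):
    # Squared Euclidean distance between two points given as tuples.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    # Coordinate-wise average of a non-empty list of points.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 3: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean_point(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        # Stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
```

On this small data set the two well-separated groups end up in separate clusters. On harder data, k-means can settle on a poor local solution, which is why repeated runs with different seeds are common.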

k-medoids is closely related to k-means, but each cluster is represented by a medoid instead of a mean. The medoid is the actual data point in the cluster whose total distance to the other points in that cluster is smallest. Because the representative must be a real data point and cannot be dragged toward an extreme value the way an average can, k-medoids is more robust to outliers than k-means.
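
To make the difference from k-means concrete, here is a minimal sketch of k-medoids that uses a simple alternating scheme (assign points, then pick the best medoid per cluster). This is one of several ways to fit k-medoids and is not the classic PAM swap procedure; the name `k_medoids`, the `max_iter` cap, and the `seed` parameter are assumptions for the example.

```python
import math
import random

def k_medoids(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    medoids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assign every point to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[nearest].append(p)
        # In each cluster, the new medoid is the member with the smallest
        # total distance to all other members.
        new_medoids = [
            min(c, key=lambda cand: sum(math.dist(cand, q) for q in c)) if c else medoids[i]
            for i, c in enumerate(clusters)
        ]
        # Stop once the medoids no longer change.
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters

# One cluster with a single extreme outlier at 100.
points = [(1,), (2,), (3,), (4,), (100,)]
medoids, clusters = k_medoids(points, k=1)
print(medoids)  # [(3,)]
```

The mean of these five values is 22, far from most of the data, while the medoid stays at 3. That is the robustness to outliers described above.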

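DBSCAN is a density-based clustering algorithm. It works by identifying regions of high density in the data set. A data point is a core point if at least a minimum number of data points (minPts) lie within a certain distance (epsilon) of it. A cluster is made up of core points that are within epsilon of one another, together with any non-core points that lie within epsilon of one of those core points (border points). Points that belong to no cluster are labeled as noise. Unlike k-means and k-medoids, DBSCAN does not need the number of clusters in advance. It can identify clusters of arbitrary shape and size, and it is robust to outliers because it leaves them unassigned as noise.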
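
The following is a minimal sketch of DBSCAN in plain Python, written for clarity and not for speed (it compares every pair of points). The names `dbscan`, `eps`, `min_pts`, and `NOISE` are chosen for this example. It also assumes the common convention that a point counts as its own neighbor when checking against `min_pts`; other descriptions count only the other points.

```python
import math

NOISE = -1  # label used for points that belong to no cluster

def dbscan(points, eps, min_pts):
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            # Not a core point; may still become a border point later.
            labels[i] = NOISE
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                # j is also a core point, so the cluster expands through it.
                queue.extend(j_neighbors)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two dense squares become clusters 0 and 1, and the isolated point at (50, 50) is labeled as noise without the number of clusters ever being specified.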
