K-Medoids is a clustering algorithm used in data analysis and machine learning. It’s a variation of the more common K-Means clustering method. K-Medoids, also known as Partitioning Around Medoids (PAM), is used to group similar data points into clusters, with a focus on robustness and the ability to handle outliers. Here’s a simple explanation:
Start with a set of data points, such as our dataset of police shooting records. In K-Medoids, we select k initial data points as “medoids” or cluster centers. These are not necessarily the means of the data points, as in K-Means, but actual data points from our dataset. Each data point is assigned to the nearest medoid, creating clusters based on proximity. The medoids are then updated by selecting the data point that minimizes the total dissimilarity (distance) to other data points within the same cluster. Repeat iteratively until the clusters no longer change significantly.
K-Medoids is especially useful when dealing with data that may have outliers or when you want to identify the most central or representative data points within clusters. It’s a bit more robust than K-Means because it doesn’t rely on the mean, which can be sensitive to extreme values. Instead, it uses actual data points as medoids, making it better suited for certain types of datasets. In your project, you might use K-Medoids to cluster police shooting records, potentially identifying the most representative cases within each cluster.