Book ; Online: Silhouette Aggregation
From Micro to Macro
2024
Abstract | The silhouette coefficient is an established internal clustering evaluation measure that produces a score per data point, assessing the quality of its cluster assignment. To assess the quality of the clustering of the whole dataset, the scores of all points in the dataset are either averaged directly into a single value (micro-averaging) or first averaged within each cluster and then averaged across clusters (macro-averaging). As we illustrate in this work using a synthetic example, the micro-averaging strategy is sensitive to both cluster imbalance and outliers (background noise), while macro-averaging is far more robust to both. Furthermore, macro-averaging allows cluster-balanced sampling, which yields a robust computation of the silhouette score. By conducting an experimental study on eight real-world datasets, in which we estimate the ground-truth number of clusters, we show that both coefficients, micro and macro, should be considered. |
---|---|
Keywords | Computer Science - Machine Learning ; Computer Science - Artificial Intelligence |
Subject code | 310 |
Publishing date | 2024-01-11 |
Publishing country | us |
Document type | Book ; Online |
Database | BASE - Bielefeld Academic Search Engine (life sciences selection) |
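
The two aggregation strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`silhouette_samples`, `micro_silhouette`, `macro_silhouette`) and the one-dimensional toy data are assumptions made for illustration, and the per-point score uses the standard definition s(i) = (b - a) / max(a, b) with Euclidean distance.

```python
from collections import defaultdict
from math import dist


def silhouette_samples(points, labels):
    """Per-point silhouette scores s(i) = (b - a) / max(a, b)."""
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Singleton cluster: the score is 0 by convention.
            scores.append(0.0)
            continue
        # a: mean distance to the other points of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def micro_silhouette(points, labels):
    """Micro-average: mean of the scores over all points."""
    scores = silhouette_samples(points, labels)
    return sum(scores) / len(scores)


def macro_silhouette(points, labels):
    """Macro-average: mean per cluster, then mean of the cluster means."""
    scores = silhouette_samples(points, labels)
    per_cluster = defaultdict(list)
    for score, lab in zip(scores, labels):
        per_cluster[lab].append(score)
    means = [sum(v) / len(v) for v in per_cluster.values()]
    return sum(means) / len(means)


# Hypothetical toy data: a large, compact cluster and a small, loose one.
points = [(0,), (1,), (2,), (3,), (4,), (5,), (8,), (12,)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

micro = micro_silhouette(points, labels)
macro = macro_silhouette(points, labels)
print(f"micro={micro:.4f}  macro={macro:.4f}")
```

On this toy data the micro-average comes out higher than the macro-average, because the six well-separated points of the large cluster dominate the overall mean while the macro-average gives the small, loose cluster equal weight. When all clusters have the same size, the two aggregates coincide.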