Clustering Challenges
Clustering is one of the most widely used methods in unsupervised analysis. Yet its results are often unstable, arbitrary, and difficult to justify, limitations that sit uneasily with the requirements of Responsible AI and the EU AI Act.
Why Clustering Is Problematic
Unlike supervised models, clustering has no ground truth to rely on. Algorithms must “invent” a structure from the data, which introduces:
- high variability of results
- dependence on hyperparameters
- arbitrary choices that are hard to justify
- lack of explainability
- limited reproducibility
These limitations become critical in a demanding regulatory context such as the AI Act.
Variability and Instability
Different Results at Each Execution
Many algorithms (k‑means, GMM, spectral clustering…) depend on a random initialization: two runs with identical settings on the same data can produce different clusters simply because the starting points differ.
The k‑means objective function aims to minimize:
$$ J = \sum_{i=1}^{k} \sum_{x \in C_i} \| x - \mu_i \|^2 $$
This objective is non-convex, so the algorithm only converges to a local minimum; which one it reaches depends heavily on the initialization of the centers $\mu_i$, which explains the variability.
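The effect is easy to reproduce. The sketch below (the `kmeans` and `objective` helpers are illustrative, not from any library) runs Lloyd's algorithm on the same tiny 1-D dataset from two different initializations: one reaches the global minimum of $J$, the other gets stuck in a much worse local minimum.

```python
def kmeans(points, centers, iters=50):
    """Lloyd's algorithm on 1-D data: alternate assignment and mean updates."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            clusters[nearest].append(x)
        # Recompute each center as the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else m for c, m in zip(clusters, centers)]
    return centers

def objective(points, centers):
    """The k-means objective J: sum of squared distances to the nearest center."""
    return sum(min((x - m) ** 2 for m in centers) for x in points)

data = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
print(objective(data, kmeans(data, [0.0, 10.0, 20.0])))  # 1.5   (good initialization)
print(objective(data, kmeans(data, [0.0, 0.5, 15.0])))   # 101.0 (poor local minimum)
```

Both runs converge, yet one partition is roughly 67 times worse than the other; nothing in the algorithm itself signals which run to trust.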
Sensitivity to Data
A slight modification of the dataset can lead to a completely different segmentation. This instability makes the results difficult to defend in front of an auditor.
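A minimal sketch of this effect (the `kmeans_labels` helper is illustrative): moving a single point by 0.5 shifts a centroid just enough that an untouched point at 4.0 switches cluster, even though the initialization is identical in both runs.

```python
def kmeans_labels(points, centers, iters=20):
    """Lloyd's algorithm on 1-D data; returns the final label of each point."""
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
                  for x in points]
        for j in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    # Final assignment against the converged centers.
    return [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            for x in points]

# Same initialization; only the first point moves, from -3.0 to -3.5.
print(kmeans_labels([-3.0, 4.0, 5.1, 10.0], [0.0, 10.0]))  # [0, 0, 1, 1]
print(kmeans_labels([-3.5, 4.0, 5.1, 10.0], [0.0, 10.0]))  # [0, 1, 1, 1]
```

The point at 4.0 was never modified, yet its segment changes: a cluster membership can hinge on data it has no apparent relation to.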
Arbitrariness of Hyperparameters
Choosing the number of clusters, distance metrics or density parameters often relies on heuristics. These choices are rarely scientifically justifiable.
Example: the “silhouette score”, often used to choose $k$:
$$ s(i) = \frac{b(i) - a(i)}{\max(a(i), b(i))} $$
where $a(i)$ is the mean distance from point $i$ to the other members of its own cluster and $b(i)$ is the mean distance from $i$ to the points of the nearest other cluster. This type of metric has no strong regulatory or scientific justification, limiting its use under the AI Act.
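The formula above can be computed directly. The `silhouette` helper below is an illustrative 1-D sketch (it assumes every cluster has at least two members, so $a(i)$ is well defined):

```python
def silhouette(points, labels, i):
    """s(i) from the formula above, for 1-D data.

    a(i): mean distance from point i to the other members of its cluster
    b(i): mean distance from point i to the nearest other cluster
    Assumes every cluster has at least two members.
    """
    same = [abs(points[i] - x) for x, lab in zip(points, labels)
            if lab == labels[i]]
    a = sum(same) / (len(same) - 1)  # the self-distance 0 drops out of the mean
    b = min(
        sum(abs(points[i] - x) for x, lab in zip(points, labels) if lab == c)
        / sum(1 for lab in labels if lab == c)
        for c in set(labels) if c != labels[i]
    )
    return (b - a) / max(a, b)

points = [0.0, 1.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
print(silhouette(points, labels, 0))  # (10.5 - 1.0) / 10.5 ≈ 0.905
```

A value near 1 suggests a well-separated point, but "near 1" is itself a heuristic threshold: the score ranks candidate values of $k$ without certifying any of them.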
Problem for the AI Act
For high-risk systems, the AI Act requires decisions that are explainable, documented and reproducible. Hyperparameters chosen by heuristic alone are incompatible with these requirements.
Lack of Explainability
Clusters are often difficult to interpret: why does a given individual belong to a given group? Traditional algorithms do not provide clear narrative or mathematical justification.
This lack of explainability limits the use of clustering in sensitive or regulated contexts.
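To make the gap concrete, here is an illustrative sketch (`membership_report` is a hypothetical helper) of everything a k-means model can say about why a point was assigned to a cluster:

```python
def membership_report(x, centers):
    """All k-means can offer as a 'reason': squared distances to each center.
    The justification is purely geometric, with no link to domain-meaningful
    features that an auditor or affected individual could understand."""
    dists = [(x - m) ** 2 for m in centers]
    return dists.index(min(dists)), dists

print(membership_report(4.0, [0.0, 10.0]))  # (0, [16.0, 36.0])
```

"Closest centroid" is a mathematically valid answer, but it is not a narrative explanation: it says nothing about which attributes drove the assignment or what the cluster means.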
Building a Responsible AI Culture
Responsible AI is not only a regulatory requirement — it is a strategic capability. MathIAs+™ Academy helps your teams master modern, sovereign practices.
Explore the Academy