Various metrics exist for evaluating the quality of a clustering.
Because we generate the data in our experiments, we also know the ground truth, i.e. the cluster each point originally belongs to. In such cases, the Rand Index (RI) and the Adjusted Rand Index (ARI) are good comparison metrics.
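The RI compares the original clustering with the predicted clustering over all pairs of points, and measures the fraction of pairs on which the two agree: pairs placed in the same cluster by both, or in different clusters by both.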
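The pair-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the experiments; the function name `rand_index` and the argument names are assumptions made for the example, and returning 1.0 for fewer than two points is a convention chosen here.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two clusterings agree.

    Hypothetical helper for illustration; labels can be any hashable values.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # assumed convention when there are no pairs to compare

    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both clusterings put the two
        # points together, or both keep them apart.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


if __name__ == "__main__":
    # Same partition with renamed labels: every pair agrees.
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # One ground-truth cluster split in two: one of six pairs disagrees.
    print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))
```

Note that the score depends only on which points are grouped together, not on the label values themselves, so no matching of cluster names between the two clusterings is needed.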
The raw RI score can be adjusted for chance, which yields the ARI.
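One common way to make the adjustment concrete is the form (index - expected index) / (maximum index - expected index), computed on pair counts from the contingency table of the two clusterings. The sketch below assumes this widely used Hubert and Arabie variant; the text does not state which variant the experiments use, and the name `adjusted_rand_index` and the handling of the degenerate case are assumptions for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-adjusted agreement between two clusterings (assumed variant).

    Hypothetical helper for illustration, based on pair counts.
    """
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumed convention when there are no pairs to compare

    # Pairs grouped together in both clusterings (contingency-table cells).
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs grouped together within each clustering on its own.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    maximum = 0.5 * (sum_true + sum_pred)
    if maximum == expected:
        # Both clusterings are the same trivial partition
        # (all points together, or every point alone).
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Under this variant, identical partitions score 1, agreement at the level expected by chance scores about 0, and worse-than-chance agreement can be negative.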
Consider all the possible pairings (original and possible predictions). The maximum RI of all of these pairings form the