Description
Good embeddings should yield useful clusters. In one paper (Zhang et al., "Learning Node Embeddings in Interaction Graphs") I found the following paragraph describing how the quality of the clusters can be evaluated:
> Clustering. We first use K-Means to test embeddings on the unsupervised task. We use Normalized Mutual Information (NMI) [23] score to evaluate clustering results. The NMI score is between 0 and 1. The larger the value, the better the performance. A labeling will have score 1 if it matches the ground truth perfectly, and 0 if it is completely random. Since entities in the Yelp dataset are multi-labeled, we ignore the entities that belong to multiple categories when calculate NMI score.
With our toy set we can create ground-truth labels and evaluate the embedding technique this way (see the sketch below). We can even compare it against directly applying other techniques (metapath2vec, DeepWalk, etc.).
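A minimal sketch of that evaluation, assuming the embeddings are available as a NumPy array and the toy-set labels as an integer array (both are placeholders here):

```python
# Cluster the learned embeddings with K-Means and score the result against
# ground-truth labels using Normalized Mutual Information (NMI).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16))     # stand-in for learned node embeddings
true_labels = rng.integers(0, 4, size=100)  # stand-in for toy-set ground-truth labels

n_clusters = len(np.unique(true_labels))
pred_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)

# NMI lies in [0, 1]: 1 means the clustering matches the ground truth perfectly,
# values near 0 indicate an essentially random assignment.
nmi = normalized_mutual_info_score(true_labels, pred_labels)
print(f"NMI: {nmi:.3f}")
```

Running the same script with embeddings from metapath2vec, DeepWalk, etc. would give directly comparable NMI scores.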
For the real data sets no ground truth is known, so we must characterise the clustering quality in a different way.
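One possible option (an assumption on my part, not something from the paper) is an internal metric that needs no labels, such as the silhouette score computed from the embeddings alone:

```python
# Label-free clustering evaluation via the silhouette score, swept over k.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16))  # stand-in for real-data embeddings

for k in range(2, 8):
    pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    # Silhouette lies in [-1, 1]; higher values mean tighter, better-separated clusters.
    print(k, round(silhouette_score(embeddings, pred), 3))
```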