Introduction
The Metric module provides a set of widely used metrics for evaluating probabilistic topic models after training, including Accuracy for document classification, Word Perplexity for document modeling, Normalized Mutual Information for document clustering, and Topic Coherence for measuring topic quality.
The available metrics are as follows:
Metrics | API |
---|---|
Classification Accuracy | ACC(x_tr, x_te, y_tr, y_te, model) |
Cluster Accuracy | Cluster_ACC(y_true, y_pred) |
Normalized Mutual Information | NMI(y_true, y_pred) |
Perplexity | Perplexity(x, x_reconstruct) |
Poisson Likelihood | Poisson_Likelihood(x, x_re) |
Reconstruction Error | Reconstruct_Error(x, x_re) |
Topic Coherence | Topic_Coherence |
All metric APIs are included in pydpm/_metric/…
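To make two of the metrics above concrete, here is a minimal standard-library sketch of how word perplexity and normalized mutual information are commonly defined. It is not pydpm's implementation: the function names `perplexity` and `nmi` are hypothetical, the (vocabulary, documents) layout of the count matrix is an assumption, and the arithmetic-mean normalization used for NMI is only one of several conventions.

```python
import math
from collections import Counter


def perplexity(x, x_re):
    """Word perplexity of counts x under a reconstruction x_re.

    Both arguments are lists of rows with shape (vocabulary, documents);
    this layout is an assumption of the sketch.
    """
    n_words, n_docs = len(x), len(x[0])
    log_lik, total = 0.0, 0.0
    for d in range(n_docs):
        # Normalize each document's reconstruction into a word distribution.
        col_sum = sum(x_re[v][d] for v in range(n_words))
        for v in range(n_words):
            if x[v][d] > 0:
                p = max(x_re[v][d] / col_sum, 1e-300)  # guard against log(0)
                log_lik += x[v][d] * math.log(p)
                total += x[v][d]
    # Exponentiated negative average log-likelihood per observed word.
    return math.exp(-log_lik / total)


def nmi(y_true, y_pred):
    """Normalized mutual information between two label assignments."""
    n = len(y_true)
    c_true, c_pred = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(
        (c / n) * math.log(c * n / (c_true[a] * c_pred[b]))
        for (a, b), c in joint.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in c_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in c_pred.values())
    denom = (h_true + h_pred) / 2  # arithmetic-mean normalization
    return mi / denom if denom > 0 else 1.0


# A uniform reconstruction over 4 words gives a perplexity of 4.
counts = [[1, 0], [0, 2], [3, 0], [0, 1]]
uniform = [[1.0, 1.0] for _ in range(4)]
ppl = perplexity(counts, uniform)

# NMI ignores how clusters are named: swapped labels still match perfectly.
score_same = nmi([0, 0, 1, 1], [1, 1, 0, 0])
score_indep = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Lower perplexity means the reconstruction assigns higher probability to the observed words, while NMI ranges from 0 (independent assignments) to 1 (identical partitions).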
Example
A demo of Classification Accuracy:
from pydpm._metric import ACC
from pydpm._model import PGBN
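# train_data, test_data, train_label and test_label are assumed to be loaded beforehand
# create the model and deploy it on gpu or cpu
model = PGBN([128, 64, 32], device='gpu')
model.initial(train_data)
train_local_params = model.train(100, train_data)
# train_local_params is overwritten here with the result of running test() on the training data
train_local_params = model.test(100, train_data)
test_local_params = model.test(100, test_data)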
# evaluate the model with classification accuracy
# the demo reaches an accuracy of 0.8549
results = ACC(train_local_params.Theta[0], test_local_params.Theta[0], train_label, test_label, 'SVM')
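The 'SVM' argument above suggests that a classifier is fitted on the training features and scored on the test features. The sketch below illustrates that idea with scikit-learn on synthetic data. It is an assumption about the general procedure, not pydpm's code: `svm_accuracy` is a hypothetical helper, the linear kernel is an arbitrary choice, and the random features merely stand in for topic proportions.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC


def svm_accuracy(theta_tr, theta_te, y_tr, y_te):
    """Fit a linear SVM on training features and score it on test features.

    Inputs are (documents, features) arrays; transpose first if the
    features are stored as (features, documents).
    """
    clf = SVC(kernel="linear")
    clf.fit(theta_tr, y_tr)
    return accuracy_score(y_te, clf.predict(theta_te))


# Synthetic stand-in for document features: two well-separated classes.
rng = np.random.default_rng(0)
y_tr = np.repeat([0, 1], 50)
y_te = np.repeat([0, 1], 20)
theta_tr = rng.normal(loc=y_tr[:, None] * 5.0, scale=0.5, size=(100, 8))
theta_te = rng.normal(loc=y_te[:, None] * 5.0, scale=0.5, size=(40, 8))

acc = svm_accuracy(theta_tr, theta_te, y_tr, y_te)
```

The returned value is the fraction of test documents whose predicted label matches the true label, which is the quantity a classification accuracy metric reports.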