Introduction

The Metric module provides a series of widely used topic modeling metrics to evaluate these probabilistic topic models after training, including Accuracy for document classification, Word Perplexity for document modeling, Normalized Mutual Information for document clustering and Topic Coherence for measuring topic quality.

Metrics are as following:

Metrics	API
Classification Accuracy	ACC(x_tr, x_te, y_tr, y_te, model)
Cluster Accuracy	Cluster_ACC(y_true, y_pred)
Normalized Mutual Information	NMI(y_true, y_pred)
Perplexity	Perplexity(x, x_reconstruct)
Poisson Likelihood	Poisson_Likelihood(x, x_re)
Reconstruction Error	Reconstruct_Error(x, x_re)
Topic Coherence	Topic_Coherence

All metric API are included in pydpm/_metric/…

Example

The demo of Classification Accuracy:

from pydpm._metric import ACC
from pydpm._model import PGBN

# create the model and deploy it on gpu or cpu
model = PGBN([128, 64, 32], device='gpu')
model.initial(train_data)
train_local_params = model.train(100, train_data)
train_local_params = model.test(100, train_data)
test_local_params = model.test(100, test_data)

# evaluate the model with classification accuracy
# the demo accuracy can achieve 0.8549
results = ACC(train_local_params.Theta[0], test_local_params.Theta[0], train_label, test_label, 'SVM')