Approximating The DCM

Rasmus Elsborg Madsen

Abstract: The Dirichlet compound multinomial (DCM), which has recently been
shown to be well suited for modeling word burstiness in documents,
is investigated here. A number of conceptual explanations that account
for these recent results are provided. An exponential family approximation
of the DCM is also provided; it is substantially faster to train while
still producing similar probabilities and classification performance.
Keywords: Dirichlet approximation DCM
Type: Technical report
Year: 2005    Month: December
IMM Group(s): Intelligent Signal Processing

