Abstract. The Dirichlet compound multinomial (DCM), which has recently been
shown to be well suited for modeling word burstiness in documents,
is here investigated. A number of conceptual explanations that account
for these recent results are provided. An exponential family approximation
of the DCM that is substantially faster to train, while still producing
similar probabilities and classification performance, is also provided.