Deep Belief Nets Topic Modeling |
|
Abstract | This thesis is conducted in collaboration with Issuu, an online publishing company. To analyze the vast number of documents on its platform, Issuu uses Latent Dirichlet Allocation as a topic model.
Geoffrey Hinton and Ruslan Salakhutdinov have introduced a new way to perform topic modeling, which they claim can outperform Latent Dirichlet Allocation. The topic model is based on the theory of Deep Belief Nets and maps the conceptual meaning of a document to a latent representation. The latent representation is a low-dimensional vector of binary numbers, which proves useful when comparing documents.
The thesis comprises the development of a toolbox for Deep Belief Nets for topic modeling, with which performance measurements have been conducted on the model itself and in comparison with Latent Dirichlet Allocation. |
Type | Master's thesis [Academic thesis] |
Year | 2014 |
Publisher | Technical University of Denmark, Department of Applied Mathematics and Computer Science |
Address | Matematiktorvet, Building 303B, DK-2800 Kgs. Lyngby, Denmark, compute@compute.dtu.dk |
Series | DTU Compute M.Sc.-2014 |
Note | DTU supervisor: Ole Winther, olwi@dtu.dk, DTU Compute; external supervisor: Morten Arngren, Issuu |
Electronic version(s) | [pdf] |
Publication link | http://www.compute.dtu.dk/English.aspx |
BibTeX data | [bibtex] |
IMM Group(s) | Intelligent Signal Processing |