Deep Belief Nets Topic Modeling



Abstract: This thesis was conducted in collaboration with Issuu, an online publishing company. To analyze the vast number of documents on its platform, Issuu uses Latent Dirichlet Allocation as a topic model.
Geoffrey Hinton & Ruslan Salakhutdinov have introduced a new way to perform topic modeling, which they claim can outperform Latent Dirichlet Allocation. The topic model is based on the theory of Deep Belief Nets and maps the conceptual meaning of a document into a latent representation. The latent representation is a low-dimensional vector of binary numbers, which proves useful when comparing documents.
The thesis comprises the development of a toolbox for Deep Belief Nets for topic modeling, with which performance measurements have been conducted on the model itself and in comparison to Latent Dirichlet Allocation.
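To illustrate why a binary latent representation is convenient for comparing documents, the following is a minimal sketch that ranks documents by Hamming distance between their codes. It is not taken from the thesis toolbox: the function names, the 8-bit code length, and the example codes are all assumptions made for illustration, and in practice the codes would be produced by a trained model.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest(query, codes):
    """Return the name of the document whose code is closest to `query`."""
    return min(codes, key=lambda name: hamming_distance(query, codes[name]))


# Hypothetical 8-bit codes (assumed values, not real model output).
codes = {
    "doc_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "doc_b": [1, 0, 1, 0, 0, 0, 1, 0],
    "doc_c": [0, 1, 0, 0, 1, 1, 0, 1],
}
query = [1, 0, 1, 1, 0, 0, 1, 1]

closest = nearest(query, codes)
```

Because each comparison is a simple bit count, similarity search over many documents stays cheap compared with distances over real-valued vectors.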
Type: Master's thesis [Academic thesis]
Year: 2014
Publisher: Technical University of Denmark, Department of Applied Mathematics and Computer Science
Address: Matematiktorvet, Building 303B, DK-2800 Kgs. Lyngby, Denmark, compute@compute.dtu.dk
Series: DTU Compute M.Sc.-2014
Note: DTU supervisor: Ole Winther, olwi@dtu.dk, DTU Compute; external supervisor: Morten Arngren, Issuu
Publication link: http://www.compute.dtu.dk/English.aspx
IMM Group(s): Intelligent Signal Processing