Semantics in user-added text for categorizing press releases |
Simon Paarlberg
|
Abstract | The aim of this thesis is to analyze and test whether Latent Semantic Analysis (LSA) can be used to improve the delivery of targeted press releases. This is done by using the existing content of press releases as a basis for finding relevant media outlets. The thesis focuses on explaining how LSA works through examples, using the free software package gensim. Various approaches to using LSA are covered, along with background information on the media industry.
The experiments in this thesis were conducted on data from 138,363 articles from 28 Danish online news outlets and the Danish version of Wikipedia. The results are inconclusive, most likely because the dataset was not large enough. |
Type | Bachelor of Engineering thesis [Industrial collaboration] |
Year | 2012 |
Publisher | Technical University of Denmark, DTU Informatics, E-mail: reception@imm.dtu.dk |
Address | Asmussens Alle, Building 305, DK-2800 Kgs. Lyngby, Denmark |
Series | IMM-B.Eng.-2012-22 |
Note | Supervised by Associate Professor Michael Kai Petersen, mkp@imm.dtu.dk, DTU Informatics. |
Electronic version(s) | [pdf] |
Publication link | http://www.imm.dtu.dk/English.aspx |
BibTeX data | [bibtex] |
IMM Group(s) | Intelligent Signal Processing |