Semantics in user-added text for categorizing press releases

Simon Paarlberg

Abstract: The aim of this thesis is to analyze and test whether Latent Semantic Analysis (LSA) can be used to improve the delivery of targeted press releases. This is done by using the existing content of press releases as a basis for finding relevant media outlets. The thesis explains how LSA works through examples, using the free software package gensim. Various approaches to using LSA are covered, along with background information on the media industry.
The experiments were conducted on data from 138,363 articles from 28 Danish online news outlets and the Danish version of Wikipedia. The results are inconclusive, most likely because the dataset was not big enough.
Type: Bachelor of Engineering thesis [Industrial collaboration]
Publisher: Technical University of Denmark, DTU Informatics, E-mail:
Address: Asmussens Alle, Building 305, DK-2800 Kgs. Lyngby, Denmark
Note: Supervised by Associate Professor Michael Kai Petersen, DTU Informatics.
IMM Group(s): Intelligent Signal Processing

