Application of machine learning in analysis of answers to open-ended questions in survey data

Philip Pries Henningsen

Abstract: The goal of the thesis is to implement a framework for analyzing answers to open-ended questions in a semi-automated way, thereby lowering the cost of including open-ended questions in a survey. To do this, techniques from the machine learning branch of computer science are explored. More specifically, the thesis focuses on two methods, latent semantic analysis and non-negative matrix factorization. These techniques are used to extract topics from the answers, which enables me to cluster the answers according to these topics. The clustering is done using k-means clustering. The framework is implemented in the Python programming language.
Type: Bachelor thesis [Academic thesis]
Year: 2013
Publisher: Technical University of Denmark, DTU Compute, E-mail: compute@compute.dtu.dk
Address: Matematiktorvet, Building 303-B, DK-2800 Kgs. Lyngby, Denmark
Series: B.Sc.-2013-25
Electronic version(s): [pdf]
Publication link: http://www.compute.dtu.dk/English.aspx
BibTeX data: [bibtex]
IMM Group(s): Computer Science & Engineering
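
The pipeline outlined in the abstract (extract topics from the answers with a matrix factorization, then cluster the answers with k-means) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes raw word counts as features, non-negative matrix factorization via the standard multiplicative update rules, and a plain k-means on the resulting answer-by-topic weights. The function names (`build_matrix`, `nmf`, `kmeans`), the parameter defaults, and the sample answers are all hypothetical and chosen only for illustration.

```python
import random

def build_matrix(docs):
    """Return (vocab, V), where V[i][j] counts term j in answer i."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            V[i][index[w]] += 1.0
    return vocab, V

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (answers x terms) by W (answers x k) times H (k x terms).

    Uses multiplicative updates, which keep W and H non-negative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical survey answers, used only to exercise the sketch.
docs = [
    "price too high expensive price",
    "expensive price high",
    "staff friendly helpful staff",
    "helpful friendly staff",
]
vocab, V = build_matrix(docs)
W, H = nmf(V, k=2)        # W: topic weights per answer, H: term weights per topic
labels = kmeans(W, k=2)   # cluster the answers by their topic weights
for doc, label in zip(docs, labels):
    print(label, doc)
```

Each row of `H` can be read as a topic (a weighting over the vocabulary) and each row of `W` as one answer's mix of topics, which is what k-means then groups. Latent semantic analysis would replace the NMF step with a truncated singular value decomposition of the same matrix. Both the factorization and k-means depend on their random initialization, so results can vary with the seed.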