@MASTERSTHESIS{IMM2013-07073, author = "C. Kristiansen and D. Shanti", title = "Integrating Solutions to Solving the Cold Start Problem in the Wikipedia Recommender System", year = "2013", school = "Technical University of Denmark, Department of Applied Mathematics and Computer Science", address = "Richard Petersens Plads, Building 324, {DK-}2800 Kgs. Lyngby, Denmark, compute@compute.dtu.dk", type = "", note = "{DTU} supervisor: Christian D. Jensen, cdje@dtu.dk, {DTU} Compute", url = "http://www.compute.dtu.dk/english", abstract = "Wikipedia is a free online encyclopedia that can be edited by anyone. Due to its open nature, it can be difficult for readers to assess the quality of the articles, since the articles may contain errors due to vandalism or simply due to lack of knowledge on the part of the editors. The Wikipedia Recommender System (WRS) is a collaborative filtering system that provides readers with individual rating predictions for Wikipedia articles, based on ratings given by other users for that article. The ratings are weighted according to the similarity with the reader’s ratings on other articles. The purpose of the ratings is to give the reader an indication of the quality of the articles and thereby increase trust in Wikipedia articles. However, the system is currently a prototype and as such it has no real users or any user-generated data. When readers start using {WRS}, the data in the system is expected to be sparse, and as Wikipedia contains a tremendous number of articles, a reader may be required to rate a large number of articles before {WRS} is able to provide useful predictions for the reader. This phenomenon is well known in collaborative filtering systems as the cold start problem. The purpose of this thesis is to investigate solutions to the cold start problem. This thesis focuses on integrating two techniques for mitigating the cold start problem. 
The first is to use data from the external service WikiTrust, which calculates ratings for Wikipedia articles by computing a trust value for the content. The second technique relies on the concept of similarity propagation. This is utilized when a reader requires a predicted rating for a given article but has given only a few ratings, none of which are similar to those of anyone who has rated the article in question. In this case {WRS} may determine that the reader is similar to another {WRS} user, who is in turn similar to a third {WRS} user whose rating can be used for the prediction. The result of this thesis is the prototype of an improved version of {WRS} which is far more capable in the cold start situation. With this improved version of {WRS}, readers of Wikipedia are given an advanced tool that can help them determine the quality of Wikipedia articles in advance, and the tool has the potential to help the millions of internet users who visit Wikipedia every day." }