@MASTERSTHESIS{IMM2015-06887,
  author = "A. M. Nielsen and B. D. Hughes",
  title = "Real-time suggestions for improving engagement of social media posts using machine learning",
  year = "2015",
  school = "Technical University of Denmark, Department of Applied Mathematics and Computer Science",
  address = "Richard Petersens Plads, Building 324, {DK-}2800 Kgs. Lyngby, Denmark, compute@compute.dtu.dk",
  type = "",
  note = "{DTU} supervisors: David Kofoed Wind and Ole Winther, olwi@dtu.dk, in collaboration with Falcon Social",
  url = "http://www.compute.dtu.dk/English.aspx",
  abstract = "This thesis attempted to apply predictive analytics to Facebook and integrate it into an enterprise environment in collaboration with the Danish company Falcon Social. While doing so, a framework that documents the process was outlined, which will also function as a way of implementing new big data projects and extending the one made in this thesis. Regression and classification models such as Random Forest, Decision Tree, and {K-}Nearest Neighbours were applied and evaluated against each other. They all attempted to predict how a Facebook post would perform in terms of the number of achieved likes, consumptions and impressions. Through careful validation with an unseen test set, it was possible to predict significantly better than random guessing. In particular, prediction of likes seemed to have almost 72 \% accuracy on above/below average estimates. A Median Absolute Percentage Error (MdAPE) estimate of 52.80 \% on likes showed that regression was applicable as well. Some predictions proved to be far off, even more so when a post had achieved a number of likes much larger than the average the page would normally get. Overall, the predictions performed better than random, and there is still room for improvement. Random Forest seemed to almost always outperform or match the other models with statistical significance."
}