@MASTERSTHESIS{IMM2012-06473, author = "B. D. H{\o}yer", title = "Sequence Signals for Protein Expression", year = "2012", school = "Technical University of Denmark, {DTU} Informatics, {E-}mail: reception@imm.dtu.dk", address = "Asmussens Alle, Building 305, {DK-}2800 Kgs. Lyngby, Denmark", type = "", note = "{DTU} supervisor: Ole Winther, owi@imm.dtu.dk, {DTU} Informatics", url = "http://www.imm.dtu.dk/English.aspx", abstract = "This thesis was prepared at the department of Informatics and Mathematical Modelling at the Technical University of Denmark in fulfilment of the requirements for acquiring an M.Sc. in Mathematical Modelling and Computation. It was carried out in collaboration with Novozymes A/S. The thesis deals with codon optimisation as a method for increasing protein yields in an industrial setting. The goal is to provide an accurate and sufficient description of {DNA} to protein translation for a mathematician to get a grasp of the topic. A parallel goal is to provide some insight into machine learning techniques in biotechnology for biologists. Finally, the goal is to develop a model from a machine learning standpoint, rather than necessarily a biological one. This thesis is loosely divided into three parts. Part 1: Introduction and biological background. Topics from {DNA} transcription to {RNA} translation to proteins are covered. Part 2: Prior works and machine learning methodology. Part 3: Model building and validation." }