Section for Cognitive Systems
DTU Compute

02450 Introduction to Machine Learning and Data Mining

Morten Mørup
Tue Herlau
Mikkel N. Schmidt
Pegah Hafiz
Martin Jørgensen
Laura Rieger
Nicki Skafte Detlefsen
Morten Wehlast Jørgensen
Frederik Rahbæk Warburg
Malte K. E. Jensen
Benjamin Jüttner
Frederik B. Hüttel
Lorenzo Belgrano
Jakob P. Thorsbro

Machine learning and data mining

The course is designed around a data modeling framework shown in the figure. Each lecture/assignment will focus on an aspect of the data modeling framework.

[Figure: data modeling framework]

We emphasize a holistic view of modeling in order to motivate and stress the relevance of the individual components and building blocks, to disseminate the obtained competences (see the course learning objectives), and to make them applicable to a broad spectrum of engineering problems in, e.g., biomedical engineering, chemistry, electrical engineering, and informatics.



The lectures take place in Building 306, auditoriums 31 and 32, on Tuesdays from 13:00 to 15:00, followed by exercises from 15:00 to 17:00 in the following locations:

Python: Building 303A, group areas east and west, as well as auditorium 42

Matlab: Building 308, first floor, group areas north and south, as well as Building 306, first floor, group areas east and west, and auditorium 31

R: Building 450, rooms 006A and 006B

Please bring a laptop computer for the exercises.

Reading material, lecture slides and exercises

The course uses lecture notes and other freely available material. Lecture notes, slides, course assignment instructions, etc. are available at the DTU Campusnet course page (requires formal enrolment in the course).

Course description

A description of the course can be found at the DTU Coursebase.

Online help and support

Online help and support is available through the Piazza course platform.


Lecture schedule

No.  Date               Lecturer  Subject                                                                 Preparation
1    30 January, 2018   MM        Introduction                                                            C1

Data: Feature extraction and visualization
2    6 February, 2018   MM        Data and feature extraction                                             C2, C3. (P3.1, P2.1, P3.2)
3    13 February, 2018  MM        Measures of similarity and summary statistics                           C4. (P4.1, P4.2, P4.3)
4    20 February, 2018  MM        Data Visualization and probability                                      C5, C6. (P5.1, P5.2, P6.1)

Supervised learning: Classification and regression
5    27 February, 2018  MM        Decision trees and linear regression (Hand in project 1 before 13:00)   C7, C8. (P8.1, P7.1, P7.2)
6    6 March, 2018      MM        Overfitting and performance evaluation                                  C9. (P9.1, P9.2, P9.3)
7    13 March, 2018     MM        Nearest Neighbor, Bayes and Naive Bayes                                 C10, C11. (P11.1, P11.2, P10.1)
8    20 March, 2018     MM        Artificial Neural Networks and Bias/Variance                            C12, C13. (P13.1, P13.2, P13.3)
9    3 April, 2018      MM        AUC and ensemble methods                                                C14, C15. (P14.1, P14.2, P15.1)

Unsupervised learning: Clustering and density estimation
10   10 April, 2018     MM        K-means and hierarchical clustering (Hand in project 2 before 13:00)    C16. (P16.1, P16.2, P16.3)
11   17 April, 2018     MM        Mixture models and density estimation                                   C17, C18. (P18.1, P17.1, P17.2)
12   24 April, 2018     MM        Association mining                                                      C19. (P19.1, P16.2, P16.3)
13   1 May, 2018        MM        Recap and discussion of the exam (Hand in project 3 before 13:00)       C1-C19

(Cx refers to Chapter x of the course notes. Px.y refers to problem number y in Chapter x of the course notes.
The first listed problem will be that week's discussion question at the exercises.)
