Section for Cognitive Systems
DTU Compute

02450 Introduction to Machine Learning and Data Mining

Georgios Arvanitidis
Bjørn Sand Jensen
Tue Herlau
Morten Mørup
Mikkel N. Schmidt
Anders Stevnhoved Olsen
Lenka Hýlová
Meadhbh Healy
Sam Walton Norwood
Bjarke Arnskjær Hastrup
Saeid Barzegarkhordehbalagh
Emilie Wedenborg
Yevhenii Osadchuk
Bjarke Lars Verner Bruhn Erichsen
Vimal Velusamy Bhararhi
Zalán Zsiborás
Kalle Leander Johansen
Lucie Fontaine
Ruijie Ren
Ousama Mhadden
Mathias Sofus Hovmark
Panagiotis Apostolidis
Albert Kjøller Jacobsen
Yeganeh Ghamary
Ana-Daria Zahaleanu

Machine learning and data mining

The course is designed around a data modeling framework shown in the figure. Each lecture/assignment will focus on an aspect of the data modeling framework.

[Figure: data modeling framework]

We emphasize a holistic view of modeling to motivate and stress the relevance of the individual components and building blocks, to disseminate the obtained competences (see the course learning objectives), and to make them applicable to a broad spectrum of engineering problems in, e.g., biomedical engineering, chemistry, electrical engineering, and informatics.

Resources

DTU Learn

If you are enrolled in the course, you can access the material and participate in the course through the DTU Learn homepage.

Lectures

The lectures will take place in Building 116, auditoriums 081 and 083, on Tuesdays from 13:00 to 15:00.

If you cannot attend the lectures in person, you can stream them online. All lectures will also be recorded and made available online.

Exercises

Exercises will take place after the lectures on Tuesdays from 15:00 to 17:00.

You will be able to attend the exercises online; however, physical attendance is preferred and highly recommended.

Please bring your own laptop to the exercise sessions. The exercises will be available in Matlab, R, and Python, and we recommend selecting a language you are familiar with. If you are not familiar with any of these languages, we recommend Python.

The exercise rooms are (room capacity in square brackets and programming language in parentheses):

Virtual exercise rooms on Microsoft Teams:

Reading material, lecture slides and exercises

The course will use lecture notes and other freely available material. Lecture notes, slides, course assignment instructions, etc. are available on the DTU Learn course page (requires formal enrolment in the course).

Online demos

We have developed several online demos that illustrate key concepts from the course. The topics currently covered include PCA, regression, classification, and density estimation.

Course description

A description of the course can be found at the DTU Coursebase.

Online help and support

Support outside the scheduled sessions is primarily available through the Piazza forum.

Teachers

Lecture schedule

No. | Date | Lecturer | Subject | Reading | Homework
1 | 30 August, 2022 | GA | Introduction | C1 |
Data: Feature extraction, and visualization
2 | 6 September, 2022 | GA | Data, feature extraction and PCA | C2, C3 | P3.1, P2.1, P3.2
3 | 13 September, 2022 | GA | Measures of similarity, summary statistics and probabilities | C4, C5 | P4.1, P4.2, P4.3
4 | 20 September, 2022 | GA | Probability densities and data visualization | C6, C7 | P6.1, P6.2, P7.1
Supervised learning: Classification and regression
5 | 27 September, 2022 | GA | Decision trees and linear regression | C8, C9 | P9.1, P8.1, P8.2
6 | 4 October, 2022 | GA | Overfitting, cross-validation and Nearest Neighbor (Project 1 due before 13:00) | C10, C12 | P10.1, P10.2, P12.1
7 | 11 October, 2022 | GA | Performance evaluation, Bayes, and Naive Bayes | C11, C13 | P13.1, P13.2, P12.2
Holiday
8 | 25 October, 2022 | GA | Artificial Neural Networks and Bias/Variance | C14, C15 | P15.1, P15.2, P15.3
9 | 1 November, 2022 | GA | AUC and ensemble methods | C16, C17 | P16.1, P16.2, P17.1
Unsupervised learning: Clustering and density estimation
10 | 8 November, 2022 | BJ | K-means and hierarchical clustering | C18 | P18.1, P18.2, P18.3
11 | 15 November, 2022 | BJ | Mixture models and density estimation (Project 2 due before 13:00) | C19, C20 | P20.1, P19.1, P19.2
12 | 22 November, 2022 | BJ | Association mining | C21 | P21.1, P18.2, P18.3
Recap
13 | 29 November, 2022 | GA | Recap and discussion of the exam | C1-C21 |

(Cx refers to Chapter x of the course notes. Px.y refers to problem number y in Chapter x of the course notes.
The first listed problem will be that week's discussion question at the exercises.)

FAQ
