Detailed Information
This course is given as a one-week course in August at the Technical University of Denmark (DTU). Subsequently, the students spend one month applying the methods to their own data. The course is a 5 ECTS course and is open both to all PhD students and to everyone else via Open University. DTU students should sign up using CampusNet. For information on how to apply via Open University, see this link. For guest PhD students, information on how to sign up is found here: Guest PhDs
The course material consists of chapters from electronic textbooks and electronic papers. Most lectures will refer to the book "Elements of Statistical Learning" (ESL) by Hastie, Tibshirani and Friedman. This book is freely available from this link. References to other material will be given on CampusNet.
Lectures and exercises are in half-day modules, one per subject (8-12 and 13-17), and will take place at DTU, Lyngby Campus. We will make arrangements for lunch from 12-13, but students will need to pay for their own lunch. The schedule below is subject to minor changes; the content will be: cross-validation, model selection, the bias-variance trade-off, over- and underfitting, sparse regression, sparse classification, logistic regression, linear discriminant analysis, clustering, classification and regression trees, multiple hypothesis testing, principal component analysis, sparse principal component analysis, support vector machines, neural networks, self-organizing maps, random forests, boosting, non-negative matrix factorization, independent component analysis, archetypal analysis, and sparse coding.
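As a small taste of several of the topics above (cross-validation, model selection, and sparse regression), here is a minimal sketch in Python. It assumes scikit-learn is installed; the synthetic dataset and parameter grids are purely illustrative, not course material:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.model_selection import train_test_split

# Illustrative synthetic data: 100 samples, 20 features, only 5 informative
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Cross-validation selects the regularization strength for each model
ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, y_train)
lasso = LassoCV(cv=5, random_state=0).fit(X_train, y_train)

print("Ridge test R^2:", ridge.score(X_test, y_test))
print("Lasso test R^2:", lasso.score(X_test, y_test))
# The lasso's L1 penalty drives some coefficients exactly to zero
print("Lasso non-zero coefficients:", np.sum(lasso.coef_ != 0))
```

The contrast between the two fits illustrates the lecture themes: both models tune their penalty by cross-validation, but only the lasso produces a sparse coefficient vector.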
Module | Date | Subjects | Lecturer | Literature
1 | 24/8 | Introduction to computational data analysis [OLS, ridge] | Line | ESL Chapters 1, 2, 3.1, 3.2, 3.4.1, 4.1
2 | 24/8 | Model selection [CV, bootstrap, Cp, AIC, BIC, ROC] | Line | ESL Chapters 7 and 9.2.5 (sections 7.8 and 7.9 may safely be skipped)
3 | 25/8 | Sparse regression [lasso, elastic net] | Line | ESL Chapters 3.3, 3.4, 18.1, and 18.7
4 | 25/8 | Sparse classifiers [LDA, logistic regression] | Line | ESL Chapters 4.3, 4.4, 18.2, 18.3, 18.4, 5.1, and 5.2
5 | 26/8 | Non-linear learners [support vector machines, CART, and KNN] | Line | ESL Chapters 4.5, 4.4, 5.1, 5.2, 9.2, and 13.3
6 | 26/8 | Ensemble methods [bagging, random forests, boosting] | Line | ESL Chapters 8.7, 9.2, 10.1, and 15
7 | 27/8 | Subspace methods [PCA, SPCA, PLS, CCA, PCR] | Line | ESL Chapters 14.5.1, 14.5.5, and 3.5
8 | 27/8 | Unsupervised decompositions [ICA, NMF, AA, sparse coding] | Line | ESL Chapters 14.6 and 14.10, [Sparse Coding, Nature]
9 | 28/8 | Cluster analysis [hierarchical, K-means, GMM, gap statistic] | Line | ESL Chapter 14.3
10 | 28/8 | Artificial neural networks and self-organizing maps | Line | ESL Chapters 11.1-11.5 and 14.5
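Modules 7 and 9 cover subspace methods and cluster analysis; a minimal sketch of how the two combine in practice (scikit-learn assumed; the blob data and cluster count are illustrative):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic data: 3 well-separated clusters in 10 dimensions
X, y_true = make_blobs(n_samples=150, n_features=10, centers=3, random_state=0)

# Project onto the first two principal components, then cluster the projection
X_2d = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print("Projected shape:", X_2d.shape)
print("Cluster sizes:", np.bincount(labels))
```

Reducing the dimension with PCA before running K-means is a common pattern: the leading components retain most of the between-cluster variance while discarding noise directions.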
Students should participate in the course and hand in a small report on one or more of the course subjects, related to their own research. The course is graded pass/not pass. The deadline for the report is one month after the last lecture (i.e., end of September).
Line H. Clemmensen, Associate Professor, DTU Compute, Statistics and Data Analysis, lkhc[at]dtu.dk