Statistical Learning Theory, Spring Semester 2024
Instructors
Prof. Dr. Joachim M. Buhmann
Assistants
Vignesh Ram Somnath
Dr. Alina Dubatovka
Evgenii Bykovetc
Dr. Fabian Laumer
Ivan Ovinnikov
João Lourenço Borges Sá Carvalho
Patrik Okanovic
Robin Geyer
Xia Li
Yilmazcan Özyurt
News
The exam will be held on June 3, 2024, from 9 AM to 12 PM in rooms ETA F5 & ETF C1.
General Information
This is the last offering of the Statistical Learning Theory course.
The ETHZ Course Catalogue information can be found here.
The course covers advanced methods of statistical learning. The fundamentals of machine learning, as presented in the courses "Introduction to Machine Learning" and "Advanced Machine Learning", are expanded upon; in particular, the following topics are discussed:
 Variational methods and optimization. We consider optimization approaches for problems where the optimizer is a probability distribution. We will discuss concepts like maximum entropy, information bottleneck, and deterministic annealing.
 Clustering. This is the problem of sorting data into groups without using training samples. We discuss alternative notions of "similarity" between data points and adequate optimization procedures.
 Model selection and validation. This refers to the question of how complex the chosen model should be. In particular, we present an information theoretic approach for model validation.
 Statistical physics models. We discuss approaches for approximately optimizing large systems, which originate in statistical physics (free energy minimization applied to spin glasses and other models). We also study sampling methods based on these models.
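As a small taste of the statistical-physics material, the sketch below (illustrative only, not course code; all function and variable names are my own) runs a Metropolis sampler on a one-dimensional Ising chain, drawing spin configurations from a Gibbs distribution of the kind studied in the lectures:

```python
import math
import random

def metropolis_ising(n=20, beta=1.0, steps=5000, seed=0):
    """Metropolis sampling of a 1-D Ising chain with coupling J = 1.

    Energy: E(s) = -sum_i s_i * s_{i+1} (free boundary conditions).
    At inverse temperature beta, configurations are drawn from the
    Gibbs distribution p(s) proportional to exp(-beta * E(s)).
    """
    rng = random.Random(seed)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        # Energy change from flipping spin i (only its neighbors matter).
        neighbors = 0
        if i > 0:
            neighbors += spins[i - 1]
        if i < n - 1:
            neighbors += spins[i + 1]
        delta_e = 2 * spins[i] * neighbors
        # Accept the flip with probability min(1, exp(-beta * delta_e)).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i] = -spins[i]
    return spins

# At large beta (low temperature) neighboring spins tend to align.
sample = metropolis_ising(beta=2.0)
agreement = sum(sample[i] == sample[i + 1] for i in range(len(sample) - 1))
print(agreement, "of", len(sample) - 1, "neighbor pairs aligned")
```

Lowering beta toward zero recovers uniform random spins; slowly increasing it during sampling is the basic idea behind the annealing methods covered in the course.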
Please use Moodle for questions regarding course material, organization and projects.
Time and Place
Type      | Time            | Place
Lectures  | Mon 10:15–12:00 | HG E 7
          | Tue 17:15–18:00 | HG G 5
Exercises | Mon 16:15–18:00 | HG G 3
Course Script
The latest version of the course Script can be found here, with additional chapters on Graph Clustering and Mean Field Approximation at the end of the script.
An older version of the same script can be found here. It is no longer maintained, but it contains useful notes for some chapters not yet covered in the latest version.
Lectures
Tutorials
Date        | Tutorial                                        | Recording Links                                  | Exercises
February 26 | Calculus Recap, Functional Derivatives          | Recording                                        | Exercise 1, Solution 1
March 4     | Information Theory Recap (Taught on Blackboard) | Recording                                        | Exercise 2, Solution 2
March 11    | Sampling                                        | Unavailable                                      | Exercise 3, Solution 3
March 18    | Deterministic Annealing                         | Recording                                        | Exercise 4, Solution 4
March 25    | Histogram Clustering (Taught on Blackboard)     | Recording                                        | Exercise 5, Solution 5
April 08    | Information Bottleneck                          | Recording                                        | Exercise 6, Solution 6
April 15    | No Tutorial                                     |                                                  |
April 22    | Constant Shift Embeddings                       | Recording                                        | Exercise 7, Solution 7
April 29    | Mean Field Approximation (Taught on Blackboard) | Recording                                        | Exercise 8, Solution 8
May 06      | Model Selection                                 | Recording                                        | No Exercise
May 13      | Approximate Sorting                             | Recording (Combined with Prof. Buhmann's lecture) | Exercise 10, Solution 10
Past Written Exams
2018 [Exam] [Solution]
2019 [Exam] [Solution]
2020 [Exam (with solution)]
Projects
Projects are coding exercises that concern the implementation of an algorithm taught in the lecture/exercise class.
There will be four coding exercises, each spanning approximately two weeks. Each is graded either as not passed or with a passing grade ranging from 4 to 6.
To be admitted to the exam, a student has to pass (i.e., obtain a grade of at least 4) 3 of the 4 projects. The final grade for the whole class is the weighted average 0.7 · exam + 0.3 · projects. The coding exercises will be provided and submitted via Moodle.
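As a concrete illustration of the weighting (the grades below are hypothetical, not actual results):

```python
def final_grade(exam_grade, project_grade):
    # Weighted average per the grading scheme: 0.7 * exam + 0.3 * projects.
    return round(0.7 * exam_grade + 0.3 * project_grade, 2)

# Hypothetical example: exam grade 5.0, average project grade 6.0.
print(final_grade(5.0, 6.0))  # prints 5.3
```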
Project Release Date | Project Due Date | Topic                     | Moodle Link
March 4              | March 18         | Sampling and Annealing    | Coding Exercise 1
March 25             | April 15         | Histogram Clustering      | Coding Exercise 2
April 22             | May 6            | Constant Shift Embeddings | Coding Exercise 3
May 13               | May 27           | Model Validation          | Coding Exercise 4
Other Resources
 Duda, Hart, Stork: Pattern Classification, Wiley Interscience, 2000.
 Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer, 2001.
 Devroye, Györfi, Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, 1996.
Web Acknowledgements
The webpage code is based (with modifications) on that of the Machine Learning course (Fall Semester 2013; Prof. A. Krause).