Statistical Learning Theory, Spring Semester 2020

Instructors

Prof. Dr. Joachim M. Buhmann
Dr. Carlos Cotrini

Assistants

Paolo Penna
Evgenii Bykovetc
Joao Carvalho
Luca Corinzia
Alina Dubatovka
Ivan Ovinnikov
Chris Wendler

General Information

The ETHZ Course Catalogue information can be found here.

The course covers advanced methods of statistical learning. The fundamentals of machine learning, as presented in the courses "Introduction to Machine Learning" and "Advanced Machine Learning", are expanded on; in particular, the following topics are discussed:

  • Variational methods and optimization. We consider optimization approaches for problems whose solution is a probability distribution. We will discuss concepts such as maximum entropy, the information bottleneck, and deterministic annealing.
  • Clustering. This is the problem of sorting data into groups without using training samples. We discuss alternative notions of "similarity" between data points and adequate optimization procedures.
  • Model selection and validation. This refers to the question of how complex the chosen model should be. In particular, we present an information theoretic approach for model validation.
  • Statistical physics models. We discuss approaches for approximately optimizing large systems, which originate in statistical physics (free energy minimization applied to spin glasses and other models). We also study sampling methods based on these models.
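
To illustrate the annealing idea from the first topic, here is a minimal NumPy sketch of deterministic-annealing clustering (the function name, cooling schedule, and constants are illustrative choices, not taken from the course material): soft cluster assignments follow a Gibbs distribution whose inverse temperature beta is gradually increased, so that clusters split in a sequence of phase transitions.

```python
import numpy as np

def da_cluster(X, k, beta0=0.05, beta_max=200.0, rate=1.5, inner=25, seed=0):
    """Deterministic-annealing clustering: soft k-means with an
    annealed inverse temperature beta."""
    rng = np.random.default_rng(seed)
    # Start all centroids near the data mean; they split as beta grows.
    mu = X.mean(0) + 1e-3 * rng.standard_normal((k, X.shape[1]))
    beta = beta0
    while beta < beta_max:
        for _ in range(inner):
            d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)  # squared distances
            logits = -beta * d2
            logits -= logits.max(1, keepdims=True)                # numerical stability
            p = np.exp(logits)
            p /= p.sum(1, keepdims=True)                          # Gibbs assignments p(c|x)
            mu = (p.T @ X) / p.sum(0)[:, None]                    # centroid update
        mu += 1e-4 * rng.standard_normal(mu.shape)                # break cluster symmetry
        beta *= rate                                              # cooling schedule
    return mu, p
```

At low beta the assignments are nearly uniform and all centroids coincide at the data mean; as beta increases, centroids separate along the principal directions of the data, which is the phase-transition behaviour studied in the lecture.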

Time and Place

Type      | Time                 | Place
Lectures  | Mon 14-16, Tue 17-18 | HG G 3
Exercises | Mon 16-18            | HG G 3

Piazza website

link

Lecture Recordings

link

Material

Date   | Lecture | Exercise Class/Tutorial | Exercises (homework)
Feb 17 | Lecture 1 | Tutorial 1 | Exercise 1, Solution 1
Feb 24 | Lecture 2; Notes 1/2, Notes 2/2; Probability theory cheat sheet | Tutorial 2 | Exercise 2, Solution 2
Mar 2  | Lecture 3; Notes | Tutorial 3 | Exercise 3, Solution 3
Mar 9  | Lecture 4; Notes | Tutorial 4 | Exercise 4, Solution 4
Mar 16 | Lecture 5; Notes; Video Recording Monday; Video Recording Tuesday | Tutorial 5; Tutorial Video | Exercise 5, Solution 5
Mar 23 | Lecture 6; Video Recording Monday; Video Recording Tuesday | Tutorial 6; Tutorial Video | Exercise 6, Solution 6
Mar 30 | Lecture 7; Video Recording Monday; Video Recording Tuesday | Tutorial 7; Tutorial Video | Exercise 7, Solution 7
Apr 6  | Lecture 8; Video Recording Monday; Video Recording Tuesday | Tutorial 8 (and some older notes); Tutorial Video | Exercise 8, Solution 8
Apr 21 | No lecture on Monday; Lecture 9; Video Recording Tuesday | No tutorial this week | No exercise this week
Apr 27 | Video Recording Monday; Video Recording Tuesday | Tutorial 9; Tutorial Video | Exercise 9, Solution 9
May 4  | Zoom ID: 794 819 1159 (Tuesday: 916 6030 8334); Lecture 10; Video Recording Monday; Video Recording Tuesday | Zoom ID: 916 6030 8334; Tutorial 10; Tutorial Video | Exercise 10, Solution 10
May 11 | Zoom ID: 916 6030 8334; Lecture 11; Video Recording Monday; Video Recording Tuesday | Zoom ID: 958 1961 2500; Tutorial 11; Tutorial Video | AIC; BIC; Study of cluster validity indices; Package clusterCrit for R [pdf]
May 18 | Zoom ID: 916 6030 8334 (Tuesday: 794 819 1159); Lecture 12 (Monday); Lecture 12 (Tuesday); Video Recording Monday; Video Recording Tuesday | Zoom ID: 958 1961 2500; Tutorial 12 (and intro slides); Tutorial Video | Exercise 12, Solution 12
May 25 | Zoom ID: 794 819 1159; Lecture 13 (Monday); Lecture 13 (Tuesday); Video Recording Monday; Video Recording Tuesday | No tutorial this week | No exercise this week

Past Written Exams


Exam 2018
Draft Solution 2018
Exam 2019
Draft Solution 2019

Projects

Projects are small coding exercises in which students implement an algorithm taught in the lecture or exercise class.

There will be seven coding exercises, each open for two weeks. Each one will be graded either as not passed or with a passing grade ranging from 4 to 6. The project part is passed if the student receives a passing grade in at least four coding exercises; in that case, the grade of the project part is the average of the four best coding exercises.

To be admitted to the exam, the student has to pass the project part. The final grade for the course is the weighted average 0.7 exam + 0.3 project. More details can be found in the project repository (including the dates of the individual projects and instructions on how to submit solutions).
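
The grading rule above can be sketched in a few lines of Python (the function names and example grades are purely illustrative, not an official grading tool):

```python
def project_grade(grades):
    """Grade of the project part under the best-four-of-seven rule.

    Each entry is a passing grade in [4, 6], or None if the coding
    exercise was graded "not passed". Returns None if fewer than four
    exercises were passed (project part failed, no exam admission).
    """
    passed = sorted((g for g in grades if g is not None), reverse=True)
    if len(passed) < 4:
        return None
    return sum(passed[:4]) / 4


def final_grade(exam, project):
    """Final course grade: weighted average 0.7 * exam + 0.3 * project."""
    return 0.7 * exam + 0.3 * project


# Hypothetical grades for the seven coding exercises:
grades = [5.0, None, 4.5, 6.0, None, 5.5, 4.0]
print(project_grade(grades))                     # 5.25 (average of 6.0, 5.5, 5.0, 4.5)
print(final_grade(5.0, project_grade(grades)))   # 5.075
```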

Reading

  • Preliminary course script (ver. Mar 2019). This script has not been fully checked and thus comes without any guarantees; it is, however, useful for getting oriented in the material.
  • Duda, Hart, Stork: Pattern Classification, Wiley Interscience, 2000.
  • Hastie, Tibshirani, Friedman: The Elements of Statistical Learning, Springer, 2001.
  • Devroye, Gyorfi, Lugosi: A Probabilistic Theory of Pattern Recognition, Springer, New York, 1996.

Projects from the ISE group


Proposals

Web Acknowledgements

The web page code is based (with modifications) on that of the Machine Learning course (Fall Semester 2013; Prof. A. Krause).