Course Description

Machine learning algorithms are data analysis methods that search data sets for patterns and characteristic structures. Typical tasks are the classification of data, automatic regression, and unsupervised model fitting. Machine learning emerged mainly from computer science and artificial intelligence, and draws on methods from a variety of related subjects, including statistics, applied mathematics, and more specialized fields such as pattern recognition and neural computation. Applications include image and speech analysis, medical imaging, bioinformatics, and exploratory data analysis in the natural sciences and engineering:


[Figure: Non-linear decision boundary of a trained support vector machine (SVM) using a radial-basis-function kernel.]
[Figure: Fisher's linear discriminant analysis (LDA) of four different auditory scenes: speech, speech in noise, noise, and music.]
[Figure: Gene expression levels obtained from a micro-array experiment, used in gene function prediction.]
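To give a flavour of the second figure's method, here is a minimal NumPy sketch of two-class Fisher LDA on synthetic data. The toy Gaussian clouds stand in for the auditory-scene features; nothing below is taken from the course material.

```python
# Minimal sketch of Fisher's linear discriminant analysis (LDA) for two
# classes, on synthetic (illustrative) data.
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal([0, 0], 0.5, size=(100, 2))   # class 1 samples
X2 = rng.normal([2, 2], 0.5, size=(100, 2))   # class 2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter matrix: sum of the per-class scatter matrices.
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
# Fisher's direction maximises between-class over within-class scatter:
# w is proportional to Sw^{-1} (m1 - m2).
w = np.linalg.solve(Sw, m1 - m2)

# Project onto w and classify with the midpoint of the projected class means.
threshold = ((X1 @ w).mean() + (X2 @ w).mean()) / 2
```

On these well-separated clouds, nearly all class-1 projections fall on one side of the threshold and nearly all class-2 projections on the other.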

Other related courses offered at the D-INFK include: Computational Intelligence Lab, Probabilistic Artificial Intelligence, Advanced Topics in Machine Learning, Information Retrieval, Data Mining.

Announcements

Syllabus

Calendar Week | Lecture Topics | Lecture Slides | Tutorial Slides | Exercise Sheets & Solutions | Material
38 | Introduction | Course Information, Philosophical Motivation, Slides Lecture 1 | | Exercise Sheet 1, Solution 1 |
39 | Introduction / Data Types / Regression | Slides Lecture 2.1, Slides Lecture 2.2 | Slides Tutorial 1, jupyter notebook | Exercise Sheet 2 (updated), Solution 2 |
40 | Regression | continuation of last week's slides | Slides Tutorial 2 | Exercise Sheet 3 (UPD v3), Solution 3 |
41 | Gaussian Processes | Slides Lecture 4; Annotated PDFs: Wed @ 1pm, Wed @ 3pm, Thu @ 3pm, Fri @ 1pm | Python Demo: [ipynb] or [web] | Exercise Sheet 4, Solution 4 | A proposal from 1955
42 | Density Estimation in Regression: Parametric Models | Slides Lecture 5 | Slides Tutorial 4, jupyter notebook | Exercise Sheet 5, Solution 5 |
43 | Numerical Estimation Techniques | Slides Lecture 6 | Slides Tutorial 5 | Exercise Sheet 6, Solution 6 | Hastie-Tibshirani-Friedman: The Elements of Statistical Learning
44 | Introduction to Classification | Slides Lecture 7 | Slides Tutorial 6, jupyter notebook | Exercise Sheet 7, Data problem 1, Solution 7, jupyter notebook |
45 | Linear Discriminant Functions | Slides Lecture 8 | Slides Tutorial 7 (updated) | Exercise Sheet 8 (updated), Solution 8 |
46 | Support Vector Machines | Slides Lecture 9 | Slides Tutorial 8 | Exercise Sheet 9, ex9_skeleton, graphSVM, Solution Sheet 9, Solution Script 9 |
47 | Kernels | Slides Lecture 10 | Slides Tutorial 9 | Exercise Sheet 10, Solution Sheet 10 |
48 | Structural SVMs, Ensemble Learners | Slides Lecture 11.1, Slides Lecture 11.2, Slides Random Forests | Kernel Regression [ipynb], Kernel SVM [ipynb], Unbalanced SVM [ipynb], Slides Tutorial 10 | Exercise Sheet 11, Solution Sheet 11 [corrected] |
49 | Neural Networks | Slides Lecture 12, ConvNets | Slides Tutorial 11; Random Forests, imbalanced data [py] | Exercise Sheet 12, Solution Sheet 12 |
50 | Unsupervised Learning | Slides Lecture 13 | Slides Tutorial 12 | Exercise Sheet 13, Solution Sheet 13 |
51 | Mixture Models | Slides Lecture 14 | Slides Tutorial 13; Notebooks: kernel density estimation, k-nearest neighbours, k-means, GMMs, histograms [ipynb] | |
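Among the week-51 notebook topics is k-means clustering. As an illustration of the idea, here is a minimal NumPy sketch of Lloyd's algorithm on synthetic data; the function and toy data below are illustrative, not taken from the course notebooks.

```python
# Minimal sketch of Lloyd's k-means algorithm (illustrative, not course code).
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Two well-separated blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

Lloyd's algorithm converges only to a local optimum of the within-cluster scatter, so in practice one typically runs several random restarts and keeps the best solution.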

Some of the material can only be accessed with a valid nethz account.

Video recordings of the lectures

General Information

Course Information (as presented in Lecture 1)

VVZ Information

See here.

Times and Places

Lectures
Time | Room | Remarks
Thu 14-15 | ML D 28 |
          | ML E 12 | live video stream
Fri 08-10 | HG F 1 |
          | HG F 3 | live video stream

Tutorials
Time | Room | Remarks
Wed 13-15 | CAB G 61 | Surnames A-F
Wed 15-17 | CAB G 61 | Surnames G-K
Thu 15-17 | CAB G 51 | Surnames L-R
Fri 13-15 | CAB G 61 | Surnames S-Z

The first tutorial sessions take place in the second week of the semester. All tutorial sessions are identical; please attend the session assigned to you based on the first letter of your last name.

Exercises

The exercise sheets contain theoretical pen-and-paper assignments. Submitting solutions is not mandatory, and a Testat is not required in order to take the exam. Solutions to the exercise problems will be published on this website.

If you choose to submit solutions:

Projects

In order to complete the course, students have to participate in three practical projects. The goal of these projects is to get hands-on experience in machine learning tasks. See the Project repository for further information (log in using your nethz credentials).

Exam

There will be a written exam of 180 minutes. The language of examination is English. As a written aid, you may bring two A4 pages (i.e., one A4 sheet of paper), either handwritten or printed in a font size of at least 11 points. The grade obtained in the written exam constitutes 70% of the total grade.

Piazza

To account for the scale of this course, we answer questions regarding lectures, exercises, and projects on Piazza. To allow for an optimal flow of information, please ask your content-related questions on this platform (rather than via direct email) and label them accordingly (e.g., by indicating which lecture or project your question refers to). This way, your question and our answer are visible to everyone, so please read existing question-answer pairs before asking new questions.

Use the sign-up link to sign up for Piazza.

Text Books

C. Bishop. Pattern Recognition and Machine Learning. Springer 2007.
This is an excellent introduction to machine learning that covers most topics which will be treated in the lecture. Contains lots of exercises, some with exemplary solutions. Available from ETH-HDB and ETH-INFK libraries.

R. Duda, P. Hart, and D. Stork. Pattern Classification. John Wiley & Sons, second edition, 2001.
The classic introduction to the field. An early edition is available online for students attending this class; the second edition is available from ETH-BIB and ETH-INFK libraries.

T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer, 2001.
Another comprehensive text, written by three Stanford statisticians. Covers additive models and boosting in great detail. Available from ETH-BIB and ETH-INFK libraries. A free PDF version is available.

L. Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer, 2004.
This book is a compact treatment of statistics that facilitates a deeper understanding of machine learning methods. Available from ETH-BIB and ETH-INFK libraries.

D. Barber. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
This book is a compact and extensive treatment of most topics. Available for personal use online: Link.

K. Murphy. Machine Learning: A Probabilistic Perspective. MIT, 2012.
Unified probabilistic introduction to machine learning. Available from ETH-BIB and ETH-INFK libraries.

S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
This recent book covers the mathematical foundations of machine learning. Available for personal use online: Link.

Previous Exams

Contact

Please ask questions related to the course using Piazza, not via email.

Instructor: Prof. Joachim M. Buhmann
Head Assistants: Rebekka Burkholz, Luis Haug
Teaching Assistants: Leonard Adolphs, An Bian, Luca Corinzia, Hadi Daneshmand, Natalie Davidson, Alina Dubatovka, Viktor Gal, Benjamin Gallusser, Alex Gronskiy, Xinrui Lyu, Djordje Miladinovic, Harun Mustafa, Aytunc Sahin, Viktor Wegmayr, Sebastian Weichwald