The Perceptron algorithm revisited
We discussed the basic perceptron algorithm and how it handles non-separable problems.
See the class notes of Avrim Blum here and the paper by Freund & Schapire, “Large Margin Classification Using the Perceptron Algorithm” (1998).
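As a reminder of the basic update rule, here is a minimal sketch of the perceptron. It assumes labels in {-1, +1} and no bias term; the function name, the toy data, and the `epochs` parameter are illustrative and not taken from the class notes.

```python
import numpy as np

def perceptron(X, y, epochs=10):
    """Basic perceptron for labels in {-1, +1}; returns weights and mistake count."""
    w = np.zeros(X.shape[1])
    mistakes = 0
    for _ in range(epochs):
        updated = False
        for x_i, y_i in zip(X, y):
            # A mistake (or a point exactly on the boundary) triggers an update.
            if y_i * np.dot(w, x_i) <= 0:
                w += y_i * x_i
                mistakes += 1
                updated = True
        if not updated:
            # A full pass without mistakes: the training data is separated.
            break
    return w, mistakes

# Hypothetical toy data, linearly separable through the origin.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, mistakes = perceptron(X, y)
```

On non-separable data the loop never reaches a mistake-free pass, so it simply stops after `epochs` passes; that is the setting addressed by the Freund & Schapire paper.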
Convex optimization
We discussed optimization problems, the Lagrangian, and primal and dual problems. We will give an example of an efficient solution of support vector machines (SVMs). Class notes can be downloaded here.
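For orientation, the primal and dual problems can be written out for one common SVM formulation. The sketch below assumes the soft-margin SVM without a bias term, with trade-off parameter C, slack variables \xi_i, and Lagrange multipliers \alpha_i, \beta_i; the class notes may use a different variant or notation.

```latex
% Primal problem
\min_{w,\,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i
\quad \text{s.t.} \quad y_i\, w \cdot x_i \ge 1 - \xi_i,\ \ \xi_i \ge 0 .

% Lagrangian, with multipliers \alpha_i \ge 0 and \beta_i \ge 0
L(w, \xi, \alpha, \beta) = \frac{1}{2}\|w\|^2 + C \sum_{i} \xi_i
  - \sum_{i} \alpha_i \left( y_i\, w \cdot x_i - 1 + \xi_i \right)
  - \sum_{i} \beta_i \xi_i .

% Stationarity conditions
\nabla_w L = 0 \ \Rightarrow\ w = \sum_{i} \alpha_i y_i x_i ,
\qquad
\frac{\partial L}{\partial \xi_i} = 0 \ \Rightarrow\ C - \alpha_i - \beta_i = 0 .

% Dual problem
\max_{\alpha}\ \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j\, x_i \cdot x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C .
```

The box constraint 0 \le \alpha_i \le C follows from \beta_i = C - \alpha_i \ge 0.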
SVMs and dual coordinate ascent
We presented the dual coordinate ascent algorithm (Hildreth’s algorithm) for solving the SVM. Class notes can be downloaded from here.
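The idea of dual coordinate ascent can be sketched as follows: one dual variable is optimized at a time in closed form, then clipped to its box constraint. The sketch assumes the soft-margin SVM dual without a bias term (so each alpha_i is only constrained to [0, C]); the function name, parameters, and toy data are illustrative and may differ from the class notes.

```python
import numpy as np

def svm_dual_coordinate_ascent(X, y, C=1.0, epochs=50):
    """Coordinate ascent on the SVM dual (no bias term), labels in {-1, +1}."""
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)  # maintained so that w = sum_i alpha_i * y_i * x_i
    sq_norms = np.einsum("ij,ij->i", X, X)
    for _ in range(epochs):
        for i in range(n):
            if sq_norms[i] == 0.0:
                continue  # a zero vector never changes w; skip it
            # Partial derivative of the dual objective with respect to alpha_i.
            grad = 1.0 - y[i] * np.dot(w, X[i])
            # Exact maximizer along coordinate i, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] + grad / sq_norms[i], 0.0), C)
            w += (new_alpha - alpha[i]) * y[i] * X[i]
            alpha[i] = new_alpha
    return w, alpha

# Hypothetical toy data, linearly separable through the origin.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, alpha = svm_dual_coordinate_ascent(X, y, C=1.0)
```

Keeping `w` in sync with `alpha` makes each coordinate step cost O(d) instead of requiring a pass over all examples.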
SVMs and stochastic sub-gradient descent
We presented the stochastic sub-gradient descent algorithm for the binary SVM. Class notes can be downloaded from here.
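A minimal sketch of stochastic sub-gradient descent for the binary SVM is given below. It assumes the regularized hinge-loss objective (lam/2)||w||^2 + mean_i max(0, 1 - y_i w.x_i), no bias term, and a Pegasos-style step size 1/(lam * t); the step-size schedule, names, and toy data are assumptions and may not match the class notes.

```python
import numpy as np

def svm_sgd(X, y, lam=0.1, n_iters=1000, seed=0):
    """Stochastic sub-gradient descent on the regularized hinge loss."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)           # pick one training example at random
        eta = 1.0 / (lam * t)         # assumed step-size schedule
        margin = y[i] * np.dot(w, X[i])
        # Sub-gradient of the regularizer shrinks w ...
        w *= (1.0 - eta * lam)
        # ... and the hinge term contributes only when the margin is violated.
        if margin < 1.0:
            w += eta * y[i] * X[i]
    return w

# Hypothetical toy data, linearly separable through the origin.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = svm_sgd(X, y)
predictions = np.sign(X @ w)
```

Each iteration touches a single example, so the cost per step does not depend on the size of the training set.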
Multiclass and multilabel
Class notes on multiclass classification can be downloaded from here. The definition of the multilabel problem and its surrogate loss is presented in Section 2 of (Crammer and Singer, 2003) and in the first half of Section 7 of (Crammer, Dekel, Keshet, Shalev-Shwartz, and Singer, 2006).
Online and batch learning
Search engines and learning to rank
The presentation can be downloaded from here.
Structured prediction
The first presentation on structured prediction can be downloaded from here.
See also the presentation of Klein and Taskar and the presentation of Gärtner and Vembu.
Conditional Random Fields (CRFs)
CRFs are an extension of logistic regression to structured output problems. A description of CRFs is given here.
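The relation to logistic regression can be made explicit. The sketch below uses a generic joint feature map \phi(x, y) and output set \mathcal{Y}(x); both symbols are assumptions introduced here for illustration and may differ from the linked description.

```latex
% Binary logistic regression, y \in \{-1, +1\}
p(y \mid x; w) = \frac{1}{1 + \exp(-y\, w \cdot x)} .

% CRF: the same exponential form over structured outputs y \in \mathcal{Y}(x)
p(y \mid x; w) = \frac{\exp\big( w \cdot \phi(x, y) \big)}{Z(x; w)} ,
\qquad
Z(x; w) = \sum_{y' \in \mathcal{Y}(x)} \exp\big( w \cdot \phi(x, y') \big) .

% Special case: \mathcal{Y} = \{-1, +1\} and \phi(x, y) = \tfrac{y}{2}\, x
p(y \mid x; w)
  = \frac{\exp\big( \tfrac{y}{2}\, w \cdot x \big)}
         {\exp\big( \tfrac{1}{2}\, w \cdot x \big) + \exp\big( -\tfrac{1}{2}\, w \cdot x \big)}
  = \frac{1}{1 + \exp(-y\, w \cdot x)} .
```

In the structured case the sum in Z(x; w) ranges over exponentially many outputs, which is why the feature map is usually chosen to decompose over parts of y.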
Structured prediction 2 and maximizing AUC (area under the ROC curve)
The second presentation on structured prediction can be downloaded from here.
The derivation of the AUC can be found here.
The derivation of Structured Probit can be found here.
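To make the AUC concrete, it can be computed as the fraction of (positive, negative) pairs that the scoring function ranks in the correct order, with ties counted as one half. The sketch below is only an illustration of that pairwise view; the function name and the toy scores are hypothetical, and it is not the derivation from the linked notes.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the fraction of correctly ordered (positive, negative) pairs."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]   # scores of positive examples
    neg = scores[labels != 1]   # scores of negative examples
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUC needs at least one positive and one negative example")
    # All pairwise score differences between positives and negatives.
    diff = pos[:, None] - neg[None, :]
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return correct / (len(pos) * len(neg))

# Hypothetical toy scores: one positive (0.35) is ranked below one negative (0.6).
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, -1, -1]
value = auc(scores, labels)  # 5 of the 6 pairs are ordered correctly
```

The pairwise form costs O(n_pos * n_neg) time and memory, which is fine for illustration but would be replaced by a sort-based computation on large data sets.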
Recommender systems
The presentation can be found here.
If you are interested, see also papers by Yehuda Koren.
Latent Dirichlet Allocation
The presentation can be found here.
2014a, 2014b, 2015a, and 2015b