ECE 901 – Spring 2014

Statistical Learning Theory

Prerequisite: Background in applied mathematics, probability, and statistics

Robert Nowak
3539 Engineering Hall
Office Hours: T-R 12:15-1pm

Meeting times:
Tuesday and Thursday 11am-12:15pm, room 2305 Engineering Hall

Course Format:
This course will serve as a primer for statistical learning theory as well as a platform
for exploring emerging theory and methods in large-scale data analytics. The course is
intended for PhD students in ECE, CS, Statistics and Mathematics who already have a strong
background in signal processing, machine learning, or mathematical statistics. The first
few weeks of the course will give an introduction to statistical learning theory (somewhat
following the lecture notes below). The rest of the course will involve reading and
discussion of a selection of papers. Each student will be responsible for reading all
papers and presenting a paper in the class.

Grading: Grades will be based on class participation, quizzes, homework assignments,
presentations and/or reports. Each student will be responsible for presenting one of the papers
that the class will study. Every student is required to carefully read the lecture notes and/or
papers prior to class lectures.


Topic 1: Regression

Topic 2: Concentration of Measure and Empirical Risk Minimization

Topic 3: Vapnik-Chervonenkis Theory and Binary Classification

Topic 4: Structural Risk Minimization

Topic 5: Minimax Lower Bounds

Topic 6: Bernstein’s Inequality and the Johnson-Lindenstrauss Lemma
section 2.8 and 2.9 in Concentration Inequalities: A Nonasymptotic Theory of Independence

Topic: Convex Losses and Rademacher Complexity

Topic: Logistic Regression and Lasso
Nikhll Rao, March 6
Nikhll Rao, March 11
March 13 TBA: Ravi Sastry

Topic: Compressed Sensing
Jing F and Shike M, March 25

Topic: Sparsity and Learning Graphs
Sumeet K and Atul D, March 27

Topic: Dictionary Learning
Qinyuan Sun and Shulei Wang, April 1, chapter 5

Topic: The SVD and Matrix Completion
Urvashi O and Vamsi K, April 3 chapter 4 of
Han Li and Karthik A, April 8

Topic: Clustering
Xin Zhang and Albert Oh, April 10
Daniel P and Rao Fu, April 15

Topic: Stochastic Gradient Methods and Large-scale Optimization
Xiaomao L and Ting-Ting, April 17
Willem M and Urvashi O, April 22

Topic: Sketching
Wentao Wu and Luwan Z, April 24 chapter 7 of
Song W and Won Kim, April 29

Topic: Multi-armed Bandits and Active Learning
Sumeet K and Atul D, May 1
Wentao W and Karthik R, May 6
Shike M and Ting-Ting, May 8

Lecture Notes from Previous Course Offerings

Lecture 1 A Probabilistic Approach to Pattern Recognition
Lecture 2 Introduction to Classification and Regression
Lecture 3 Introduction to Complexity Regularization
Lecture 4 Denoising in Smooth Function Spaces
Lecture 5 Plug-in Rules and Histogram Classifiers
Lecture 6 Probably Approximately Correct (PAC) Learning
Lecture 7 Chernoff’s Bound and Hoeffding’s Inequality
Lecture 8 Classification Error Bounds
Lecture 9 Error Bounds in Countably Infinite Models Spaces
Lecture 10 Complexity Regularization
Lecture 11 Decision Trees
Lecture 12 Complexity Regularization for Squared Error Loss
Lecture 13 Maximum Likelihood Estimation
Lecture 14 Maximum Likelihood and Complexity Regularization
Lecture 15 Denoising II: Adapting to Unknown Smoothness
Lecture 16 Wavelet Approximation Theory
Lecture 17 Denoising III: Spatial Adaptivity
Lecture 18 Introduction to VC Theory
Lecture 19 The VC Inequality
Lecture 20 Applications of VC Theory
Lecture 21 Minimax Lower Bounds

Textbooks and References:

A textbook will not be followed in this course. A collection of
notes, relevant papers and materials will be prepared and distributed.
Textbooks recommended for further reading are listed below.

A probabilistic theory of pattern recognition, Devroye, Gyorfi, Lugosi, Springer
Nonparameteric Estimation Theory, Iain Johnstone, unpublished monograph
The Elements of Statistical Learning, Hastie, et al, Springer
An introduction to support vector machines, Cristianini and Shawe-Taylor, Cambridge Press
Combinatorial methods in density estimation, Devroye and Lugosi, Springer
Statistical Learning Theory, Vapnik, Wiley
An Introduction to Computational Learning Theory, Kearns and Vazirani, MIT Press
Empirical Processes in M-Estimation, van de Geer, Cambridge Press

Grading and Evaluation:

Grades will be based on course participation, quizzes, and lecture presentations.