Machine learning is inevitably a technical subject that draws on topics not always covered in the undergraduate CS curriculum.
The main goal of the proficiency exam at PiLab Sigma is to demonstrate that you have a sufficient understanding of the state of the art in machine learning, and the basic mathematical background, to pursue PhD work.
You will be assigned a jury of 4 faculty members.
The exam has three parts:
Written exam, take-home (about 5-7 days); an example is provided
Written exam, in class (about 3-4 hours)
Oral exam (1-2 hours)
Below is a list of topics that you must be familiar with; that is, you should be able to explain what each term means and have some hands-on experience with each one. Short illustrative code sketches follow some of the topic groups below.
Foundations: AL, BI
Probability distributions, Entropy, Expectation
Bayes Rule, conditional distributions
Bayesian model comparison
Statistics: Sampling, estimation, hypothesis testing
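To make the Foundations items concrete, here is a minimal NumPy sketch (all numbers are made up for illustration) of expectation, entropy, and Bayes' rule on discrete distributions; it is only a quick self-check, not part of the required material itself:

```python
import numpy as np

# Discrete distribution over three outcomes (illustrative numbers only)
p = np.array([0.5, 0.3, 0.2])
x = np.array([1.0, 2.0, 3.0])            # values taken by a random variable X

expectation = np.sum(p * x)              # E[X]
entropy = -np.sum(p * np.log2(p))        # H(X) in bits

# Bayes' rule: P(disease | positive test), with assumed rates
prior = 0.01                             # P(disease)
sensitivity = 0.95                       # P(positive | disease)
false_positive = 0.05                    # P(positive | no disease)
evidence = sensitivity * prior + false_positive * (1 - prior)   # P(positive)
posterior = sensitivity * prior / evidence

print(f"E[X] = {expectation:.2f}, H(X) = {entropy:.3f} bits")
print(f"P(disease | positive) = {posterior:.3f}")
```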
Models: AL, BI
Mixture models / k-means
Factor Analysis / PCA
Matrix Factorisation models (ICA, NMF)
Hidden Markov models (HMMs)
State space models (SSMs)
Graphical models: directed, undirected, factor graphs
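As an example of the kind of hands-on familiarity expected for the Models group, here is a bare-bones k-means (Lloyd's algorithm) sketch in NumPy on toy two-blob data; it omits refinements such as smart initialisation and empty-cluster handling:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centre assignment and mean updates."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random initial centres
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centre
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centre becomes the mean of its assigned points
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Toy data: two well-separated blobs in 2D
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
centers, labels = kmeans(X, k=2)
print(centers)
```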
Algorithms: AL, BI, R&N
Forward-backward
Kalman filtering, smoothing and extended Kalman filtering
Belief propagation
The EM Algorithm
Variational methods
Laplace approximation and the BIC
Monte Carlo, Rejection and Importance sampling
Markov chain Monte Carlo (MCMC) methods: Metropolis Hastings, Gibbs sampler
Sequential Monte Carlo, Particle filters
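For the sampling-based algorithms above, a minimal random-walk Metropolis-Hastings sketch in NumPy, targeting a standard normal purely as a test case (the step size and sample count are arbitrary illustrative choices):

```python
import numpy as np

def metropolis_hastings(log_target, n_samples=10_000, step=1.0, seed=0):
    """Random-walk Metropolis: propose x' = x + step*eps, accept with prob min(1, p(x')/p(x))."""
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + step * rng.normal()
        # The Gaussian proposal is symmetric, so the acceptance ratio is just the target ratio
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return np.array(samples)

# Target: a standard normal, specified only up to its normalising constant
log_target = lambda x: -0.5 * x**2
samples = metropolis_hastings(log_target)
print(samples[1000:].mean(), samples[1000:].std())   # should be close to 0 and 1
```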
Supervised Learning: AL, BI
Linear regression
Logistic regression
Generalised Linear Models
Perceptrons
Neural networks (multi-layer perceptrons) and backpropagation
Gaussian processes
Support vector machines
Decision trees
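As a small supervised-learning example, here is logistic regression fitted by batch gradient descent on the average negative log-likelihood, written in plain NumPy on made-up 1D data (no regularisation or convergence checks, just the core update):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on the average negative log-likelihood of logistic regression."""
    X = np.hstack([np.ones((len(X), 1)), X])      # prepend a bias column
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ w)                        # predicted P(y = 1 | x)
        grad = X.T @ (p - y) / len(y)             # gradient of the average NLL
        w -= lr * grad
    return w

# Made-up 1D data: class 1 tends to have larger x
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-1, 1, 50), rng.normal(1, 1, 50)]).reshape(-1, 1)
y = np.concatenate([np.zeros(50), np.ones(50)])
w = fit_logistic(X, y)
print("bias, weight:", w)
```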
Optimization: BI, B99
Linear Programming
Convex functions, Jensen's inequality
Gradient Descent
Newton's method
Constrained optimization, Lagrange multipliers
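A tiny sketch contrasting gradient descent and Newton's method on a smooth one-dimensional problem; the function, step size, and iteration counts are arbitrary choices for illustration:

```python
import numpy as np

# Minimise f(x) = exp(x) - 2x; the optimum is at x* = ln 2, where f'(x*) = 0
f_grad = lambda x: np.exp(x) - 2.0
f_hess = lambda x: np.exp(x)

# Gradient descent: a fixed step along the negative gradient
x = 0.0
for _ in range(100):
    x -= 0.1 * f_grad(x)
print("gradient descent:", x)

# Newton's method: step by -f'(x)/f''(x); converges quadratically near the optimum
x = 0.0
for _ in range(6):
    x -= f_grad(x) / f_hess(x)
print("Newton's method: ", x, "(ln 2 =", np.log(2.0), ")")
```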
Numerical Analysis and Linear Algebra: T&B, NR3
Interpolation
Fourier Transform
Numerical Integration, Gaussian quadrature
Matrix algebra and calculus
Least Squares
QR factorisation
Eigenvalues and Eigenvectors
Singular value decomposition
Numerical stability and floating point representation
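For the numerical linear algebra items, a short NumPy sketch solving a least squares problem via the QR factorisation and cross-checking against numpy.linalg.lstsq; the data are synthetic, and the generic solve on R stands in for a dedicated triangular back-substitution:

```python
import numpy as np

# Least squares via QR: minimise ||Ax - b|| by factorising A = QR and solving R x = Q^T b
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.01 * rng.normal(size=100)     # noisy synthetic observations

Q, R = np.linalg.qr(A)                           # reduced QR factorisation
x_qr = np.linalg.solve(R, Q.T @ b)               # R is upper triangular, so this system is cheap

# Cross-check against NumPy's SVD-based least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_qr, x_lstsq)
```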
Stochastic Optimal Control and Reinforcement Learning (Specialisation): AL, R&N, B05
Value functions
Bellman's equation
Value iteration
Policy iteration
Q-Learning
TD(lambda)
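Finally, for the reinforcement learning specialisation, a value iteration sketch on a made-up two-state, two-action MDP; the transition and reward numbers are purely illustrative, and the Bellman optimality backup is written directly in array form:

```python
import numpy as np

# A made-up 2-state, 2-action MDP: P[a, s, s'] is the transition probability,
# R[a, s] the expected immediate reward, gamma the discount factor
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],    # action 0: rows are current states, columns next states
    [[0.5, 0.5], [0.0, 1.0]],    # action 1
])
R = np.array([[1.0, 0.0],        # rewards for action 0 in states 0, 1
              [2.0, -1.0]])      # rewards for action 1 in states 0, 1
gamma = 0.9

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup: V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s'|s, a) V(s') ]
    Q = R + gamma * (P @ V)      # Q[a, s]
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=0)        # greedy policy with respect to V*
print("V* =", V, "greedy policy =", policy)
```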
(AL) Alpaydin, Ethem (2010). Introduction to Machine Learning (2nd ed.). MIT Press.
(DB) Barber, David (2012). Bayesian Reasoning and Machine Learning. Cambridge University Press.
(KM) Murphy, Kevin (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
(BI) Bishop, Christopher (2006). Pattern Recognition and Machine Learning. Springer.
(R&N) Russell, Stuart and Norvig, Peter (2001). Artificial Intelligence: A Modern Approach. Prentice Hall.
(T&B) Trefethen, Lloyd N. and Bau, David (1996). Numerical Linear Algebra. SIAM.
(NR3) Press, Teukolsky, Vetterling and Flannery (2007). Numerical Recipes: The Art of Scientific Computing (3rd ed.). Cambridge University Press.
(B99) Bertsekas, Dimitri P. (1999). Nonlinear Programming (2nd ed.). Athena Scientific.
(B05) Bertsekas, Dimitri P. (2005). Dynamic Programming and Optimal Control, Vol. 1. Athena Scientific.
In addition to the above references, if you want to improve yourself in the field, below are some books and topics that are good for self-study.
I personally think that everyone in machine learning should be (completely) familiar with essentially all of the material in the following intermediate-level statistics book:
1.) Casella, G. and Berger, R.L. (2001). “Statistical Inference” Duxbury Press.
For a slightly more advanced book that's quite clear on mathematical techniques, the following book is quite good:
2.) Ferguson, T. (1996). “A Course in Large Sample Theory” Chapman & Hall/CRC.
You'll need to learn something about asymptotics at some point, and a good starting place is:
3.) Lehmann, E. (2004). “Elements of Large-Sample Theory” Springer.
Those are all frequentist books. You should also read something Bayesian:
4.) Gelman, A. et al. (2003). “Bayesian Data Analysis” Chapman & Hall/CRC.
And you should start to read about Bayesian computation:
5.) Robert, C. and Casella, G. (2005). “Monte Carlo Statistical Methods” Springer.
On the probability front, a good intermediate text is:
6.) Grimmett, G. and Stirzaker, D. (2001). “Probability and Random Processes” Oxford.
At a more advanced level, a very good text is the following:
7.) Pollard, D. (2001). “A User's Guide to Measure Theoretic Probability” Cambridge.
The standard advanced textbook is Durrett, R. (2005). “Probability: Theory and Examples” Duxbury.
Machine learning research also reposes on optimization theory. A good starting book on linear optimization that will prepare you for convex optimization:
8.) Bertsimas, D. and Tsitsiklis, J. (1997). “Introduction to Linear Optimization” Athena.
And then you can graduate to:
9.) Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization” Cambridge.
Getting a full understanding of algorithmic linear algebra is also important. At some point you should feel comfortable with most of the material in
10.) Golub, G., and Van Loan, C. (1996). “Matrix Computations” Johns Hopkins.
It's good to know some information theory. The classic is:
11.) Cover, T. and Thomas, J. “Elements of Information Theory” Wiley.
Finally, if you want to start to learn some more abstract math, you might want to start to learn some functional analysis (if you haven't already). Functional analysis is essentially linear algebra in infinite dimensions, and it's necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics. Here's a book that I find very readable:
12.) Kreyszig, E. (1989). “Introductory Functional Analysis with Applications” Wiley.
I now tend to add some books that dig still further into foundational topics. In particular, I recommend
A. Tsybakov's book “Introduction to Nonparametric Estimation”, a very readable source for the tools for obtaining lower bounds on estimators,
Y. Nesterov's very readable “Introductory Lectures on Convex Optimization”, as a way to start to understand lower bounds in optimization, and
A. van der Vaart's “Asymptotic Statistics”, a book that we often teach from at Berkeley, which shows how many ideas in inference (M-estimation, which includes maximum likelihood and empirical risk minimization; the bootstrap; semiparametrics; etc.) repose on top of empirical process theory.
I'd also include B. Efron's “Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction”, as a thought-provoking book.