Readings

LEC # TOPICS READINGS
1 Introduction, linear classification, perceptron update rule
2 Perceptron convergence, generalization
3 Maximum margin classification

Optional

Cristianini, Nello, and John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge, UK: Cambridge University Press, 2000. ISBN: 9780521780193.

Burges, Christopher. "A Tutorial on Support Vector Machines for Pattern Recognition." Data Mining and Knowledge Discovery 2, no. 2 (June 1998): 121-167.

4 Classification errors, regularization, logistic regression
5 Linear regression, estimator bias and variance, active learning
6 Active learning (cont.), non-linear predictions, kernels
7 Kernel regression, kernels
8 Support vector machine (SVM) and kernels, kernel optimization

Short tutorial on Lagrange multipliers (PDF)

Optional

Stephen Boyd's course notes on convex optimization

Boyd, Stephen, and Lieven Vandenberghe. Convex Optimization. Cambridge, UK: Cambridge University Press, 2004. ISBN: 9780521833783.

9 Model selection
10 Model selection criteria
Midterm
11 Description length, feature selection
12 Combining classifiers, boosting
13 Boosting, margin, and complexity

Optional

Schapire, Robert. "A Brief Introduction to Boosting." Proceedings of the 16th International Joint Conference on Artificial Intelligence, 1999, pp. 1401-1406.

14 Margin and generalization, mixture models

Optional

Bartlett, Peter, Yoav Freund, Wee Sun Lee, and Robert E. Schapire. "Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods." Annals of Statistics 26, no. 5 (1998): 1651-1686.

15 Mixtures and the expectation maximization (EM) algorithm
16 EM, regularization, clustering
17 Clustering
18 Spectral clustering, Markov models

Optional

Shi, Jianbo, and Jitendra Malik. "Normalized Cuts and Image Segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence 22, no. 8 (2000): 888-905.

19 Hidden Markov models (HMMs)

Optional

Rabiner, Lawrence R. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proceedings of the IEEE 77, no. 2 (1989): 257-286.

20 HMMs (cont.)
21 Bayesian networks

Optional

Heckerman, David. "A Tutorial on Learning with Bayesian Networks." In Learning in Graphical Models. Edited by Michael I. Jordan. Cambridge, MA: MIT Press, 1998. ISBN: 9780262600323.

22 Learning Bayesian networks
23 Probabilistic inference; guest lecture on collaborative filtering
Final
24 Current problems in machine learning, wrap up

References

Bishop, Christopher. Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 9780198538646.

Duda, Richard, Peter Hart, and David Stork. Pattern Classification. 2nd ed. New York, NY: Wiley-Interscience, 2000. ISBN: 9780471056690.

Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York, NY: Springer, 2001. ISBN: 9780387952840.

MacKay, David. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press, 2003. ISBN: 9780521642989. Also available online.

Mitchell, Tom. Machine Learning. New York, NY: McGraw-Hill, 1997. ISBN: 9780070428072.

Cover, Thomas M., and Joy A. Thomas. Elements of Information Theory. New York, NY: Wiley-Interscience, 1991. ISBN: 9780471062592.