Chapter 10 Support Vector Machines
The support vector machine (SVM) is a classification model¹ that maps observations as points in space so that the categories are separated by as wide a gap as possible. New observations can then be mapped into that same space for prediction. The SVM algorithm finds the optimal separating hyperplane by using a nonlinear mapping to a sufficiently high dimension. The hyperplane is defined by the observations that lie on or within a margin whose width is controlled by a cost hyperparameter. These observations are called the support vectors.
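The idea can be sketched with a small example. This is an illustrative fit using scikit-learn (a library choice of this sketch, not prescribed by the chapter); the `C` argument plays the role of the cost hyperparameter described above, and the RBF kernel supplies the nonlinear mapping.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two roughly separable clusters of 2-D points (synthetic data for
# illustration only).
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(2.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# C is the cost hyperparameter: a small C widens the margin (admitting more
# support vectors), a large C narrows it. kernel="rbf" maps the observations
# nonlinearly into a higher-dimensional space.
clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# The fitted model keeps only the support vectors -- the observations on or
# inside the margin that define the separating hyperplane.
print("number of support vectors:", clf.support_vectors_.shape[0])
print("training accuracy:", clf.score(X, y))
```

Note that the rest of the training observations could be deleted without changing the fitted boundary; only the support vectors matter, which is what gives the method its name.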
The SVM is an extension of the support vector classifier, which in turn is a generalization of the simple and intuitive maximal margin classifier. The best way to understand the SVM is to start with the maximal margin classifier and work up.
¹ Other classification models include LDA, logistic regression, and tree-based models such as bagging, random forests, and gradient boosting.