6.15 Assessing Model Accuracy
- There are various measures of model accuracy; which one is appropriate depends on the type of outcome
- If the outcome is binary (a classification problem), we can use the measures below (see the sketch after this list)
- Training error rate: the proportion of mistakes made when we apply our estimate to the training observations
- \(\frac{1}{n}\sum_{i=1}^{n}I(y_{i}\neq\hat{y}_{i})\): fraction of incorrect classifications
- \(\hat{y}_{i}\): predicted class label for observation \(i\)
- \(I(y_{i}\neq\hat{y}_{i})\): indicator variable that equals 1 if \(y_{i}\neq\hat{y}_{i}\) and 0 if \(y_{i}=\hat{y}_{i}\)
- If \(I(y_{i}\neq\hat{y}_{i})=0\), then the \(i\)th observation was classified correctly (otherwise it was misclassified)
- Test error rate: associated with a set of test observations of the form \((x_{0},y_{0})\)
- \(Ave(I(y_{0}\neq\hat{y}_{0}))\): the average of the indicator variables over the test observations
- \(\hat{y}_{0}\): predicted class label that results from applying the classifier to the test observation with predictor \(x_{0}\)
- Good classifier: one for which the test error rate is smallest
- The complement of the error rate is the Correct Classification Rate (CCR): CCR = 1 − error rate
    - The fraction of observations that were correctly classified
- Source: James et al. (2013 Chap. 2.2.3)
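A minimal sketch of these quantities in Python (not from the book; the toy labels and variable names are illustrative assumptions). Taking `np.mean` over the boolean array `y_true != y_pred` averages the indicator variables \(I(y_{i}\neq\hat{y}_{i})\) directly.

```python
import numpy as np

def error_rate(y_true, y_pred):
    """Fraction of misclassified observations: (1/n) * sum_i I(y_i != yhat_i)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return np.mean(y_true != y_pred)

# Toy binary labels (illustrative only, not from James et al. 2013)
y_train     = np.array([0, 1, 1, 0, 1, 0, 0, 1])
y_train_hat = np.array([0, 1, 0, 0, 1, 0, 1, 1])  # two mistakes
y_test      = np.array([1, 0, 1, 1, 0])
y_test_hat  = np.array([1, 0, 0, 1, 0])            # one mistake

train_err = error_rate(y_train, y_train_hat)  # 2/8 = 0.25
test_err  = error_rate(y_test, y_test_hat)    # 1/5 = 0.20
ccr_test  = 1 - test_err                      # correct classification rate = 0.80

print(f"training error rate: {train_err:.2f}")
print(f"test error rate:     {test_err:.2f}")
print(f"test CCR:            {ccr_test:.2f}")
```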
References
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. Springer.