Chapter 11 Model Evaluation

This week, our goals are to…

  1. Continue to develop your final project plans.

  2. Using peer review, practice giving and receiving evaluations of predictive analytics work and proposals.

  3. Recognize the capabilities and limitations of selected model evaluation metrics, including the basics of how these metrics are calculated.

11.1 Evaluation metrics

The readings and videos below cover a variety of metrics for evaluating the results and utility of a machine learning model. Some of these metrics we have already used in this class, while others are new.

Start by reading the following:

  1. Erickson, B. J., & Kitamura, F. (2021). Magician’s corner: 9. Performance metrics for machine learning models. Radiology: Artificial Intelligence, 3(3), e200126. https://doi.org/10.1148/ryai.2021200126. PDF: https://pubs.rsna.org/doi/epdf/10.1148/ryai.2021200126.

Continue by watching the following:

  1. Starmer, J. (2018). Machine Learning Fundamentals: The Confusion Matrix. StatQuest. https://www.youtube.com/watch?v=Kdsp6soqA7o.

  2. Starmer, J. (2019). Machine Learning Fundamentals: Sensitivity and Specificity. StatQuest. https://www.youtube.com/watch?v=vP06aMoz4v8.

  3. Data Science Dojo. Introduction to Precision, Recall and F1 | Classification Models. https://www.youtube.com/watch?v=jJ7ff7Gcq34.

  4. Dragonfly Statistics. (2013). Statistics : The F Score. https://www.youtube.com/watch?v=fcO9820wCXE.

  5. Starmer, J. (2019). ROC and AUC, Clearly Explained! StatQuest. https://www.youtube.com/watch?v=4jRBRDbJemM.

  6. Starmer, J. Machine Learning Fundamentals: Cross Validation. StatQuest. https://www.youtube.com/watch?v=fSytzGwwBVw.
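Most of the metrics covered above (sensitivity, specificity, precision, F1) are simple ratios computed from the four cells of a confusion matrix. As a quick reference alongside the videos, here is a minimal sketch in plain Python; the counts are made-up illustrative numbers, not results from any real model:

```python
# Deriving common evaluation metrics from a 2x2 confusion matrix.
# Counts below are made-up illustrative numbers.
tp, fp = 80, 10   # true positives, false positives
fn, tn = 20, 90   # false negatives, true negatives

sensitivity = tp / (tp + fn)   # recall / true positive rate
specificity = tn / (tn + fp)   # true negative rate
precision   = tp / (tp + fp)   # positive predictive value
f1 = 2 * precision * sensitivity / (precision + sensitivity)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}  "
      f"precision={precision:.2f}  F1={f1:.2f}  accuracy={accuracy:.2f}")
```

Note that F1 is the harmonic mean of precision and sensitivity (recall), so it ignores the true negative count entirely; that is why it behaves differently from accuracy on imbalanced data.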

11.2 Assignment

This week’s assignment involves two parts:

  1. Draft abstract – submit in D2L only

  2. Discussion response – submit in D2L only

Note that this assignment previously mentioned a peer review component. There is NOT going to be a peer review part of this assignment.

11.2.1 Draft abstract

In this part of the assignment, please imagine that it is one year in the future: you successfully finished your final project in this class, continued working on the project, and published it in a peer-reviewed academic journal. That publication has an abstract at the top. What does that abstract say?

Note that published articles are arguably not the most important outcome and use of machine learning models (compared to other practical applications), but since this is a PhD course, it is worthwhile for us to also deliberately practice a more formal research skill like writing an abstract in the context of machine learning projects. The abstract you submit might be shared with the rest of the class and/or peer-reviewed later in the class.

Task 1: Write an abstract based on your final project. More details are below.

Additional details:

  • Submit the abstract in D2L to the dropbox called “Week 11 project abstract.”
  • Your abstract should contain the following sections: purpose, methods, results, conclusions.
  • Remember that we are pretending that you have already completed your project and written a publication-quality manuscript about it. Even if your final project in this class is only a proposal, pretend that you went on to do the whole project, ran all of the analytics, and got results that you can report in the abstract.
  • Since you likely do not know the results of your final project yet, just write “X” in place of any numbers that you do not yet know. Or you can make up a number that you think is reasonable. For example: “The best predictive model has a sensitivity of X, which is inadequate for practical use.”
  • You might find it useful to look at the abstracts of scholarly articles that we have read earlier in this course, for examples and ideas about how to write.
  • Do not write more than one page.

11.2.2 Discussion post

Prepare a response to the discussion tasks below and submit it in D2L.

Task 2: Give an example of an analytics scenario in which prioritizing high sensitivity would be most important.

Task 3: Give an example of an analytics scenario in which prioritizing high specificity would be most important.

Task 4: Give an example of an analytics scenario in which prioritizing high F1 score would be most important.

Task 5: What are the benefits of using cross validation? Please include at least one example of one of the benefits that you describe.

Task 6: Even though it’s a pretty smart approach, what are some of the limitations of cross validation? Please include at least one example of one of the limitations that you describe.
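As background for Tasks 5 and 6, it may help to see the mechanics of k-fold cross validation. The sketch below, written in plain Python with no libraries assumed, generates the train/test index splits for k folds so that every observation is used exactly once for testing; `kfold_indices` is a hypothetical helper name, not a standard function:

```python
# Sketch of k-fold cross validation splitting, in plain Python.
# Each observation appears in exactly one test fold; the remaining
# observations form that fold's training set.
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    # Distribute any remainder across the first (n_samples % k) folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

# Example: 10 observations, 5 folds of 2 observations each.
for train_idx, test_idx in kfold_indices(10, 5):
    print("train:", train_idx, "test:", test_idx)
```

In practice you would fit a model on each training split, evaluate it on the matching test split, and average the k scores; libraries such as scikit-learn provide this workflow directly.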

You have reached the end of this week’s assignment. Please be sure to upload your draft abstract and post your discussion response in D2L. Also, don’t forget to complete your weekly flashcard requirement in the Adaptive Learner App!