Chapter 27 Data science pedagogy

27.1 Introduction

Because data science is a relatively new field, it is not surprising that the best way to teach it is still evolving. Below are some resources for those working out how to teach this cross-silo discipline.

27.2 General

Joyce Cahoon, Things I Wish I Knew Before I Started Teaching

Advice for new lecturers:

and a reply:


27.3 Data sources

Kim, A. Y., Ismay, C., & Chunn, J. (2018). The fivethirtyeight R Package: “Tame Data” Principles for Introductory Statistics and Data Science Courses. Technology Innovations in Statistics Education, 11(1). Retrieved from https://escholarship.org/uc/item/0rx1231m

See also (???)


27.4 Teaching R

Wisdom from Roger Peng:

You have to decide how to teach this topic…you have to decide how to present this material, and in what order. Are you teaching people how to analyze data, or are you teaching people a programming language?

Not So Standard Deviations, Episode 84 “All the Easy Issues” (discussion starting at ~39:45)

27.4.1 Textbooks, etc.

See (??? and other quantitative methods: courses and text books)


27.5 RStudio Cloud

RStudio Class materials for Teach the Tidyverse, a Train-the-trainer workshop

(whole thread)

27.7 Ideas on what to teach / learn

27.7.1 Plagiarize from these

Genomics in R https://carpentrieslab.github.io/genomics-r-intro/

Data Science in the Tidyverse (Amelia McNamara) https://github.com/AmeliaMN/data-science-in-tidyverse

Data science for economists:

Nick Huntington-Klein, ECON 305: (2019)

https://medium.com/airbnb-engineering/empowering-data-science-with-data-engineering-education-ef2acabd3042

Data Carpentry One Day Workshop

R Programming (on Coursera): https://www.coursera.org/learn/r-programming

27.7.2 This thread!

Mine Cetinkaya-Rundel (2019-06-25) Let them eat cake (first)!

27.7.3 Course structure ideas

R learning sprint

See whole thread:

27.8 Other topics

Economy, Society, and Public Policy (MPA 612, BYU)

-30-
