Chapter 27 Data science pedagogy
27.1 Introduction
Data science is a relatively new field, so it is not surprising that the best way to teach it is still evolving. Below are some resources for those who are sorting out how to teach this cross-silo discipline.
27.2 General
Joyce Cahoon, Things I Wish I Knew Before I Started Teaching
Advice for new lecturers:
Interview went well yesterday, thanks for all the messages of support. My favourite question… You are talking to a new lecturer and they are asking for advice. What are 3 pieces of advice you would give them about teaching? thread 1/4
— Dr Jenny Richmond, July 25, 2019
and a reply:
These are good. I would add:
- Not every lecture will go well. Learn from the bad ones but don’t obsess over them.
- Teaching is a gas and expands to fill available space. If you need to also do something else (like teaching or admin) protect that time even if … https://t.co/corUeemG1Q
— Dan Simpson, July 25, 2019
27.3 Data sources
Kim, A. Y, Ismay, C., & Chunn, J. (2018). The fivethirtyeight R Package: “Tame Data” Principles for Introductory Statistics and Data Science Courses. Technology Innovations in Statistics Education, 11(1). Retrieved from https://escholarship.org/uc/item/0rx1231m
See also (???)
27.4 Teaching R
Wisdom from Roger Peng:
You have to decide how to teach this topic…you have to decide how to present this material, and in what order. Are you teaching people how to analyze data, or are you teaching people a programming language?
Not So Standard Deviations, Episode 84 “All the Easy Issues” (discussion starting at ~39:45)
27.4.1 Textbooks, etc.
See (??? and other quantitative methods: courses and text books)
27.5 RStudio cloud
RStudio Class materials for Teach the Tidyverse, a Train-the-trainer workshop
(whole thread) This semester I have switched to teaching my #rstats course in Rstudio Cloud. It works like a charm and the students enjoy it too. There is no endless debugging individual computers & students can get right to the fun since I can make sure everything works in advance. pic.twitter.com/6VRYVQg2S7
— Fabio Votta📊🦉, January 8, 2019
27.6 More R-based teaching tools
27.7 Ideas on what to teach / learn
27.7.1 Plagiarize from these
Genomics in R https://carpentrieslab.github.io/genomics-r-intro/
Data Science in the Tidyverse (Amelia McNamara) https://github.com/AmeliaMN/data-science-in-tidyverse
Data science for economists:
I'm teaching a “data science for economists” course this semester. If you're interested in learning more about #rstats, Git(Hub), programming, databases, cloud computation, ML, etc., I'll be making all of my course material publicly available here: https://t.co/ApJo7Nuo7d pic.twitter.com/9gbMb2winV
— Grant McDermott, January 9, 2019
Nick Huntington-Klein, ECON 305: (2019)
Data Carpentry One Day Workshop
R Programming (on Coursera): https://www.coursera.org/learn/r-programming
27.7.2 This thread!
#rstats hive mind and especially (???) - what do you think is the most essential thing students should be taught on a one (! I know) day R and applied statistics class?
— Elizabeth Davis, August 3, 2019
Mine Cetinkaya-Rundel (2019-06-25) Let them eat cake (first)!
27.7.3 Course structure ideas
R learning sprint
See whole thread:
I sent him a quick list of things I wanted to cover:
-Set up the basic R workspace (eg R script vs R project)
-Read & write data (CSV, Excel, and Stata)
-Reshape
-Run simple tabs (means, variances, etc.)
-Working with strings and number formats
-Switch bw data sets
— Jon Schwabish, December 18, 2018
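Several of the topics in this list map onto a handful of base R calls. The following is a minimal sketch, not the workflow from the thread: the `scores` data frame and its column names are invented for illustration, and a temporary file stands in for a real CSV path. Excel and Stata files are left out because reading them needs add-on packages (readxl and haven are common choices).

```r
# Hypothetical toy data; the column names are invented for illustration.
scores <- data.frame(
  student = c("ana", "ben", "cy"),
  quiz1   = c(8, 6, 9),
  quiz2   = c(7, 9, 10)
)

# Read & write data: round-trip through a CSV file.
# A temporary file stands in for a real path.
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
scores_in <- read.csv(csv_path)

# Reshape: from wide (one column per quiz) to long
# (one row per student-quiz combination).
scores_long <- reshape(
  scores_in,
  direction = "long",
  varying   = c("quiz1", "quiz2"),
  v.names   = "score",
  timevar   = "quiz",
  times     = c("quiz1", "quiz2"),
  idvar     = "student"
)

# Simple tabs: mean and variance of the score, by quiz.
quiz_means <- tapply(scores_long$score, scores_long$quiz, mean)
quiz_vars  <- tapply(scores_long$score, scores_long$quiz, var)

# Strings and number formats: upper-case the names and
# format one mean with two decimal places.
scores_long$student <- toupper(scores_long$student)
formatted_mean <- sprintf("%.2f", as.numeric(quiz_means["quiz1"]))
```

Both `scores_in` and `scores_long` stay in memory at the same time, which is one way to show learners coming from Stata that R does not require switching between data sets.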
27.7.4 Next
If you were learning the basics of R for the first time, what would you want to know in the first two hours?
— Colin J. Carlson, January 8, 2019
replies:
Basic vocab stuff. What are “functions”, “objects”, “values”, “vectors”, “environments”, etc. What does it mean to “assign”? Really basic things.
— Nelson Stauffer, January 8, 2019
how to read in data, how to graph the data, how to find help to do whatever I want to do in R next, how to trouble shoot error messages
— Auriel Fournier, January 8, 2019
Assuming I had no coding experience at all… how to “see” and change your R objects. Things like head, str,… and how to create and manipulate dataframes, matrices, etc.
— Dr Maureen Berg, microboiologist, January 8, 2019
Assuming they're coming from non-coding or excel background, give instant success. Read a CSV and make a bar graph, then change it to a beeswarm plot, then add color and style and make it interactive with a quick ggplotly.
— William Chase, January 8, 2019
For beginners, I suggest importing/exporting multiple file types, data inspection/cleaning (head, class, level, removing empties/NAN, removing, sorting, & naming rows and columns), data structures, line code for removing stored info (variables & graphics), & SUBSET!!!
— Sarah PhillipsGarcia, January 8, 2019
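Several of these replies (looking at objects, cleaning, subsetting, and a quick first graph) fit into one short session. The sketch below is only one possible arrangement: the `survey` data frame and its column names are made up for illustration, and it uses base R's `barplot()` rather than the ggplot2 and plotly route mentioned in one reply, so that it runs without add-on packages.

```r
# Hypothetical survey data with one entirely empty row.
survey <- data.frame(
  site  = c("north", "south", NA, "north", "south"),
  count = c(12, 7, NA, 5, 9)
)

# "See" the object: first rows, structure, and the class of a column.
head(survey)
str(survey)
class(survey$count)

# Clean: drop rows in which every value is missing,
# rename a column, and sort the rows.
survey_clean <- survey[rowSums(is.na(survey)) < ncol(survey), ]
names(survey_clean)[names(survey_clean) == "count"] <- "n_birds"
survey_clean <- survey_clean[order(survey_clean$n_birds), ]

# Subset: keep only the rows for one site.
north <- subset(survey_clean, site == "north")

# Instant success: totals per site, drawn as a bar graph.
totals <- tapply(survey_clean$n_birds, survey_clean$site, sum)
barplot(totals, main = "Total count by site", ylab = "Count")
```

When an error message appears, `?subset` or `help("barplot")` opens the relevant help page, which covers the "how to find help" point raised above.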
27.8 Other topics
ECONOMY, SOCIETY, AND PUBLIC POLICY (MPA 612, BYU)
-30-