Chapter 2 This thing called “data science”
2.1 Theory
David Donoho (2015) 50 Years of Data Science, based on a presentation at the Tukey Centennial workshop, Princeton NJ Sept 18 2015.
Reprinted in Journal of Computational and Graphical Statistics, Volume 26, No. 4 (2017), together with a variety of discussion papers and responses, including:
- Jenny Bryan and Hadley Wickham, “Data Science: A Three Ring Circus or a Big Tent?” (https://arxiv.org/ftp/arxiv/papers/1712/1712.07349.pdf) {discussion of Donoho, 50 Years of Data Science}
Hadley Wickham (2018) Readings in Applied Data Science, course materials for Stanford Stats337 (spring 2018)
Iain Carmichael and J.S. Marron (2018) “Data science vs. statistics: two cultures?” (Carmichael and Marron 2018)
David Robinson (2018-01-09) “What’s the difference between data science, machine learning, and artificial intelligence?”
Data Science Renee (@BecomingDataSci) (2016-09-06) “Let's try that again”, pic.twitter.com/1G4wHvGvdd
Mango Solutions (2018-08-15) “Demystifying Data Science Terminology”
Martin Monkman (2019-06-02) “Same name, different bird”
2.2 Philosophy
Angela Bassa (2017) Data Alone Isn’t Ground Truth … and why you should always carry a healthy dose of skepticism in your back pocket
Tim Davies and Mark Frank (2013) ‘There’s no such thing as raw data’. Exploring the sociotechnical life of a government dataset, conference paper from Web Science 2013, France (02 - 04 May 2013)
Bertrand Russell, “The Social Responsibilities of Scientists” (Russell 1960)
2.3 Using R for Data Science
Hadley Wickham & Garrett Grolemund (2016) R for Data Science (Wickham and Grolemund 2016)
Roger Peng, R Programming for Data Science (Peng 2018)
- Roger Peng’s other books on LeanPub
Chester Ismay and Albert Y. Kim, 2019-02-24, Modern Dive: Statistical Inference via Data Science (A moderndive into R and the tidyverse) (Ismay and Kim 2019) (was An Introduction to Statistical and Data Sciences via R)
JD Long and Paul Teetor, 2019-09-26, R Cookbook, 2nd Edition
Chester Ismay and Patrick C. Kennedy, 2018-05-23, Getting used to R, RStudio, and R Markdown
Gordon Shotwell, 2019-12-30, “Why I use R: They said the war was over…” (https://blog.shotwell.ca/posts/why_i_use_r/), a well-articulated explanation of why R is the best tool for data science
2.3.1 Using R for Data Journalism
.Rddj: Hand-curated, high quality resources for doing data journalism with R
R for Journalists at scoop.it
2.4 The Practice of Data Science & Statistics
Data science terminology – University of British Columbia, Master of Data Science program
Steph de Silva and John Ormerod, The Bayesian and The Frequentist {blog}
Hadley Wickham, Stats 337: Readings in Applied Data Science – reading list for Stanford University course, Spring 2018.
2.4.1 Data Science and Public Policy
Using Big Data to Solve Economic and Social Problems – course at The Equality of Opportunity Project
2.4.2 Data science careers
Jonny Brooks-Bartlett (2018-03-28) Here’s why so many data scientists are leaving their jobs – a splash of cold-water realism
Nate Oostendorp (2019-03-01) “Radical Change Is Coming To Data Science Jobs”, forbes.com
2.5 The skills of data science
Of course, how you set out to learn data science hinges on how you define data science. A typology based on data users might be helpful; knowing what sort of data scientist you are will shape what you might want to learn.
2.5.1 Data science leadership
Thomas H. Davenport and Jeanne G. Harris, Competing on Analytics: The New Science of Winning, Harvard Business School Press, January 2007. (Davenport and Harris 2007)
- Thomas H. Davenport, “Competing on Analytics”, Harvard Business Review, January 2006.
Angela Bassa, “Managing a Data Science Team”, Harvard Business Review, 2018-10-24
Executive Data Science (Coursera)
Building a Data Science Team (Coursera)
2.5.2 Business analytics & business intelligence
Sahil Arora, “Top Data Analytics Skills Required to Become a Data Analyst”, Digital Vidya, 2017-03-24
Samantha Leonard, “6 Must-Have Skills For Data Analysts”, Northeastern University, 2018-08-31
Jay Gendron (2016) Introduction to R for Business Intelligence
2.5.3 Data science
Data Science, Johns Hopkins University via Coursera
Mango Solutions’ R training is structured by user proficiency.
Chris Engelhardt, data_sci_guide – A wealth of data science learning resources. “The overarching goal here is to provide anyone interested in learning data science with a wealth of open source, industry-best learning materials and learning tracks.”