Chapter 2 This thing called “data science”

2.1 Theory

David Donoho (2015) 50 Years of Data Science, based on a presentation at the Tukey Centennial workshop, Princeton NJ Sept 18 2015.

Hadley Wickham (2018) Readings in Applied Data Science, course materials for Stanford Stats 337 (spring 2018)

Iain Carmichael and J.S. Marron (2018) “Data science vs. statistics: two cultures?” (Carmichael and Marron 2018)

David Robinson (2018-01-09) “What’s the difference between data science, machine learning, and artificial intelligence?”

Mango Solutions (2018-08-15) “Demystifying Data Science Terminology”

Martin Monkman (2019-06-02) “Same name, different bird”

2.2 Philosophy

Angela Bassa (2017) Data Alone Isn’t Ground Truth … and why you should always carry a healthy dose of skepticism in your back pocket

Tim Davies and Mark Frank (2013) ‘There’s no such thing as raw data’. Exploring the sociotechnical life of a government dataset, conference paper from Web Science 2013, France (02 - 04 May 2013)

Bertrand Russell, “The Social Responsibilities of Scientists” (Russell 1960)

2.3 Using R for Data Science

Hadley Wickham & Garrett Grolemund (2016) R for Data Science (Wickham and Grolemund 2016)

Roger Peng, R Programming for Data Science (Peng 2018)

Chester Ismay and Albert Y. Kim, 2019-02-24, Modern Dive: Statistical Inference via Data Science (A moderndive into R and the tidyverse) (Ismay and Kim 2019) (formerly An Introduction to Statistical and Data Sciences via R)

JD Long and Paul Teetor, 2019-09-26, R Cookbook, 2nd Edition

Chester Ismay and Patrick C. Kennedy, 2018-05-23, Getting used to R, RStudio, and R Markdown

Gordon Shotwell, 2019-12-30, “Why I use R: They said the war was over…” – a well-articulated explication of why R is the best tool for data science

2.4 The Practice of Data Science & Statistics

Data science terminology – University of British Columbia, Master of Data Science program

Steph de Silva and John Ormerod, The Bayesian and The Frequentist (blog)

Hadley Wickham, Stats 337: Readings in Applied Data Science – reading list for Stanford University course, Spring 2018.

2.4.1 Data Science and Public Policy

Using Big Data to Solve Economic and Social Problems – course at The Equality of Opportunity Project

2.4.2 Data science careers

Jonny Brooks-Bartlett (2018-03-28) Here’s why so many data scientists are leaving their jobs – a cold splash of realism

Nate Oostendorp (2019-03-01) “Radical Change Is Coming To Data Science Jobs”

2.5 The skills of data science

How you set out to learn data science hinges on how you define it. A typology based on data users can help: knowing what sort of data scientist you are will shape what you want to learn.

2.5.1 Data science leadership

Thomas H. Davenport and Jeanne G. Harris, Competing on Analytics: The New Science of Winning, Harvard Business School Press, January 2007. (Davenport and Harris 2007)

Angela Bassa, “Managing a Data Science Team”, Harvard Business Review, 2018-10-24

Executive Data Science (Coursera)

Building a Data Science Team (Coursera)

2.5.2 Business analytics & business intelligence

Sahil Arora, “Top Data Analytics Skills Required to Become a Data Analyst”, Digital Vidya, 2017-03-24

Samantha Leonard, “6 Must-Have Skills For Data Analysts”, Northeastern University, 2018-08-31

Jay Gendron (2016) Introduction to R for Business Intelligence

2.5.3 Data science

Data Science, Johns Hopkins University via Coursera

Mango Solutions’ R training is structured by user proficiency.

Chris Engelhardt, data_sci_guide – A wealth of data science learning resources. “The overarching goal here is to provide anyone interested in learning data science with a wealth of open source, industry-best learning materials and learning tracks.”


Carmichael, Iain, and J. S. Marron. 2018. “Data Science Vs. Statistics: Two Cultures?” Japanese Journal of Statistics and Data Science 1 (1).

Davenport, Thomas H., and Jeanne G. Harris. 2007. Competing on Analytics: The New Science of Winning. Harvard Business School Press.

Ismay, Chester, and Albert Y. Kim. 2019. Modern Dive: Statistical Inference via Data Science (a Moderndive into R and the Tidyverse). self-published.

Peng, Roger D. 2018. R Programming for Data Science.

Russell, Bertrand. 1960. “The Social Responsibilities of Scientists.” Science 131 (3398): 391–92.

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science. O’Reilly Media.