Chapter 1 What and why?


This chapter introduces some key concepts of data science and provides an overview of the contents of this book. The focus is not primarily on technology and tools (R, RStudio, R packages), but on the underlying concepts.

Key concepts and issues

Issues and questions addressed in this chapter:

  • Basic terminology:
    • What is data (types and shapes)?
    • What is science?
    • What is data science?
  • Which skills?
    • Relation to statistics?
    • Relation to computer programming?
  • Which tools?
    • The ecological rationality of tools
    • Using R Markdown for reproducible research

Important concepts introduced in this chapter include the terms representation and ecological rationality.


Before you read on, please take some time to reflect upon the following questions:

i2ds: Preflexions

  • Try defining the term data. How does it relate to information?

  • What are key characteristics of science?

  • What is data science?

  • Which skills does a data scientist need?

  • What are characteristics of a useful tool?

  • Which tools are you currently using for reading, writing, or calculating? Why these and not others?

  • What is reproducible research?

  • What kind of tool(s) would help us adhere to its principles?

Please take some notes on your answers. After finishing this chapter, Exercise 1.5.4 will ask you to summarize them in an R Markdown document.