2.1 Why R?

  • Free and open source (think of science in developing countries)
  • Good online documentation
  • Lively community of users (forums etc.)
  • Pioneering role
  • Visualization capabilities
  • Intuitive
  • Cooperates with other programs
  • Used across a wide variety of disciplines
  • Object-oriented programming language
  • Popularity (see popularity statistics on books, blogs, forums)
  • RStudio as powerful integrated development environment (IDE) for R
    • Evolves into a scientific work suite optimizing workflow (replication, reproducibility etc.)
  • Institutions/people (Gary King, Andrew Gelman etc.)
  • Economic power (Revolution Analytics, Microsoft R Open)
  • Python is the only real competitor, and it can be used from R (e.g. via the reticulate package)

Notes

The seminar consists of a mix of theoretical and applied sessions. For the applied sessions we will rely on the software R. While there are various programs one could use, the reasons listed above speak for R (my personal view). The only real contender for data science is Python. See here for a nice overview of the differences between the two.