Introduction to Data Exploration and Analysis with R
2019-07-15
Welcome to IDEAR
There are only two kinds of programming languages: those people always bitch about and those nobody uses.
— Bjarne Stroustrup
Welcome to Introduction to Data Exploration and Analysis with R (IDEAR)! This book is a crash course in coding with R and data analysis, built for people teaching themselves the skills needed for most analyst jobs today. You won’t need any past experience with R or data analytics; the book aims to work as a primer for people of all backgrounds.
This book is continuously deployed to bookdown.org and GitHub while editing continues, so that the small group of people using it to learn R can give me feedback and I can adapt the text as needed. If you’d like to help with this process, I’d love to hear from you: please feel free to reach out at mike.mahoney.218@gmail.com, or put in a pull request on GitHub. More information about me can be found at my website, which just so happens to have been built with R.
0.1 Book Outline
This book serves as an introduction to R for scientific and business applications, focusing specifically on exploratory data analysis, modeling techniques, data visualization, and communication of results. It requires no prior knowledge of programming, computer science, or statistics, though readers with experience in those fields may find R a little easier to learn.
The goal is to leave you with the essentials of working in R, as well as a grounding in thinking like a data analyst that will help you tackle more complicated problems. You won’t be an R maestro, and you won’t have developed domain-specific knowledge, but you’ll have built a strong foundation that will help you in all your data analysis tasks moving forward. To that end, we’ll focus primarily on the basic language skills required to perform those more complicated tasks, leaving the more niche and complex areas of the language to other texts.
To begin, we’ll introduce you to programming and the quirks of R, and how you can use the language to make data visualizations. These chapters include:
- Introduction to R
- Data Visualization
- R Basics and Workflow
We’ll then get into the data analysis workflow, stepping through each component of this process in turn. These chapters are:
- Data Wrangling
- Exploratory Data Analysis
- Modeling
- Achieving Graphical Excellence
In the later part of the course, we’ll shift our focus to skills that let you work in professional settings and larger groups, working more efficiently and communicating better via code. These chapters include:
- Functions and Scripting
- More Complicated Analyses
- Markdown and Clear Code
The end of the course then covers topics that I have found to be more specialized and, while important, not applicable to every project. This section includes the units:
- Working with Text
- Dates and Times
- Other Uses (What Next?)
The back matter then covers how to get help outside of this book, with links to useful resources and some frequently asked questions. The units in this section are:
- Basic Statistics Glossary
- Other Resources
- FAQ
- Changelog
0.2 Other Sources
If you find this book isn’t quite your style, I’d highly recommend Garrett Grolemund and Hadley Wickham’s R for Data Science, as well as Wickham’s Advanced R. Many other useful resources can be found in Chapter 15, at the end of this book.