Preface

This is a course on using the tidyverse packages.

The tidyverse provides a complete suite of modern data-handling tools. It is an essential toolbox for any data scientist using R.

The tidyverse package is designed to be easy to install: installing it once also installs all of its member packages.
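As a minimal sketch of what installation looks like, assuming a working internet connection and a configured CRAN mirror, the commands below install the package only if it is missing, attach it, and list its member packages. The `requireNamespace()` guard is an addition for convenience; running `install.packages("tidyverse")` directly also works.

```r
# Install the tidyverse from CRAN only if it is not already available.
# Assumption: a CRAN mirror is configured for this R session.
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Attach the core tidyverse packages for the current session.
library(tidyverse)

# List every package that belongs to the tidyverse.
member_packages <- tidyverse_packages()
print(member_packages)
```

Note that `library(tidyverse)` attaches only the core packages; other members are installed but must be attached individually, for example with `library(modelr)`.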

0.1 Prerequisites

This course dives straight into using the tidyverse. It assumes you have already installed R and RStudio and have some familiarity with how to use RStudio.

0.2 How This Book Is Organized

This book uses the datasets in the nycflights13 package.

This package contains information about all flights that departed from NYC in 2013: 336,776 flights with 16 variables. To help understand what causes delays, it also includes a number of other useful datasets: weather, planes, airports, and airlines. (Source: Bureau of Transportation Statistics)
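A quick way to confirm what the package provides is to load it and inspect the tables. The sketch below assumes nycflights13 has already been installed with `install.packages("nycflights13")`, since it is not part of the tidyverse itself. The number of columns reported for `flights` may differ from the figure quoted above, depending on the installed version of the package.

```r
library(nycflights13)

# Rows and columns of the main flights table.
flights_dim <- dim(flights)
print(flights_dim)

# Names of the variables recorded for each flight.
print(names(flights))

# Row counts of the companion tables that help explain delays.
companion_rows <- sapply(
  list(weather = weather, planes = planes,
       airports = airports, airlines = airlines),
  nrow
)
print(companion_rows)
```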

Each chapter covers one of the tidyverse packages that you are likely to use in almost every analysis:

dplyr, for data manipulation.

ggplot2, for data visualisation.

tibble, for tibbles, a modern re-imagining of data frames.

readr, for data import.

tidyr, for data tidying.

stringr, for strings.

forcats, for factors.

lubridate, for date/times.

magrittr, for chaining commands.

purrr, for functional programming.

modelr, for simple modelling within a pipeline.
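To give a flavour of how these packages fit together, here is a small illustrative sketch that chains dplyr verbs with the pipe on the `flights` table. It assumes dplyr and nycflights13 are installed; the result name `delay_by_carrier` and the choice of summary are only examples, not a recipe taken from later chapters.

```r
library(dplyr)
library(nycflights13)

# Average departure delay per carrier, ignoring flights with no
# recorded departure delay (for example, cancelled flights).
delay_by_carrier <- flights %>%
  filter(!is.na(dep_delay)) %>%
  group_by(carrier) %>%
  summarise(
    mean_dep_delay = mean(dep_delay),
    n_flights = n()
  ) %>%
  arrange(desc(mean_dep_delay))

print(delay_by_carrier)
```

Each step takes a data frame and returns a data frame, which is what lets the verbs be chained in this way.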