NOTE: This version of the book is no longer updated, and will be taken down in the next month or so. The new version may be found at this link.
1 Welcome to IDEAR
1.1 The State of the Book
1.2 Book Outline
1.3 Other Sources
2 Introduction to R
2.1 Why Does This Book Exist?
2.2 What is R?
2.3 What is coding?
2.4 Conventions of the book
2.5 Things You’ll Need
2.6 Introduction to RStudio
2.7 Your First Program
2.8 The iris Dataset
2.9 Graphing with R
2.10 Exercises
2.10.1 Calculate the following:
3 Visualizing Your Data
3.1 What is a Visualization?
3.2 The Tidyverse Package
3.3 ggplot2
3.3.1 Functions in ggplot
3.3.2 Changing Aesthetics
3.3.3 Facetting
3.4 Diamonds
3.4.1 Visualizing Large Datasets
3.4.2 Axis Transformations
3.5 Other Popular Geoms
3.5.1 Histograms
3.5.2 Bar Charts
3.5.3 Jittered Points
3.5.4 Boxplot
3.6 Designing Good Graphics
3.7 Saving Your Graphics
3.8 More Resources
3.9 Exercises
3.9.1 Graph the following:
3.9.2 Use a new dataset:
3.9.3 Looking ahead:
4 R Functions and Workflow
4.1 Workflow
4.1.1 Scripts
4.1.2 Notebooks
4.2 Memory, Objects, and Names
4.3 Dataframes
4.4 Oddballs
4.5 RStudio Tips and Tricks
4.6 R Functions and Workflow Exercises
4.6.1 Do the following:
5 Data Wrangling
5.1 Thinking with Data
5.2 The Data Analytics Model
5.3 Wrangle
5.3.1 Tidy Data
5.3.2 Tidying Data
5.3.3 Separating Values
5.4 The Pipe
5.5 Data Transformations
5.5.1 Mutate
5.5.2 Tibbles
5.5.3 Subsetting Data
5.5.4 Filtering with the Tidyverse
5.5.5 Working with Groups
5.6 Missing Values
5.6.1 Explicit Missing Values
5.6.2 Implicit Missing Values
5.7 Count Data
5.7.1 Work with other datasets:
6 Introduction to Data Analysis
6.1 Exploratory Data Analysis
6.1.1 Sidenote
6.1.2 The EDA Framework
6.2 gapminder
6.3 Summarizing Data
6.3.1 Sidenote
6.4 Visualizing Data
6.5 Analyzing Patterns
6.6 Exercises
7 Modeling Data
7.1 Why Model?
7.2 Linear Models
7.3 Model Predictions
7.4 Classification
7.5 Logistic Models
7.6 Evaluating and Comparing Models
7.6.1 Confusion Matrices
7.7 Conclusion
7.8 Exercises
8 Achieving Graphical Excellence
8.1 Introduction
8.2 Getting Started
8.3 Themes
8.4 Colors
8.4.1 Viridis
8.4.2 Color Brewer
8.4.3 Other Packages
8.4.4 Making Your Own
8.5 Labels
8.6 Animation
8.7 Specialized Visualizations
8.7.1 Stacked Area Plots
8.7.2 ggridges
8.7.3 Maps
8.7.4 Circular Charts
8.8 Rearranging Groups
8.9 Further Reading
9 Functions and Scripting
9.1 Writing Functions
9.1.1 Our First Function
9.1.2 Returns
9.1.3 More Complicated Functions
9.2 About Names…
9.3 Conditional Statements
9.4 Stops
9.5 Function Dependencies
9.6 Saving and Loading Functions
9.7 Loops
9.8 Mapping Functions
9.9 More Information
9.10 Exercises
10 More Complicated Analyses
10.1 Other Datasets
10.1.1 Importing Your Own Data
10.1.2 Exporting Data
10.1.3 Data Exploration
10.1.4 Modeling Winners
10.2 Logistic Models
10.3 Modelling Metrics
10.3.1 Pseudo-R²
10.3.2 Area Under the ROC Curve (AUC)
10.3.3 Model Comparisons
10.4 More Complicated Analyses
10.4.1 Model Selection
10.5 Relational Data
10.5.1 Inner Join
10.5.2 Left Join
10.5.3 Right Join
10.5.4 Full Join
10.5.5 Semi Join
10.5.6 Anti Join
10.5.7 Specifying Key Columns
10.5.8 Merging Multiple Dataframes
10.5.9 Binding Dataframes
10.6 Exercises
11 Playing Nicely With Others
11.1 R Markdown
11.1.1 Kable
11.2 LaTeX
11.3 Git(Hub)
11.3.1 My First Repository
11.4 Commenting Code
11.5 Further Reading
12 Working with Text
12.1 Working with Stringr
12.2 Regular Expressions
12.3 Case Study
12.4 Further Reading
12.5 Exercises
13 Working with Dates and Times
13.1 Dates in R
13.2 Converting To Dates
13.3 Extracting From Dates
13.4 Math with Dates
13.5 Time Zones
14 What Next
14.1 Machine Learning Methods
14.2 Leaflet Maps
14.3 FlexDashboard
14.4 Bookdown
14.5 Blogdown
14.6 Shiny
15 Basic Statistics (Using R)
15.1 Purpose of the Unit
15.2 Definitions
15.2.1 Data Concepts
15.2.2 Statistical Terms
15.2.3 Models and Tests
15.2.4 How We’ll Compare Models
15.3 Exercises
16 Other Resources
Infographics
Courses
Textbooks
Blog Links
Data Sources
Graphing Aids
17 Frequently Asked Questions
17.1 Why R?
17.2 Why is my code broken?
17.3 Why is (X package) named that?
18 Changelog
18.1 Version 1.1.0
18.2 Version 1.0.1
18.3 Version 1.0.0
Introduction to Data Exploration and Analysis with R
Michael Mahoney
2019-06-27