RMRWR
1
Preface
1.1
Who This Book is For
1.2
Prerequisites
1.3
The (Upward) Spiral of Success Structure
1.4
Motivation for this Book
1.5
The Scientific Reproducibility Crisis
1.6
Features of a Bookdown electronic book
1.7
What this Book is Not
1.7.1
This Book is Not A Statistics Text
1.7.2
This Book Does Not Provide Comprehensive Coverage of the R Universe
1.8
Some Guideposts
1.9
Helpful Tools
1.9.1
Demonstrations in Flipbooks
1.9.2
Learnr Coding Exercises
1.9.3
Coding
2
Getting Started and Installing Your Tools
2.1
Goals for this Chapter
2.2
Website links needed for this Chapter
2.3
Pathway for this Chapter
2.4
Installing R on your Computer
2.5
Windows-Specific Steps for Installing R
2.5.1
Testing R on Windows
2.6
Mac-specific Installation of R
2.6.1
Testing R on the Mac
2.6.2
Successful testing!
2.7
Installing RStudio on your Computer
2.7.1
Windows Install of RStudio
2.7.2
Testing Windows RStudio
2.7.3
Installing RStudio on the Mac
2.7.4
Testing the Mac Installation of RStudio
2.7.5
Critical Setup - Tuning Up Your RStudio Installation
2.8
Installing Git on your Computer
2.8.1
Installing Git on macOS
2.8.2
Installing Git on Windows
2.8.3
Installing Git on Linux
2.9
Getting Acquainted with the RStudio IDE
3
A Tasting Menu of R
3.1
Setting the Table
3.2
Goals for this Chapter
3.3
Packages needed for this Chapter
3.4
Website links needed for this Chapter
3.5
Setting up RPubs
3.6
Open a New Rmarkdown document
3.7
Knitting your Rmarkdown document
3.7.1
Installing Packages
3.7.2
Loading Packages with library()
3.8
Your Turn to Write Text
3.9
Wrangle Your Data
3.10
Summarize Your Data
3.11
Visualize Your Data
3.12
Statistical Testing of Differences
3.13
Publish your work to RPubs
3.14
The Dessert Cart
3.14.1
Interactive Plots
3.14.2
Animated Graphics
3.14.3
A Clinical Trial Dashboard
3.14.4
A Shiny App
3.14.5
An Example of Synergy in the R Community
4
Introduction to Reproducibility
4.1
First Steps to Research Reproducibility
4.1.1
Have a Plan
4.1.2
Treat Your Raw Data Like Gold
4.1.3
Cleaning and Analyzing Your Data
4.1.4
The First Level of Reproducibility
4.1.5
The Second Level of Reproducibility
5
Importing Your Data into R
5.1
Reading data with the {readr} package
5.1.1
Test yourself on scurvy
5.1.2
What is a path?
5.1.3
Try it Yourself
5.2
Reading Excel Files with readxl
5.2.1
Test yourself on read_excel()
5.3
Bringing in data from other Statistical Programs (SAS, Stata, SPSS) with the {haven} package
5.4
Other strange file types with rio
5.5
Data exploration with glimpse, str, and head/tail
5.5.1
Taking a glimpse with glimpse()
5.5.2
Try this out yourself.
5.5.3
Test yourself on strep_tb
5.5.4
Examining Structure with str()
5.5.5
Test yourself on the scurvy dataset
5.5.6
Examining a bit of data with head() and tail()
5.5.7
Test yourself on printing tibbles
5.6
More exploration with skimr and DataExplorer
5.6.1
Test yourself on the skim() results
5.6.2
Test yourself on the create_report() results
5.7
Practice loading data from multiple file types
5.8
Practice saving (writing to disk) data objects in formats including csv, rds, xls, xlsx and statistical program formats
5.9
How do readr and readxl parse columns?
5.10
What are the variable types?
5.11
Controlling Parsing
5.12
Chapter Challenges
5.13
Future forms of data ingestion
6
Wrangling Rows in R with Filter
6.1
Goals for this Chapter
6.2
Packages needed for this Chapter
6.3
Pathway for this Chapter
6.4
Logical Statements in R
6.5
Filtering on Numbers - Starting with A Flipbook
6.5.1
Your Turn - learnr exercises
6.6
Filtering on Multiple Criteria with Boolean Logic
6.6.1
Your Turn - learnr exercises
6.7
Filtering Strings
6.7.1
Your Turn - learnr exercises
6.8
Filtering Dates
6.8.1
Your Turn - learnr exercises
6.9
Filtering Out or Identifying Missing Data
6.9.1
Working with Missing data
6.9.2
Your Turn - learnr exercises
6.10
Filtering Out Duplicate observations
6.11
Slicing Data by Row
6.12
Randomly Sampling Your Rows
6.12.1
Your Turn - learnr exercises
6.13
Further Challenges
6.14
Explore More about Filtering
7
Wrangling Columns in R with Select, Rename, and Relocate
7.1
Goals for this Chapter
7.2
Packages needed for this Chapter
7.3
Pathway for this Chapter
7.4
Tidyselect Helpers in R
7.5
Selecting Column Variables
7.5.1
Try this out
7.6
Selecting Columns that are Not Contiguous
7.7
Selecting Columns With Logical Operators
7.8
Further Challenges
7.9
Explore More about Filtering
8
Using Mutate to Make New Variables (Columns)
8.1
Calculating BMI
8.2
Recoding categorical or ordinal data
8.3
Calculating Glomerular Filtration Rate
9
Mutating Joins to Combine Data Sources
9.1
What are Joins?
9.2
What are Mutating Joins?
9.3
Let’s Start with Left Joins
9.4
Left Join in Action
9.5
Left Join in Practice
9.6
Quick Quiz
9.7
Problem variable names
9.8
Right Join in Action
9.9
Right Join in Practice
9.10
Inner Joins
9.11
Quick Quiz
9.12
Now Let’s take a Look at the result
9.13
Full Joins
9.14
Quick Quiz
9.15
Now Let’s take a Look at the result
10
Interpreting Error Messages
10.1
The Common Errors Table
10.2
Examples of Common Errors and How to fix them
10.2.1
Missing Parenthesis
10.2.2
An Extra Parenthesis
10.2.3
Missing pipe %>% in a data wrangling pipeline
10.2.4
Missing + in a ggplot pipeline
10.2.5
Pipe %>% in Place of a +
10.2.6
Missing Comma Within a Function()
10.2.7
A Missing Object
10.2.8
One Equals Sign When you Need Two
10.2.9
Non-numeric argument to a binary operator
10.3
Errors Beyond This List
10.4
When Things Get Weird
10.4.1
Restart your R Session (Shift-Cmd-F10)
10.5
References:
11
The Building Blocks of R: data types, data structures, functions, and packages.
11.1
Data Types
11.2
Data Structures
11.3
Examining Data Types and Data Structures
11.4
Functions
11.5
Packages
11.6
The Building Blocks of R
12
Tips for Hashtag Debugging your Pipes and GGPlots
12.1
Debugging
12.2
The Quick Screen
12.3
Systematic Hunting For Bugs in Pipes
12.4
Systematic Hunting For Bugs in Plots
12.5
Hashtag Debugging
12.6
Pipe 2
12.7
Plot 2
12.8
Plot 3
12.9
Pipe 3
13
Finding Help in R
13.1
Programming in R
13.2
Starting with Help!
13.3
The Magic of Vignettes
13.4
Googling the Error Message
13.5
You Know What You Want to Do, but Don’t Know What Package or Function to Use
13.5.1
CRAN Task Views
13.5.2
Google is Your Friend
13.6
Seeking Advanced Help with a Minimal REPREX
14
The Basics of Base R
14.1
Dimensions of Data Rectangles
14.2
Naming columns
14.3
Concatenation
14.4
Sequences
14.5
Constants
14.6
Fancier Sequences
14.7
Mathematical functions
14.8
Handling missing data (NAs)
14.9
Cutting Continuous data into Levels
15
Updating R, RStudio, and Your Packages
15.1
Installing Packages
15.1.1
Installing Packages from Github
15.1.2
Problems with Installing Packages
15.2
Loading Packages with Library
15.3
Updating R
15.4
Updating RStudio
15.5
Updating Your Packages
16
Major R Updates (Where Are My Packages?)
16.1
Preparing for a Minor or Major R Upgrade
16.2
Saving a List of Your Packages
16.3
Upgrading R (and RStudio)
16.3.1
Reinstalling your list of Packages
16.4
Now Check your list of Packages
16.5
Updating Packages
17
Intermediate Steps Toward Reproducibility
17.1
Level 3 Reproducibility
17.1.1
Creating a New Project in RStudio
17.1.2
File paths and the {here} package
17.2
Code Review with a Coding Partner
17.2.1
Checklist for Code Review
17.3
Sharing code on GitHub
18
Comparing Two Measures of Centrality
18.1
Common Problem
18.1.1
How Skewed is Too Skewed?
18.1.2
Visualize the Distribution of data variables in ggplot
18.1.3
Visualize the Distribution of data$len in ggplot
18.1.4
Results of Shapiro-Wilk
18.1.5
Try it yourself
18.1.6
Mammal sleep hours
18.2
One Sample T test
18.2.1
How to do One Sample T test
18.2.2
Interpreting the One Sample T test
18.2.3
What are the arguments of the t.test function?
18.3
Insert flipbook for ttest here
18.3.1
Flipbook Time!
18.4
Fine, but what about 2 groups?
18.4.1
Setting up 2 group t test
18.4.2
Results of the 2 group t test
18.4.3
Interpreting the 2 group t test
18.4.4
2 group t test with wide data
18.4.5
Results of 2 group t test with wide data
18.5
3 Assumptions of Student’s t test
18.5.1
Testing Assumptions of Student’s t test
18.6
Getting results out of t.test
18.6.1
Getting results out of t.test
18.7
Reporting the results from t.test using inline code
18.7.1
For Next Time
19
Sample Size Calculations with {pwr}
19.1
Sample Size for a Continuous Endpoint (t-test)
19.2
One Sample t-test for Lowering Creatinine
19.3
Paired t-tests (before vs after, or truly paired)
19.4
2 Sample t tests with Unequal Study Arm Sizes
19.5
Testing Multiple Options and Plotting Results
19.6
Your Turn
19.6.1
Scenario 1: FEV1 in COPD
19.6.2
Scenario 2: BNP in CHF
19.6.3
Scenario 3: Barthel Index in Stroke
19.7
Sample Sizes for Proportions
19.8
Sample size for two proportions, equal n
19.9
Sample size for two proportions, unequal arms
19.10
Your Turn
19.10.1
Scenario 1: Mortality on Renal Dialysis
19.10.2
Scenario 2: Intestinal anastomosis in Crohn’s disease
19.10.3
Scenario 3: Metformin in Donuts
19.11
add chi square
19.12
add correlation test
19.13
add anova
19.14
add linear model
19.15
add note on guessing effect sizes - cohen small, medium, large
19.16
Explore More
20
Randomization for Clinical Trials with R
20.1
Printing these on Cards
20.2
Now, try this yourself
20.3
Now Freestyle
21
Univariate ggplots to Visualize Distributions
21.1
Histograms
21.1.1
Comparisons of Distributions with Histograms
21.1.2
Histograms and Categories
21.2
Density Plots
21.2.1
Comparisons with Density plots
21.3
Comparing Distributions Across Categories
21.4
Boxplots
21.5
Violin Plots
21.6
Ridgeline Plots
21.6.1
Including Plots
21.6.2
Including Points
21.6.3
Including Points
21.6.4
Including Points
21.6.5
Including Points
22
Bivariate ggplot2 Scatterplots to Visualize Relationships Between Variables
22.1
Packages used in this Chapter
22.2
Data Exploration and Validation (DEV)
22.3
Scatterplots
22.3.1
Micro-quiz!
22.4
Mapping More Variables
22.5
Inheritance and Layering in ggplot2
22.6
Aesthetic mapping Micro-Quiz!
22.7
Controlling Point Shape, Size, and Color Manually
22.7.1
Manual Shapes
22.7.2
Manual Sizes
22.7.3
Manual Color
23
Extensions to ggplot
23.1
Goals for this Chapter
23.2
Packages Needed for this chapter
23.3
A Flipbook of Where We Are Going With ggplot Extensions
23.3.1
MAKE FLIPBOOK
23.4
A Waffle Plot
23.5
An Alluvial Plot
23.6
Lollipop Plots
23.7
Dumbbell Plots
23.8
Spaghetti Plots with Summary Smoothed Lines for Change Over Time
23.9
Swimmer Plots
23.10
Adding Significance Comparisons with {ggsignif}
24
Customizing Plot Scales
24.1
Goals for this Chapter
24.2
Packages Needed for this chapter
24.3
A Flipbook of Where We Are Going With Scales
24.4
A Basic Scatterplot
24.5
But what if you want the scale for risk to start at 0?
24.6
But this axis does not really start at Exactly 0
24.7
Control the Limits and the Breaks
24.8
Test what you have learned
24.9
Continuous vs. Discrete Plots and Scales
24.10
Using Scales to Customize a Legend
24.11
Test what you have learned
24.11.1
More Examples with Flipbooks
25
Helping out with ggplot
25.1
ggx::gghelp()
25.2
Getting more help with theming with ggThemeAssist
25.3
Website helpers for ggplot
25.4
Getting Even more help with esquisse
26
Functions
26.1
Don’t repeat yourself
26.2
Your Turn
26.3
Freestyle
26.3.1
Acknowledgement
26.4
Read More
27
Linear Regression and Broom for Tidying Models
27.1
Packages needed
27.2
Building a simple base model with {lm}
27.2.1
Producing manuscript-quality tables with {gtsummary}
27.3
Is Your Model Valid?
27.4
Making Predictions with Your Model
27.4.1
Predictions from new data
27.5
Choosing predictors for multivariate modeling – testing, dealing with collinearity
27.5.1
Challenges
27.6
Presenting model results with RMarkdown
27.6.1
Challenges
27.7
Presenting model results with a Shiny App
27.7.1
Challenges
28
Logistic Regression and Broom for Tidying Models
28.1
The Model Summary
28.2
Evaluating your Model Assumptions
28.3
Converting between logit, odds ratios, and probability
29
A Gentle Introduction to Shiny
29.1
What is Shiny?
29.2
The Basic Structure of a Shiny App
29.2.1
The weirdness of a Shiny app
29.3
The User Interface Section Structure
29.4
The Server Section Structure
29.5
How to Run an App
29.5.1
How to Stop an App
29.6
Building a Very Simple App (Version 1)
29.6.1
The ui section
29.6.2
The server section
29.7
Edit this App (Version 2)
29.8
Building a User Interface for Inputs and Outputs
29.8.1
Inputs
29.8.2
Outputs
29.9
Building a Functioning Server Section
29.9.1
Using the input values & Data
29.9.2
Wrangling and Calculating
29.9.3
Rendering to HTML Outputs
29.10
Building a Simple Shiny App (Version 3)
29.11
Publishing Your Shiny App on the Web
29.12
More to Explore
30
Sharing Models with Shiny
30.0.1
Packages Needed for this Chapter
30.1
Setting up and Saving Models
30.1.1
Linear Model
30.1.2
Logistic Model
30.1.3
Random Forest Model
30.2
Building a Shiny App for the Linear Model
30.2.1
The Default Shiny App
30.2.2
Editing the ui sidebarPanel for the Input Predictor Variables
30.2.3
Editing the server section to make Predictions
30.2.4
Editing the mainPanel in the ui section to display your Prediction
30.3
Building a Shiny App for the Logistic Model
30.3.1
The Default Shiny App
30.3.2
Editing the ui sidebarPanel for the Input Predictor Variables
30.3.3
Editing the server section to make Predictions
30.3.4
Editing the mainPanel in the ui section to display your Prediction
30.4
Building a Shiny App for the Random Forest Model
30.5
Challenge Yourself
31
Introduction to R Markdown
31.1
What Makes an Rmarkdown document?
31.2
Trying out RMarkdown with a Mock Manuscript
31.3
Inserting Code Chunks
31.3.1
Code Chunk Icons
31.4
Including Plots
31.5
Including Tables
31.6
Including Links and Images
31.6.1
Links
31.6.2
Images
31.7
Other languages in code chunks
31.8
Code Chunk Options
31.9
How It All (Rmarkdown + {knitr} + Pandoc) Works
31.10
Knitting and Editing (and re-Knitting) Your Rmd document
31.11
Try Out Other Chunk Options
31.12
The setup chunk
31.13
Markdown syntax
31.14
2nd Header
31.14.1
3rd Header
31.15
Line Breaks and Page Breaks
31.16
Making Lists
31.16.1
Ordered Lists
31.16.2
Un-ordered lists
31.16.3
Nested Lists
31.17
The Easy Button - Visual Markdown Editing
31.17.1
Try inserting a list, a table and a block-quote
31.18
Inline Code
31.18.1
Try inserting some in-line R code
31.19
A Quick Quiz
32
Rmarkdown Output Options
32.1
Microsoft Word Output from Rmarkdown
32.1.1
Making a Styles Reference File for Microsoft Word
32.1.2
Let’s Practice This.
32.1.3
Re-formatting Your Template
32.1.4
Using Your New Styles Template
32.1.5
Now you are ready!
32.2
PDF Output from RMarkdown
32.2.1
LaTeX and tinytex
32.2.2
Knitting to PDF
32.3
Microsoft Powerpoint Output from Rmarkdown
32.3.1
Tables in Powerpoint
32.3.2
Images in Powerpoint
32.3.3
Plots in Powerpoint
33
Adding Citations to your RMarkdown
34
Quarto is a Next-Generation RMarkdown
34.1
Goals for this Chapter
34.2
Packages Needed for this chapter
34.3
Introducing Quarto
34.4
A Tour of Quarto
34.5
Opening a New Quarto Document
34.6
Annotating code in Quarto
34.7
The Visual Editor vs. Source Editor in Quarto
34.8
Adding Code Chunks
34.9
Organized Options in Code Chunks with the Hash-Pipe #|
34.10
Stating Global Options in Your YAML Header
34.10.1
Code Options and Code Folding
34.10.2
Parameters
34.11
Figures
34.12
Tables
34.13
Inline Code and Caching
34.14
Quarto at the Command Line
34.15
Citations in Quarto
34.16
Challenge Yourself
34.17
Exploring further
35
Running R from the UNIX Command Line
35.1
What is the UNIX Command line?
35.2
Why run R from the command line?
35.3
How do you get started?
35.3.1
On a Mac
35.3.2
On a Windows PC
35.4
The Yawning Blackness of the Terminal Window
35.5
Where Are We?
35.6
Cleaning Up
35.7
Other helpful file commands
35.8
What about R?
35.9
What about just a few lines of R?
35.10
Running an R Script from the Terminal
35.11
Rendering an Rmarkdown file from the Terminal
References
Published with bookdown
Reproducible Medical Research with R