RMRWR
1
Preface
1.1
Who This Book is For
1.2
Prerequisites
1.3
The Spiral of Success Structure
1.4
Motivation for this Book
1.5
The Scientific Reproducibility Crisis
1.6
Features of a Bookdown electronic book
1.6.1
Icons
1.6.2
Sharing
1.6.3
Scrolling/Paging
1.7
What this Book is Not
1.7.1
This Book is Not A Statistics Text
1.7.2
This Book Does Not Provide Comprehensive Coverage of the R Universe
1.8
Some Guideposts
1.9
Helpful Tools
1.9.1
Demonstrations in Flipbooks
1.9.2
Learnr Coding Exercises
1.9.3
Coding
2
Getting Started and Installing Your Tools
2.1
Goals for this Chapter
2.2
Website links needed for this Chapter
2.3
Pathway for this Chapter
2.4
Installing R on your Computer
2.5
Windows-Specific Steps for Installing R
2.5.1
Testing R on Windows
2.6
Mac-Specific Installation of R
2.6.1
Testing R on the Mac
2.6.2
Successful testing!
2.7
Installing RStudio on your Computer
2.7.1
Windows Install of RStudio
2.7.2
Testing Windows RStudio
2.7.3
Installing RStudio on the Mac
2.7.4
Testing the Mac Installation of RStudio
2.7.5
Critical Setup - Tuning Your RStudio Installation
2.8
Installing Git on your Computer
2.8.1
Installing Git on macOS
2.8.2
Installing Git on Windows
2.8.3
Installing Git on Linux
2.9
Getting Acquainted with the RStudio IDE
3
A Tasting Menu of R
3.1
Setting the Table
3.2
Goals for this Chapter
3.3
Packages needed for this Chapter
3.4
Website links needed for this Chapter
3.5
Setting up RPubs
3.6
Open a New Rmarkdown document
3.7
Knitting your Rmarkdown document
3.7.1
Installing Packages
3.7.2
Loading Packages with library()
3.8
Your Turn to Write Text
3.9
Wrangle Your Data
3.10
Summarize Your Data
3.11
Visualize Your Data
3.12
Statistical Testing of Differences
3.13
Publish your work to RPubs
3.14
The Dessert Cart
3.14.1
Interactive Plots
3.14.2
Animated Graphics
3.14.3
A Clinical Trial Dashboard
3.14.4
A Shiny App
3.14.5
An Example of Synergy in the R Community
4
Introduction to Reproducibility
4.1
First Steps to Research Reproducibility
4.1.1
Have a Plan
4.1.2
Treat Your Raw Data Like Gold
4.1.3
Cleaning and Analyzing Your Data
4.1.4
The First Level of Reproducibility
4.1.5
The Second Level of Reproducibility
5
Importing Your Data into R
5.1
Reading data with the {readr} package
5.1.1
Test yourself on scurvy
5.1.2
What is a path?
5.1.3
Try it Yourself
5.2
Reading Excel Files with readxl
5.2.1
Test yourself on strep_tb
5.3
Bringing in data from other Statistical Programs (SAS, Stata, SPSS) with the {haven} package
5.4
Other strange file types with rio
5.5
Data exploration with glimpse, str, and head/tail
5.5.1
Taking a glimpse with glimpse()
5.5.2
Try this out yourself.
5.5.3
Test yourself on strep_tb
5.5.4
Examining Structure with str()
5.5.5
Test yourself on the scurvy dataset
5.5.6
Examining a bit of data with head() and tail()
5.5.7
Test yourself on printing tibbles
5.6
More exploration with skimr and DataExplorer
5.6.1
Test yourself on the skim() results
5.6.2
Test yourself on the create_report() results
5.7
Practice loading data from multiple file types
5.8
Practice saving (writing to disk) data objects in formats including csv, rds, xls, xlsx and statistical program formats
5.9
How do readr and readxl parse columns?
5.10
What are the variable types?
5.11
Controlling Parsing
5.12
Chapter Challenges
5.13
Future forms of data ingestion
6
Wrangling Rows in R with Filter
6.1
Goals for this Chapter
6.2
Packages needed for this Chapter
6.3
Pathway for this Chapter
6.4
Logical Statements in R
6.5
Filtering on Numbers - Starting with A Flipbook
6.5.1
Your Turn - learnr exercises
6.6
Filtering on Multiple Criteria with Boolean Logic
6.6.1
Your Turn - learnr exercises
6.7
Filtering Strings
6.7.1
Your Turn - learnr exercises
6.8
Filtering Dates
6.8.1
Your Turn - learnr exercises
6.9
Filtering Out or Identifying Missing Data
6.9.1
Working with Missing data
6.9.2
Your Turn - learnr exercises
6.10
Filtering Out Duplicate observations
6.11
Slicing Data by Row
6.12
Randomly Sampling Your Rows
6.12.1
Your Turn - learnr exercises
6.13
Further Challenges
6.14
Explore More about Filtering
7
Wrangling Columns in R with Select, Rename, and Relocate
7.1
Goals for this Chapter
7.2
Packages needed for this Chapter
7.3
Pathway for this Chapter
7.4
Tidyselect Helpers in R
7.5
Selecting Column Variables
7.5.1
Try this out
7.6
Selecting Columns that are Not Contiguous
7.7
Selecting Columns With Logical Operators
7.8
Further Challenges
7.9
Explore More about Selecting
8
Using Mutate to Make New Variables (Columns)
8.1
Calculating BMI
8.2
Recoding categorical or ordinal data
8.3
Calculating Glomerular Filtration Rate
9
Interpreting Error Messages
9.1
The Common Errors Table
9.2
Examples of Common Errors and How to fix them
9.2.1
Missing Parenthesis
9.2.2
An Extra Parenthesis
9.2.3
Missing pipe %>% in a data wrangling pipeline
9.2.4
Missing + in a ggplot pipeline
9.2.5
Pipe %>% in Place of a +
9.2.6
Missing Comma Within a Function()
9.2.7
A Missing Object
9.2.8
One Equals Sign When you Need Two
9.2.9
Non-numeric argument to a binary operator
9.3
Errors Beyond This List
9.4
When Things Get Weird
9.4.1
Restart your R Session (Shift-Cmd-F10)
9.5
References:
10
The Building Blocks of R: data types, data structures, functions, and packages.
10.1
Data Types
10.2
Data Structures
10.3
Examining Data Types and Data Structures
10.4
Functions
10.5
Packages
10.6
The Building Blocks of R
11
Tips for Hashtag Debugging your Pipes and GGPlots
11.1
Debugging
11.2
The Quick Screen
11.3
Systematic Hunting For Bugs in Pipes
11.4
Systematic Hunting For Bugs in Plots
11.5
Hashtag Debugging
11.6
Pipe 2
11.7
Plot 2
11.8
Plot 3
11.9
Pipe 3
12
Finding Help in R
12.1
Programming in R
12.2
Starting with Help!
12.3
The Magic of Vignettes
12.4
Googling the Error Message
12.5
You Know What You Want to Do, but Don’t Know What Package or Function to Use
12.5.1
CRAN Task Views
12.5.2
Google is Your Friend
12.6
Seeking Advanced Help with a Minimal REPREX
13
The Basics of Base R
13.1
Dimensions of Data Rectangles
13.2
Naming columns
13.3
Concatenation
13.4
Sequences
13.5
Constants
13.6
Fancier Sequences
13.7
Mathematical functions
13.8
Handling missing data (NAs)
13.9
Cutting Continuous data into Levels
14
Updating R, RStudio, and Your Packages
14.1
Installing Packages
14.1.1
Installing Packages from Github
14.1.2
Problems with Installing Packages
14.2
Loading Packages with Library
14.3
Updating R
14.4
Updating RStudio
14.5
Updating Your Packages
15
Major R Updates (Where Are My Packages?)
15.1
Preparing for a Minor or Major R Upgrade
15.2
Saving a List of Your Packages
15.3
Upgrading R (and RStudio)
15.3.1
Reinstalling your list of Packages
15.4
Now Check your list of Packages
15.5
Updating Packages
16
Intermediate Steps Toward Reproducibility
16.1
Level 3 Reproducibility
16.1.1
Creating a New Project in RStudio
16.1.2
File paths and the {here} package
16.2
Code Review with a Coding Partner
16.2.1
Checklist for Code Review
16.3
Sharing code on GitHub
17
Comparing Two Measures of Centrality
17.1
Common Problem
17.1.1
How Skewed is Too Skewed?
17.1.2
Visualize the Distribution of data variables in ggplot
17.1.3
Visualize the Distribution of data$len in ggplot
17.1.4
Results of Shapiro-Wilk
17.1.5
Try it yourself
17.1.6
Mammal sleep hours
17.2
One Sample T test
17.2.1
How to do One Sample T test
17.2.2
Interpreting the One Sample T test
17.2.3
What are the arguments of the t.test function?
17.3
Insert flipbook for ttest here
17.3.1
Flipbook Time!
17.4
Fine, but what about 2 groups?
17.4.1
Setting up 2 group t test
17.4.2
Results of the 2 group t test
17.4.3
Interpreting the 2 group t test
17.4.4
2 group t test with wide data
17.4.5
Results of 2 group t test with wide data
17.5
3 Assumptions of Student’s t test
17.5.1
Testing Assumptions of Student’s t test
17.6
Getting results out of t.test
17.6.1
Getting results out of t.test
17.7
Reporting the results from t.test using inline code
17.7.1
For Next Time
18
Sample Size Calculations with {pwr}
18.1
Sample Size for a Continuous Endpoint (t-test)
18.2
One Sample t-test for Lowering Creatinine
18.3
Paired t-tests (before vs after, or truly paired)
18.4
2 Sample t tests with Unequal Study Arm Sizes
18.5
Testing Multiple Options and Plotting Results
18.6
Your Turn
18.6.1
Scenario 1: FEV1 in COPD
18.6.2
Scenario 2: BNP in CHF
18.6.3
Scenario 3: Barthel Index in Stroke
18.7
Sample Sizes for Proportions
18.8
Sample size for two proportions, equal n
18.9
Sample size for two proportions, unequal arms
18.10
Your Turn
18.10.1
Scenario 1: Mortality on Renal Dialysis
18.10.2
Scenario 2: Intestinal anastomosis in Crohn’s disease
18.10.3
Scenario 3: Metformin in Donuts
18.11
add chi square
18.12
add correlation test
18.13
add anova
18.14
add linear model
18.15
add note on guessing effect sizes - cohen small, medium, large
18.16
Explore More
19
Randomization for Clinical Trials with R
19.1
Printing these on Cards
19.2
Now, try this yourself
19.3
Now Freestyle
20
Univariate ggplots to Visualize Distributions
20.1
Histograms
20.1.1
Comparisons of Distributions with Histograms
20.1.2
Histograms and Categories
20.2
Density Plots
20.2.1
Comparisons with Density plots
20.3
Comparing Distributions Across Categories
20.4
Boxplots
20.5
Violin Plots
20.6
Ridgeline Plots
20.6.1
Including Plots
20.6.2
Including Points
20.6.3
Including Points
20.6.4
Including Points
20.6.5
Including Points
21
Bivariate ggplot2 Scatterplots to Visualize Relationships Between Variables
21.1
Packages used in this Chapter
21.2
Data Exploration and Validation (DEV)
21.3
Scatterplots
21.3.1
Micro-quiz!
21.4
Mapping More Variables
21.5
Inheritance and Layering in ggplot2
21.6
Aesthetic mapping Micro-Quiz!
21.7
Controlling Point Shape, Size, and Color Manually
21.7.1
Manual Shapes
21.7.2
Manual Sizes
21.7.3
Manual Color
22
Extensions to ggplot
22.1
Goals for this Chapter
22.2
Packages Needed for this chapter
22.3
A Flipbook of Where We Are Going With ggplot Extensions
22.3.1
MAKE FLIPBOOK
22.4
A Waffle Plot
22.5
An Alluvial Plot
22.6
Lollipop Plots
22.7
Dumbbell Plots
22.8
Test what you have learned
22.9
Dumbbell Plots with ggalt For Visualizing Change
22.10
Direct Labeling of Plots
22.10.1
GeomTextPath
22.11
Test what you have learned
22.11.1
More Examples with Flipbooks
23
Customizing Plot Scales
23.1
Goals for this Chapter
23.2
Packages Needed for this chapter
23.3
A Flipbook of Where We Are Going With Scales
23.4
A Basic Scatterplot
23.5
But what if you want the scale for risk to start at 0?
23.6
But this axis does not really start at Exactly 0
23.7
Control the Limits and the Breaks
23.8
Test what you have learned
23.9
Continuous vs. Discrete Plots and Scales
23.10
Using Scales to Customize a Legend
23.11
Test what you have learned
23.11.1
More Examples with Flipbooks
24
Helping out with ggplot
24.1
ggx::gghelp()
24.2
Getting more help with theming with ggThemeAssist
24.3
Website helpers for ggplot
24.4
Getting Even more help with esquisse
25
Functions
25.1
Don’t repeat yourself
25.2
Your Turn
25.3
Freestyle
25.3.1
Acknowledgement
25.4
Read More
26
Linear Regression and Broom for Tidying Models
26.1
Packages needed
26.2
Building a simple base model with lm()
26.2.1
Producing manuscript-quality tables with {gtsummary}
26.3
Is Your Model Valid?
26.4
Making Predictions with Your Model
26.4.1
Predictions from new data
26.5
Choosing predictors for multivariate modeling – testing, dealing with collinearity
26.5.1
Challenges
26.6
Presenting model results with RMarkdown
26.6.1
Challenges
26.7
Presenting model results with a Shiny App
26.7.1
Challenges
27
Logistic Regression and Broom for Tidying Models
27.1
The Model Summary
27.2
Evaluating your Model Assumptions
27.3
Converting between logit, odds ratios, and probability
28
A Gentle Introduction to Shiny
28.1
What is Shiny?
28.2
The Basic Structure of a Shiny App
28.2.1
The weirdness of a Shiny app
28.3
The User Interface Section Structure
28.4
The Server Section Structure
28.5
How to Run an App
28.5.1
How to Stop an App
28.6
Building a Very Simple App (Version 1)
28.6.1
The ui section
28.6.2
The server section
28.7
Edit this App (Version 2)
28.8
Building a User Interface for Inputs and Outputs
28.8.1
Inputs
28.8.2
Outputs
28.9
Building a Functioning Server Section
28.9.1
Using the input values & Data
28.9.2
Wrangling and Calculating
28.9.3
Rendering to HTML Outputs
28.10
Building a Simple Shiny App (Version 3)
28.11
Publishing Your Shiny App on the Web
28.12
More to Explore
29
Sharing Models with Shiny
29.0.1
Packages Needed for this Chapter
29.1
Setting up and Saving Models
29.1.1
Linear Model
29.1.2
Logistic Model
29.1.3
Random Forest Model
29.2
Building a Shiny App for the Linear Model
29.2.1
The Default Shiny App
29.2.2
Editing the ui sidebarPanel for the Input Predictor Variables
29.2.3
Editing the server section to make Predictions
29.2.4
Editing the mainPanel in the ui section to display your Prediction
29.3
Building a Shiny App for the Logistic Model
29.3.1
The Default Shiny App
29.3.2
Editing the ui sidebarPanel for the Input Predictor Variables
29.3.3
Editing the server section to make Predictions
29.3.4
Editing the mainPanel in the ui section to display your Prediction
29.4
Building a Shiny App for the Random Forest Model
29.5
Challenge Yourself
30
Introduction to R Markdown
30.1
What Makes an Rmarkdown document?
30.2
Trying out RMarkdown with a Mock Manuscript
30.3
Inserting Code Chunks
30.3.1
Code Chunk Icons
30.4
Including Plots
30.5
Including Tables
30.6
Including Links and Images
30.6.1
Links
30.6.2
Images
30.7
Other languages in code chunks
30.8
Code Chunk Options
30.9
How It All (Rmarkdown + {knitr} + Pandoc) Works
30.10
Knitting and Editing (and re-Knitting) Your Rmd document
30.11
Try Out Other Chunk Options
30.12
The setup chunk
30.13
Markdown syntax
30.14
2nd Header
30.14.1
3rd Header
30.15
Line Breaks and Page Breaks
30.16
Making Lists
30.16.1
Ordered Lists
30.16.2
Un-ordered lists
30.16.3
Nested Lists
30.17
The Easy Button - Visual Markdown Editing
30.17.1
Try inserting a list, a table and a block-quote
30.18
Inline Code
30.18.1
Try inserting some in-line R code
30.19
A Quick Quiz
31
Rmarkdown Output Options
31.1
Microsoft Word Output from Rmarkdown
31.1.1
Making a Styles Reference File for Microsoft Word
31.1.2
Let’s Practice This.
31.1.3
Re-formatting Your Template
31.1.4
Using Your New Styles Template
31.1.5
Now you are ready!
31.2
PDF Output from RMarkdown
31.2.1
LaTeX and tinytex
31.2.2
Knitting to PDF
31.3
Microsoft Powerpoint Output from Rmarkdown
31.3.1
Tables in Powerpoint
31.3.2
Images in Powerpoint
31.3.3
Plots in Powerpoint
32
Adding Citations to your RMarkdown
32.1
Goals for this Chapter
32.2
Packages Needed for this chapter
32.3
Getting Set Up for Citations
32.3.1
Installing the Zotero Connector
32.3.2
Registration for Zotero
32.4
Building your First Zotero Collection
32.5
Adding References to Your Zotero Collection
32.6
Inserting References into Documents (Rmarkdown)
32.6.1
Formatting your Bibliography
32.7
Inserting References into Documents (MS Word)
33
Running R from the UNIX Command Line
33.1
What is the UNIX Command line?
33.2
Why run R from the command line?
33.3
How do you get started?
33.3.1
On a Mac
33.3.2
On a Windows PC
33.4
The Yawning Blackness of the Terminal Window
33.5
Where Are We?
33.6
Cleaning Up
33.7
Other helpful file commands
33.8
What about R?
33.9
What about just a few lines of R?
33.10
Running an R Script from the Terminal
33.11
Rendering an Rmarkdown file from the Terminal
Title holder
References
Reproducible Medical Research with R