1
Introduction
1.1
Case Studies
1.2
Reproducibility, Scalability, and Writing Code
1.3
Course Structure
1.4
Computer and Software Requirements
1.5
Next Steps
2
Prerequisites
2.1
Online Textbooks
2.2
R and R Studio
2.2.1
Installing R
2.2.2
Installing R Studio
2.2.3
Setting R Studio Defaults
2.2.4
R Studio Panels
2.3
Directories (Folders)
2.3.1
The Home Directory
2.3.2
Course Folder
2.3.3
Creating New Directories
2.3.4
Paths
2.3.5
The Working Directory
2.4
Packages
2.4.1
Installing Packages
2.4.2
Compilation
2.5
Browser Settings
2.5.1
Chrome
2.5.2
Safari
2.6
R Markdown Documents
2.6.1
Learning R Markdown
2.7
Knitting a Document
2.8
Uploading Documents to Canvas
3
Madison Lakes
3.1
Lake Mendota Freezing and Thawing
3.1.1
Criteria for freezing/thawing
3.1.2
Map
3.1.3
Winter of 2020-2021
3.2
Lake Mendota Questions
3.2.1
Lake Mendota Data
3.2.2
Wrangling the Lake Mendota Data
3.2.3
Loading the Libraries
3.3
Transformed Lake Mendota Data
3.3.1
Lake Mendota Variables
3.3.2
Plotting Duration Closed versus Time
3.3.3
Modeling Lake Mendota Data
3.3.4
Residuals
3.4
The Journey Ahead
4
Exoplanets
4.1
The Night Sky
4.2
Exoplanet Discovery
4.3
Exoplanet Data
4.3.1
Methods of Discovery
4.3.2
Earth and Jupiter
4.3.3
Mass and Radius
4.3.4
Spectral Types
4.4
Exoplanet Questions
4.5
The Journey Forward
5
R Fundamentals
5.1
History of the R Language
5.2
Packages
5.3
Vectors
5.4
Assignment
5.5
Arithmetic with Vectors
5.6
Numerical Summaries of Vectors
5.7
Data Frames
5.7.1
Extracting Parts of Data Frames
5.8
Data Types
5.8.1
Conversions between types
5.9
Valid Object Names
5.10
Functions
5.10.1
Arguments
5.10.2
Accessing the Documentation
6
Visualization with ggplot2
6.1
Visualization
7
Madison Weather
7.1
Weather and Climate Change
7.2
Data
7.3
Weather Stations
7.4
Variables
7.5
Initial Data Transformations
7.6
Questions
7.7
Obtaining Data
8
Statistical Summaries
9
Data Transformation with dplyr
10
Airport Waiting Times
10.1
Customs at US Airports
10.2
My Travel Experience
10.3
Airport Wait Times
10.4
Questions
11
Data Import with readr
12
Dates with lubridate
13
Wisconsin Obesity
13.1
Obesity
13.1.1
Obesity Definitions
13.1.2
Obesity Data
13.1.3
Obesity Variables in Excel
13.2
Census Data
13.3
Files
13.4
General Questions
14
Reshaping Data with tidyr
15
Iteration with purrr
16
Strings with stringr
17
Functions in R
18
Chimpanzees and Prosocial Choice
18.1
Prosocial Choice Experiments
18.2
Experiment Description
18.3
Controls
18.4
Behavior
18.5
Data
18.6
Statistical Models
18.7
Assumptions
18.8
Probability Preview
18.9
Questions
19
Probability
19.1
What is Probability?
19.2
Notions of Probability
19.3
Probability Definitions and Examples
19.4
Outcome Space
19.5
Probability
19.6
Law of Large Numbers
19.7
Events
19.8
Disjoint Events
19.9
Probability Axioms
19.10
Random Variables
19.11
Addition Rule
19.12
General Addition Rule
19.13
Probability Distribution of Discrete Random Variables
19.14
Probability Distribution of Continuous Random Variables
19.15
Complement Rule
19.16
Independence
19.17
Multiplication Rule
19.18
Conditional Probability
19.19
General Multiplication Rule
19.20
The Law of Total Probability
19.21
Weighted Means
19.22
Expectation
19.23
Continuous Random Variables
19.24
Variance
19.25
Sums of Random Variables
19.26
Covariance
19.27
Correlation Coefficient
19.28
Linear Combinations
20
Binomial Distributions
20.1
The Binomial Probability Mass Function
20.2
Mean and Variance
20.3
Binomial Calculations Using R
20.4
Binomial Random Samples
20.5
Binomial Probabilities in R
20.6
Binomial Quantiles
20.7
Graphing Binomial Distributions
21
Normal Distributions
21.1
Parameters
21.2
Normal Probability Density
21.3
Standard Normal Density
21.4
Benchmark Normal Probabilities
21.5
Normal CDF
21.6
Central Limit Theorem
21.6.1
Notes
21.7
Normal Calculations using R
21.8
Graphing Normal Distributions
22
Estimation
22.1
This is a stub
23
Hypothesis Testing
23.1
Hypothesis Testing Logic
23.2
Statistical Significance
23.3
Connections to Confidence Intervals
23.4
Other Hypothesis Tests
23.4.1
Comparisons Between with and without a Partner
23.4.2
Simulation Approach
23.4.3
Z-Test
23.4.4
Likelihood Ratio Test
23.4.5
Interpretation
23.5
Comparing Multiple Probabilities
23.6
Likelihood Ratio Test
23.6.1
Calculation of the LRT Statistic
23.6.2
Chi-square approach to the p-value
24
Human Sex Ratio Modeling
24.1
Is it a boy or a girl?
24.2
A Note on Gender Identity
24.3
Sex Ratio Data
24.3.1
Geissler Data, Saxony
24.3.2
Malinvaud Data, France
24.3.3
Danish Data
24.3.4
World Sex Ratios
24.4
Sex Ratio Models
24.4.1
Data Summary
24.5
Simple Binomial Model
24.5.1
Numerical Optimization
24.5.2
Goodness of Fit
24.6
Beta Binomial Model
24.6.1
Parameter Estimation
24.6.2
Goodness of Fit
24.6.3
Likelihood Ratio Test
24.7
Further Analysis
24.8
Questions
25
Volleyball
25.1
UW Women’s Volleyball
25.2
Volleyball Basics
25.2.1
Volleyball Competition
25.3
2019 Season Data
25.3.1
Volleyball Team Season Statistics
25.3.2
2019 Division I Match Statistics
25.3.3
Volleyball Data Source
25.4
Volleyball Questions
26
Correlation and Regression
26.1
Correlation
26.1.1
Correlation Formula
26.1.2
Correlation Examples
26.2
Simple Linear Regression
26.2.1
Regression Model
26.2.2
Regression Estimates
26.2.3
Understanding Regression Parameters
26.2.4
Prediction Interpretation
26.3
Prediction of Match Outcomes
26.3.1
Match Data
26.3.2
Model
26.3.3
Likelihood
26.3.4
Simulation
26.4
Fitting the Model
27
Simulation and Prediction
27.1
Predicting Outcomes of Sporting Events
27.2
Model Recap
27.3
Estimation of Volleyball Model
27.3.1
Maximum Likelihood Estimate of \(\theta\)
27.3.2
Exploration
27.4
Predicting the Tournament
27.4.1
The simulations
Statistics 240 Course Notes
22
Estimation
22.1
This is a stub