UMN Applied Biostats 2024

Tour and table of contents

Key Links

Table of Contents

I: Getting Started.

XX: Preface

A bit about this course and the history of this book. Learning in this era. Goals of this book. How to use this book.

00: Variable types

An example mini chapter introducing Explanatory and response variables and Types of categorical and continuous variables.

01: Intro to stats

An introduction to the goals of statistics and challenges faced while pursuing these goals. Goals of biostatistics. Sampling from populations. Models and Hypothesis Testing. Inferring cause.

02: Intro to R.

Getting you up and running in R, with the ultimate goal of being able to load and look at data. Coding for biostats. Why use R? Observations and suggestions for learning R / computer stuff. Getting RStudio working. A tour of the RStudio environment. Intro to R. Assigning variables in R. Using functions in R. R Packages. Loading data in R. Looking into your data. R scripts.

II: Describing and visualizing data.

03: Data in R

Simple things we commonly do to data: The Tidy data structure, Wrangling data in R, Modify data with mutate(), Summarize data, Combine summarize() with group_by() to summarize by groups, Change variable type with mutate(), A simple mutate() to change class, Use Large Language Models to help you code.

04: Summarize data

How we summarize our data! Measures of location: Summarizing the location of our data in R, Summarizing shape of data, Skewness, Number of modes, Measures of width, Boxplots and Interquartile range (IQR), Variance, Standard Deviation & Coefficient of Variation, Parameters and estimates, Rounding.

05: Intro to ggplot

A quick intro to data visualization: Exploratory and explanatory visualizations, Centering plots on biology.
The idea of ggplot: Mapping aesthetics onto variables – Scatterplots, A categorical explanatory variable, Small multiples.

06: Associations

07: Linear models

LATER: Sampling

We take estimates from samples because we (almost) never have access to the entire population.
Populations have parameters. Estimate population parameters by sampling: (Avoiding) sampling bias, (Avoiding) non-independence of samples, There is no avoiding sampling error.
The sampling distribution: Building a sampling distribution by repeatedly sampling, by simulation, or by math.
The Standard Error – Minimizing sampling error, Be wary of exceptional results from small samples, Small samples, overestimation, and the file drawer problem.

LATER: Uncertainty.

Describing uncertainty in our estimates, because we (almost) never have access to entire populations. Review: We estimate population parameters from samples. Estimation with uncertainty, Generating a sampling distribution, Resampling from our sample (Bootstrapping). Estimating the standard error, The bootstrap standard error, Confidence intervals, The bootstrap confidence interval, Visualizing uncertainty, Common mathematical rules of thumb.

08: Stats Concept Review

09😳😳: Collecting and storing data

10: Reproducible analyses

11: R Review and pRactice.

III: Probability and hypothesis testing

13: Probabilistic thinking

14: Simulation

15: Hypothesis Testing

16: Shuffling to generate a null

17: Associations between continuous variables

18: Analysis of independent data set

19: Probability and NHST review

IV: Linear Models

20: Normal Distribution

21: Sampling a normal distribution

22: Two samples from normal distributions

23: More than two samples from normal distributions

24: Predicting one continuous variable from another

25: Predicting one continuous variable from two (or more) things

26: Interactions

27: Considerations when predicting one continuous variable from two (or more) things

28: Review of Linear models

V: Big picture