Preface

This book is intended to help students build a foundation for statistical thinking and methods. Three basic ideas motivate the way this book is structured:

Statistics is an applied field with a wide range of practical applications.
You don’t have to be a math guru to learn from interesting, real data.
Data are messy, and statistical tools are imperfect. However, when you understand the strengths and weaknesses of these tools, you can use them to learn interesting things about the world.

Textbook overview

Part 1: Descriptive statistics. Data structures, variables, and basic data collection and study design techniques, followed by data visualization and summarization, including relationships between variables.
Part 2: Foundations for inference. Case studies are used to introduce the ideas of statistical inference with randomization tests, bootstrap intervals, and mathematical models.
Part 3: Statistical inference. Further details of statistical inference using randomization tests, bootstrap intervals, and mathematical models for numerical and categorical data.

Each part contains multiple chapters. Each chapter ends with a review section that contains a chapter summary and a list of key terms introduced in the chapter. If you’re not sure what some of these terms mean, we recommend going back in the text and reviewing their definitions. We purposefully present them in alphabetical order, rather than in order of appearance, so they will be a little more challenging to locate. However, you should be able to spot them easily as bolded text.

Examples and exercises

Examples are provided to establish an understanding of how to apply methods.

This is an example. When a question is asked here, where can the answer be found?

The answer can be found here, in the solution section of the example!

When we think the reader is ready to try determining a solution on their own, we frame it as Guided Practice.

The reader may check or learn the answer to any Guided Practice problem by reviewing the full solution in a footnote.¹

Exercises are also provided at the end of each chapter. Solutions are given for odd-numbered exercises in Appendix A.

Datasets and their sources

A large majority of the datasets used in the book can be found in various R packages. Each time a new dataset is introduced in the narrative, a reference to its package, like the one below, is provided. Many of these datasets are in the openintro R package, which contains datasets used in OpenIntro’s open-source textbooks.²

The textbooks data can be found in the openintro R package.

The datasets used throughout the book come from real sources like opinion polls and scientific articles, except for a handful of cases where we use toy data to highlight a particular feature or explain a particular concept. References for the sources of the real data are provided at the end of the book.

Computing with R

The narrative and the exercises in the book are computing-language agnostic. However, while it’s possible to learn about modern statistics without computing, it’s not possible to apply it.

Self-paced and interactive R tutorials were developed using the learnr R package, and only an internet browser is needed to complete them.

You can access the full list of tutorials supporting this book here.

OpenIntro, online resources, and getting involved

OpenIntro is an organization focused on developing free and affordable education materials.

We encourage anyone learning or teaching statistics to visit openintro.org and to get involved.

All OpenIntro resources are free, and anyone is welcome to use these online tools and resources with or without this textbook as a companion.

We value your feedback. If there is a part of the project you especially like or think needs improvement, we want to hear from you. For feedback on this specific book, you can open an issue on GitHub. You can also provide feedback on this book or any other OpenIntro resource via our contact form at openintro.org.

Acknowledgements

The OpenIntro project would not have been possible without the dedication and volunteer hours of all those involved, and we hope you will join us in extending a huge thank you to all those who volunteer with OpenIntro.

The authors would like to thank

David Diez and Christopher Barr for their work on the 1st Edition of this book,
Ben Baumer and Andrew Bray for their contributions in rethinking how and in which order we present this material, as well as their work as original authors of the interactive tutorial content,
Yanina Bellini Saibene, Florencia D’Andrea, and Roxana Noelia Villafañe for their work on creating the interactive tutorials in learnr,
Will Gray for conceptual diagrams,
Allison Theobold, Melinda Yager, and Randy Pruim for their valuable feedback and review of the book,
Colin Rundel for feedback on content and technical help with conversion from LaTeX to R Markdown,
Christophe Dervieux for help with multi-output bookdown issues, and
Müge Çetinkaya and Meenal Patel for their design vision.

We would also like to thank the developers of the open-source tools that make the development and authoring of this book possible, e.g., bookdown, tidyverse, and icons8.

We are also grateful to the many teachers, students, and other readers who have helped improve OpenIntro resources through their feedback.

Authors
