Statistical Inference via Data Science

by Chester Ismay and Albert Y. Kim

2024-07-23

An open-source and fully-reproducible electronic textbook for teaching statistical inference using tidyverse data science tools. […] This is the website for Statistical Inference via Data Science: A ModernDive into R and the Tidyverse! Visit the GitHub repository for this site and find the book on Amazon. You can also purchase it at CRC Press using discount code ADC24. This work by Chester Ismay and Albert Y. Kim is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International … Read more →

1

R语言在心理学研究中的应用: 从原始数据到可重复的论文手稿(V2) (Applications of R in Psychological Research: From Raw Data to Reproducible Manuscripts, V2)

by 胡传鹏

2024-07-19

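Course bookdown […] Lecture 1: Why learn R 1.1 Uses of R in psychological science and the social sciences 1.2 Demonstration of examples of using R 1.3 Course schedule 1.4 How to do well in this course Lecture 2: How to get started with R 2.1 Overview of the data analysis problem to be solved [introducing our data and the problem we intend to solve, comparing R with the traditional flow] 2.1 How to install? 2.2 How to use it conveniently? Installing RStudio and a tour of its interface Chapter 3: How to use these course materials/e-book resources 3.1 Git and GitHub 3.2 Standardizing projects, files, and code Chapter 4: How to import data 4.1 Paths and the working directory 4.2 Reading data 4.3 Understanding data in R (objects in the R language) Chapter 5: How to clean data I: R programming basics 5.1 Manipulating R objects 5.2 Logical operations 5.3 Functions Chapter 6: How to clean data II: data preprocessing 6.1 Introduction to the Tidyverse 6.2 … Read more →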

2

Statistics 240 Course Notes

by Bret Larget

2024-01-24

This book contains case studies and course notes for STAT 240, Introduction to Data Modeling, at the University of Wisconsin, including instruction for many tidyverse packages […] Statistics 240 is a first course in data science and statistical modeling at the University of Wisconsin - Madison. The course aims to enable you, the student in the course, to gain insight into real-world problems from messy data using methods of data science. These notes chart an initial path for you to gain the knowledge and skills needed to become a data scientist. The structure of the course is to present a series … Read more →

3

Introduction to Environmental Data Science

by Jerry Davis, SFSU Institute for Geographic Information Science

2024-01-08

Background, methods and exercises for using R for environmental data science. The focus is on applying the R language and various libraries for data abstraction, transformation, data analysis, spatial data/mapping, statistical modeling, and time series, applied to environmental research. Applies exploratory data analysis methods and tidyverse approaches in R, and includes contributed chapters presenting research applications, with associated data and code packages. Read more →

4

数据科学中的 R 语言 (R in Data Science)

by 王敏杰

2023-07-18

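This book is an overview of how practitioners can acquire, wrangle, visualize, and model data with R and Stan. […] Hello, this is the course content for 《数据科学中的R语言》 (R in Data Science), a graduate elective at Sichuan Normal University. R is the leading language for statistical programming, and the arrival of the Tidyverse in recent years has made R much easier to learn. The Tidyverse is a collection of R packages, including dplyr, ggplot2, tidyr, and stringr, that covers everything from data import and preprocessing to advanced transformation, visualization, modeling, and presentation. Its clear, readable coding style has won it a growing following. Since you come from different schools and have different disciplinary backgrounds, the material taught will not be too advanced (you need to have … Read more →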

5

Applied longitudinal data analysis in brms and the tidyverse

by A Solomon Kurz

2023-06-12

This project is a reworking of Singer and Willett’s classic (2003) text within a contemporary Bayesian framework with an emphasis on the brms and tidyverse packages within the R computational framework. […] This project is based on Singer and Willett’s classic (2003) text, Applied longitudinal data analysis: Modeling change and event occurrence. You can download the data used in the text at http://www.bristol.ac.uk/cmm/learning/support/singer-willett.html and find a wealth of ideas on how to fit the models in the text at https://stats.idre.ucla.edu/other/examples/alda/. My contributions show … Read more →

6

MLFE R labs (2023 ed.)

by Prof. Michela Cameletti & Tutor Rasoul Samei

2023-05-29

Notes for the R labs of the MLFE course @ Unibg […] You are reading the lecture notes of the R labs for the Machine learning for Economics (MLFE) course at University of Bergamo (academic year 2022/23). The MLFE course is the second module of the Coding for Data Science course. The MLFE R labs are designed for students who already have some experience with R programming thanks to the first module of the Coding and Machine Learning course. Click here and here to access the R lab notes of the first module regarding introduction to R language and the tidyverse package. Enjoy the journey! … Read more →

7

Curso: R para análisis de datos (Course: R for Data Analysis)

by Diana García Cortés

2023-05-27

Course content: R para análisis de datos (R for data analysis) […] This book contains the notes developed to accompany the course “R para análisis de datos”. It is meant as a compendium of notes on the topics covered, so that participants have supporting and reference material. It does not aim to be an exhaustive source on using the tidyverse packages, since excellent resources already exist for that, such as R for Data Science [1] and its Spanish version, R para ciencia de datos [2]. This content is available at … Read more →

8

Handling, Measuring, Estimating and Visualizing Migration Data in R

by Guy J. Abel, James Raymer, Ellen Kraly

2023-05-22

In many countries and regions, migration is becoming or already is the largest component of population change and an important mechanism for both social and economic change. However, migration data is often of poor quality, missing or provided without disaggregation. Methods to estimate migration flows have been developed by demographers and other researchers to help address shortfalls and provide a platform to better understand patterns, trends and consequences. This manual explores methods for measuring, estimating and visualising migration flow data, and their implementation in R. Readers will become familiar with useful R functions for handling migration data, a range of measures to summarise and compare migration systems, common estimation methods to overcome inadequate or missing migration data and recently developed methods to visualize complex migration patterns. While plenty of code samples and exercises are provided throughout the manual to build up the reader’s experience, some prior knowledge is required on how to handle and plot data using the tidyverse set of R packages. Read more →

9

Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition

by A Solomon Kurz

2023-01-26

This book is an attempt to re-express the code in the second edition of McElreath’s textbook, ‘Statistical rethinking.’ His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] This ebook is based on the second edition of Richard McElreath’s (2020a) text, Statistical rethinking: A Bayesian course with examples in R and Stan. My contributions show how to fit the models he covered with Paul Bürkner’s brms package (Bürkner, 2017, 2018, 2022j), which makes it easy to fit Bayesian regression models in R (R … Read more →

10

Doing Bayesian Data Analysis in brms and the tidyverse

by A Solomon Kurz

2023-01-26

This project is an attempt to re-express the code in Kruschke’s (2015) textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] Kruschke began his text with “This book explains how to actually do Bayesian data analysis, by real people (like you), for realistic data (like yours).” In the same way, this project is designed to help those real people do Bayesian data analysis. My contribution is converting Kruschke’s JAGS and Stan code for use in Bürkner’s brms package (Bürkner, 2017, 2018, 2022g), … Read more →

11

Statistical rethinking with brms, ggplot2, and the tidyverse

by A Solomon Kurz

2023-01-25

This project is an attempt to re-express the code in McElreath’s textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] I love McElreath’s (2015) Statistical rethinking text. It’s the entry-level textbook for applied researchers I spent years looking for. McElreath’s freely-available lectures on the book are really great, too. However, I prefer using Bürkner’s brms package (Bürkner, 2017, 2018, 2022i) when doing Bayesian regression in R. It’s just spectacular. I also prefer plotting with … Read more →

12

Recoding Introduction to Mediation, Moderation, and Conditional Process Analysis

by A Solomon Kurz

2023-01-25

This ebook is an effort to connect Hayes’s conditional process analysis work with the Bayesian paradigm. Herein I refit his models with my favorite R package for Bayesian regression, Bürkner’s brms, and use the tidyverse for data manipulation and plotting. […] Andrew Hayes’s (2018) text, Introduction to mediation, moderation, and conditional process analysis: A regression-based approach, has become a staple in social science graduate education. Hayes’s work has been from a frequentist OLS perspective. This book is an effort to connect his work with the Bayesian paradigm. Herein I refit his … Read more →

13

tidy[ing] up POL345

by John Kim

2022-11-13

A guide to the tidyverse for POL345 Students. […] POL345 is often Princeton students’ first foray into the programming language R. Through POL345, students gain an introductory overview of R, and programming generally, to conduct basic data analysis on their own. However, many further courses (SML201, SOC306, POL346), along with industry users of R, use the tidyverse instead, a “language” within R to conduct clean, readable data analysis. This book seeks to bridge that gap, revisiting each of the POL345 handouts using the tidyverse to introduce students to this “language within a language”. … Read more →

14

An Introduction to Statistical Learning with the tidyverse and tidymodels

by Taylor Dunn

2022-10-24

Working through ISLR with the tidyverse and tidymodels […] I am a data scientist and statistician who is (mostly) self-taught from textbooks and generous people sharing their work online. Inspired by projects like Solomon Kurz’s recoding of Statistical Rethinking and Emil Hvitfeldt’s ISLR tidymodels labs, I decided to publicly document my notes and code as I work through An Introduction to Statistical Learning, 2nd edition by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. I prefer to work with the tidyverse collection of R packages, and so will be using those to wrangle … Read more →

15

Modern R with the tidyverse

by Bruno Rodrigues

2022-10-24

This book will teach you how to use R to solve your statistical, data science and machine learning problems. Importing data, computing descriptive statistics, running regressions (or more complex machine learning models) and generating reports are some of the topics covered. No previous experience with R is needed. […] I have been working on this on and off for the past 4 years or so. In 2022, I have updated the contents of the book to reflect updates introduced with R 4.1 and in several packages (especially those from the {tidyverse}). I have also cut some content that I think is not that … Read more →

16

Язык R для пользователей Excel (The R Language for Excel Users)

by Алексей Селезнёв

2022-09-15

A free video course on the R language. An introduction to the tidyverse. […] Because of the quarantine, many people are now spending the lion’s share of their time at home, and that time can, and even should, be put to good use. At the start of the quarantine I decided to finish some projects I had begun several months earlier. One of them was the video course “Язык R для пользователей Excel” (The R Language for Excel Users). With this course I wanted to lower the barrier to entry for R and help fill the current shortage of Russian-language learning materials on the topic. If all data work at the company where you work is still customarily done in Excel, then … Read more →

17

MLFE R labs (2022 ed.)

by Prof. Michela Cameletti & Tutor Marco Villa

2022-05-30

Notes for the R labs of the MLFE course @ Unibg […] You are reading the lecture notes of the R labs for the Machine learning for Economics (MLFE) course at University of Bergamo (academic year 2021/22). The MLFE course is the second module of the Coding for Data Science course. The MLFE R labs are designed for students who already have some experience with R programming thanks to the first module of the Coding and Machine Learning course. Click here and here to access the R lab notes of the first module regarding introduction to R language and the tidyverse package. Enjoy the journey! … Read more →

18

Statistical Thinking for Linguists

by Sakol Suethanapornkul

2022-03-28

This is a minimal example of using the bookdown package to write a book. The HTML output format for this example is bookdown::gitbook, set in the _output.yml file. […] Linguists study language from diverse perspectives and specialize in particular areas within the field of linguistics. Regardless of perspectives and methodologies, linguists must work with language data. In some cases, l … There are many reasons why linguists should learn to use R. R is free! I love working with R, particularly the tidyverse … Read more →

19

An(other) introduction to R

by Felix Lennert

2022-02-07

This is a gentle introduction to R and the basic usage of some tidyverse packages (dplyr, tidyr, ggplot2, forcats, stringr) for data manipulation and visualization. […] Dear student, in the following, you will receive a gentle introduction to R and how you can use it to work with data. This tutorial was heavily inspired by Richard Cotton’s “Learning R” (Cotton 2013) and Hadley Wickham and Garrett Grolemund’s “R for Data Science” (abbreviated as R4DS). The latter can be found online (Wickham and Grolemund 2016). We will not immediately start out with the packages from the tidyverse … Read more →

20

Discovering Structural Equation Modeling Using Stata R, the Tidyverse, and Lavaan

by Nicholas R. Jenkins

2021-12-27

This project replicates the Stata code in Acock’s (2013) text with R, the tidyverse, and lavaan. […] This project is a guide through Alan Acock’s Discovering Structural Equation Modeling Using Stata in R using the tidyverse and lavaan packages. The data needed to replicate the analyses in the book can be found on the book’s website here: Discovering Structural Equation Modeling Using Stata. My goal is to show how to fit these models in R and visualize their results. This is also, very much, a work in progress. I assume that you have some familiarity with R and the tidyverse and won’t spend … Read more →

21

Introduction to R - tidyverse

by Brendan R. E. Ansell @ansellbr3

2021-06-10

Introduction to R - tidyverse […] This document contains the material covered in the Introduction to R (tidyverse) course taught at the Walter and Eliza Hall Institute of Medical Research. The course is taught to biomedical scientists, but the material and the teaching examples are very broad. Skills taught in this workshop can be applied to many disciplines in academia and industry. There is no assumed knowledge of R or other computer languages - we start from scratch. Chapters 1 through 5 make use of popular (non-biological) teaching data sets available through R. Chapters 6 onwards … Read more →

22

A Crash Course in Geographic Information Systems (GIS) using R

by Michael Branion-Calles

2021-06-01

A Crash Course in Geographic Information Systems (GIS) using R […] This tutorial assumes some previous experience in R. If you have not used R before, I would start with Chapter 1 of the free and excellent textbook R for Data Science. The GIS operations in R from the sf package are designed to integrate well with the tidyverse suite of R packages. We will make use of some basic functionality from the dplyr package and will be using pipes (%>%) to sequence multiple operations. If you are unfamiliar with dplyr and pipes I would go through the base vignette before … Read more →

23

Machine Learning for Economics 2020/21: R labs

by Michela Cameletti

2021-05-31

Notes for the R labs of the MLFE course @ Unibg […] You are reading the lecture notes of the R labs for the Machine learning for Economics (MLFE) course at University of Bergamo (academic year 2020/21). The MLFE course is the second module of the Coding and Machine Learning course. The MLFE R labs are designed for students who already have some experience with R programming thanks to the first module of the Coding and Machine Learning course. Click here to access the R lab notes of the first module regarding introduction to R language and the tidyverse package. Enjoy the journey! … Read more →

24

基于R语言的科研信息分析与服务 (R-Based Research Information Analysis and Services)

by 王敏杰

2021-05-24

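Scientific research information service using R […] I have been giving a series of R lectures at the library for about a year now, and along the way I came up with the idea of writing a book with R: partly to give students worked examples for learning R, and partly to provide some research information services to researchers at our university. If this lets teaching and learning reinforce each other and leads to better practice and application of data science, it will have been a worthwhile experiment; unfortunately my time and energy are limited, so the writing has progressed slowly. This book uses the tidyverse for data processing and visualization; you can also refer to my course materials, 《数据科学中的 R 语言》 (R in Data Science). The book’s code is open, so you can reproduce every step. Development versions of packages that the code and datasets in this book may use … Read more →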

25

Introduction to R (Part 2)

by Nana Kim

2021-02-16

A document for Intro to R workshop (part 2) video […] In the next two chapters, we will learn how to manipulate and visualize data. We will use tidyverse packages (mainly dplyr, ggplot2, and tidyr) for easier and faster data manipulation/visualization. First, install and load the tidyverse by running: * Visit https://www.tidyverse.org/ to learn more about the … Read more →

26

R 資料科學與統計 (R Data Science and Statistics)

by 林建甫 Jeff Lin

2020-10-29

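R 資料科學與統計 (R Data Science and Statistics) […] R can be regarded as statistical and mathematical software and is also a programming language; with recent developments it has become one of the popular tools of data science. R is free statistical analysis software (open-source, GNU General Public License), maintained, run, and continually updated by the R core-development team, an international group of volunteers. There are currently two main views on how beginners should start learning R: one is to learn the base R language and its built-in packages; the other is to learn external packages directly, such as ggplot2 and the tidyverse. Each approach has its own advantages and disadvantages. In my view, beginners who will need to use R regularly for data analysis should first learn base R. Those who will need R for data analysis only occasionally, or only in a statistics class, are advised to learn … Read more →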

27

TidySimStat

by Edward J. Xu

2020-05-15

Stochastic Simulation and Statistics in Tidyverse. […] This is the website hosting all the theories and practices regarding stochastic simulation and statistics. It has the following … Read more →

28

Notes for “Text Mining with R: A Tidy Approach”

by Qiushi Yan

2020-05-10

Notes for “Text Mining with R: A Tidy Approach” […] This is a notebook concerning Text Mining with R: A Tidy Approach (Silge and Robinson 2017). tidyverse and tidytext are automatically loaded before each chapter: I have defined a simple function, facet_bar(), to meet the frequent need in this book to make a facetted bar plot, with the y variable reordered by x in each facet by: As a quick demonstration of this function, we can plot the top 10 common words in Jane Austen’s six books: … Read more →

29

R for data science: tidyverse and beyond

by Maxine

2020-05-10

R for data science: tidyverse and beyond […] Personal notes on R for Data Science (Wickham and Grolemund 2016), updated as time allows. Any suggestions: https://github.com/enixam/rfordatascience/issues or 565702994@qq.com tidyverse … Read more →

30

Manipulación de datos e investigación reproducible en R (Data Manipulation and Reproducible Research in R)

by Derek Corcoran

2019-12-30

This book is a companion to the course on data analysis and manipulation in R […] To get started you need the latest versions of R and RStudio (R Core Team 2019). The packages pacman, rmarkdown, tidyverse, and tinytex are also required. If you have not used R or RStudio before, the video at the following link shows how to install both programs and the packages needed for this course. The code for installing those packages is as follows: If you need help with the installation, contact the course instructor. If you have never worked with R … Read more →

31

Recoding Introduction to Mediation, Moderation, and Conditional Process Analysis

by A Solomon Kurz

2019-12-21

This project is an effort to connect Hayes’s conditional process analysis work with the Bayesian paradigm. Herein I refit his models with my favorite R package for Bayesian regression, Bürkner’s brms, and use the tidyverse for data manipulation and plotting. […] Andrew Hayes’s Introduction to Mediation, Moderation, and Conditional Process Analysis text, the second edition of which just came out, has become a staple in social science graduate education. Both editions of his text have been from a frequentist OLS perspective. This project is an effort to connect his work with the Bayesian … Read more →

32

Interactive web-based data visualization with R, plotly, and shiny

by Carson Sievert

2019-12-19

A useR guide to creating highly interactive graphics for exploratory and expository visualization. […] This is the website for “Interactive web-based data visualization with R, plotly, and shiny”. In this book, you’ll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R. It makes heavy use of plotly for rendering graphics, but you’ll also learn about other R packages that augment a data science workflow, such as the tidyverse and shiny. Along the way, you’ll gain insight into best practices for visualization of high-dimensional data, … Read more →

33

Statistical Rethinking with brms, ggplot2, and the tidyverse

by A Solomon Kurz

2019-05-05

This project is an attempt to re-express the code in McElreath’s textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] I love McElreath’s Statistical Rethinking text. It’s the entry-level textbook for applied researchers I spent years looking for. McElreath’s freely-available lectures on the book are really great, too. However, I prefer using Bürkner’s brms package when doing Bayesian regression in R. It’s just spectacular. I also prefer plotting with Wickham’s ggplot2, and coding with … Read more →

34

An Incomplete Solutions Guide to the NIST/SEMATECH e-Handbook of Statistical Methods

by Ray Hoobler

2019-02-16

Analysis of case studies and exercises with a focus on using the tidyverse and ggplot2. This handbook was created using the bookdown package in RStudio. The output format for this example is bookdown::gitbook. […] Exploratory Data Analysis (EDA) is a philosophy on how to work with data, and for many applications, the workflow is better suited for scientists and engineers. As scientists, we are trained to formulate a hypothesis and design a series of experiments that allow us to test the hypothesis effectively. Most data, however, doesn’t come from carefully controlled trials, but from … Read more →

35

Tidyverse Cookbook

by Malte Grosser

2018-12-07

Simple cookbook for functions and idioms within the scope of the tidyverse. […] The basic idea of this book is to provide documentation of the tidyverse written in a solution-driven cookbook style. As an extra I would like to provide similar solutions based on base R functionality. Some reasons to write this book: One strength of the tidyverse is that it hides a lot of the quirks that base R has and passes on to the many packages that rely on it. This lets you stick to a specific workflow from the point you enter the tidyverse until you leave it. This is why I highly recommend to head your … Read more →

36

Simulation And The James-Stein Estimator In R

by Alex Hallam

2017-11-07

Simple Simulation and the James-Stein Estimator […] This is the website for “Simulation And The James-Stein Estimator In R”. This technical document is short, covering some common ways to generate data and exploring the James-Stein Estimator. This will teach you how to run simulations to observe the properties of the James-Stein Estimator in R, specifically using the tidyverse: You’ll learn how to generate data to prove theoretical results. In the computer age of statistics the data scientist has the power of machines to run simulations for testing a method before putting a method into … Read more →

37

Data Science and Visualizations with R

by Jonathan Wong

2017-07-16

Data Science and Visualizations with R […] This is a course on the use of tidyverse packages. tidyverse provides a complete suite of modern data-handling tools. It is an essential toolbox for any data scientist using R. The tidyverse package is designed to be easy to install. This course will dive into using tidyverse. It will assume you have already installed R and RStudio and have some familiarity with how to use RStudio. This book will use the nycflights13 dataset. This package contains information about all flights that departed from NYC in 2013: 336,776 flights with 16 variables. To … Read more →

38

Spreadsheet Munging Strategies

by Duncan Garmonsway

2024-07-25*

Spreadsheet Munging Strategies […] This is a work-in-progress book about getting data out of spreadsheets, no matter how peculiar. The book is designed primarily for R users who have to extract data from spreadsheets and who are already familiar with the tidyverse. It has a cookbook structure, and can be used as a reference, but readers who begin in the middle might have to work backwards from time to time. The R packages that feature heavily, Tidyxl and unpivotr, are much more complicated than readxl, and that’s the point. Tidyxl and unpivotr give you more power and complexity when you need it. … Read more →

39

Tidy Modeling with R

by Max Kuhn and Julia Silge

2024-07-25*

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles. This book provides a thorough introduction to how to use tidymodels, and an outline of good methodology and statistical practice for phases of the modeling process. […] Welcome to Tidy Modeling with R! This book is a guide to using a collection of software in the R programming language for model building called tidymodels, and it has two main goals: First and foremost, this book provides a practical introduction to how to use these specific R packages to create models. We … Read more →

40

Yet another ‘R for Data Science’ study guide

by Bryan Shalloway

2024-07-25*

Notes and solutions to Garrett Grolemund and Hadley Wickham’s ‘R for Data Science’ […] This book contains my solutions and notes to Garrett Grolemund and Hadley Wickham’s excellent book, R for Data Science (Grolemund and Wickham 2017). R for Data Science (R4DS) is my go-to recommendation for people getting started in R programming, data science, or the “tidyverse”. First and foremost, this book was set up as a resource and refresher for myself. If you are looking for a reliable solutions manual to check your answers as you work through R4DS, I would recommend using the solutions created and … Read more →

41